Python Pandas : min, max (comparing values across columns)

달나라 노트

CosmosProject 2021. 7. 2. 19:22

The min and max methods let you compare values across columns.

import pandas as pd
import numpy as np

dict_test = {
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21]
}

df_test = pd.DataFrame(dict_test)
print(df_test)


-- Result
   col1  col2  col3
0   1.0     6  10.0
1   2.0     2   6.0
2   NaN     7  22.0
3   4.0     3   NaN
4   5.0     9  21.0

First, create a test DataFrame as shown above.

import pandas as pd
import numpy as np

dict_test = {
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21]
}

df_test = pd.DataFrame(dict_test)


df_new = df_test.loc[:, ['col1', 'col3']].fillna(0).max(axis=1) # 1
print(df_new)
print(type(df_new))

df_new = df_test.loc[:, ['col1', 'col3']].fillna(0).min(axis=1) # 2
print(df_new)
print(type(df_new))

df_new = df_test.loc[:, ['col1', 'col2', 'col3']].fillna(0).max(axis=1) # 3
print(df_new)
print(type(df_new))


-- Result
0    10.0
1     6.0
2    22.0
3     4.0
4    21.0
dtype: float64
<class 'pandas.core.series.Series'>

0    1.0
1    2.0
2    0.0
3    0.0
4    5.0
dtype: float64
<class 'pandas.core.series.Series'>

0    10.0
1     6.0
2    22.0
3     4.0
4    21.0
dtype: float64
<class 'pandas.core.series.Series'>

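1. Use loc to select col1 and col3, then use fillna to replace any NaN values in col1 and col3 with 0.

Then max compares the col1 and col3 values in the same row and returns the largest one.

(The returned value is a Series.)

2. Same as 1, but uses the min method to return the smallest value. Rows 2 and 3 return 0.0 because the 0 filled in by fillna takes part in the comparison.

3. Comparing three columns works the same way.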
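Because the 0 inserted by fillna takes part in the comparison, it can become the min itself. As a minimal sketch of an alternative, the example below relies on the pandas default skipna=True, under which min and max ignore NaN when fillna is left out. The names min_filled and min_skipna are made up for this illustration.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21],
})

# With fillna(0), the filled-in 0 is compared like any other value.
min_filled = df_test[['col1', 'col3']].fillna(0).min(axis=1)

# Without fillna, min skips NaN by default (skipna=True),
# so the remaining non-NaN value in the row is returned.
min_skipna = df_test[['col1', 'col3']].min(axis=1)

print(min_filled.tolist())  # [1.0, 2.0, 0.0, 0.0, 5.0]
print(min_skipna.tolist())  # [1.0, 2.0, 22.0, 4.0, 5.0]
```

Which behavior is right depends on whether a missing value should count as 0 or simply be left out of the comparison.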

The examples above pass axis=1 as an argument to min and max.

This means the comparison runs across the columns (axis 1), so the values within each single row are compared with one another.

import pandas as pd
import numpy as np

dict_test = {
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21]
}

df_test = pd.DataFrame(dict_test)


df_new = df_test.loc[:, ['col1', 'col2', 'col3']].fillna(0).max(axis=0)
print(df_new)
print(type(df_new))



-- Result
col1     5.0
col2     9.0
col3    22.0
dtype: float64
<class 'pandas.core.series.Series'>

Passing axis=0 means the comparison runs down the rows (axis 0), so the values within each single column are compared with one another.

That is why the result shows the max value found in each column.
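The two axis values are easy to mix up, so here is a small side-by-side sketch. It uses the string aliases 'columns' and 'index', which pandas accepts in place of 1 and 0. fillna is left out here to keep the comparison short, and row_max and col_max are names chosen only for this example.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21],
})

# axis='columns' is the same as axis=1: one result per row.
row_max = df_test.max(axis='columns')

# axis='index' is the same as axis=0: one result per column.
col_max = df_test.max(axis='index')

print(row_max.tolist())  # [10.0, 6.0, 22.0, 4.0, 21.0]
print(col_max.tolist())  # [5.0, 9.0, 22.0]
```

The result of axis='columns' is indexed by the row labels, while the result of axis='index' is indexed by the column names.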

import pandas as pd
import numpy as np

dict_test = {
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21]
}

df_test = pd.DataFrame(dict_test)


df_test.loc[:, 'col4'] = df_test.loc[:, ['col1', 'col2', 'col3']].fillna(0).max(axis=1)
print(df_test)
print(type(df_test))



-- Result
   col1  col2  col3  col4
0   1.0     6  10.0  10.0
1   2.0     2   6.0   6.0
2   NaN     7  22.0  22.0
3   4.0     3   NaN   4.0
4   5.0     9  21.0  21.0
<class 'pandas.core.frame.DataFrame'>

As an application, the Series returned by max can be inserted into the original DataFrame as a new column (col4), as shown above.
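A related question is which column the largest value came from. One possible sketch uses idxmax(axis=1), which returns the column label of the row-wise max and skips NaN by default. The column name max_col is an assumption made for this illustration.

```python
import numpy as np
import pandas as pd

df_test = pd.DataFrame({
    'col1': [1, 2, np.nan, 4, 5],
    'col2': [6, 2, 7, 3, 9],
    'col3': [10, 6, 22, np.nan, 21],
})

cols = ['col1', 'col2', 'col3']

# Row-wise max value, as in the example above (NaN is skipped by default).
df_test['col4'] = df_test[cols].max(axis=1)

# Label of the column that holds the row-wise max.
df_test['max_col'] = df_test[cols].idxmax(axis=1)

print(df_test['max_col'].tolist())
# ['col3', 'col3', 'col3', 'col1', 'col3']
```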
