Python Pandas: Adding a row number with groupby and rank

달나라 노트

CosmosProject 2025. 4. 5. 18:34

Combining groupby and rank lets you reproduce what the row_number() window function does in SQL: every row gets a sequential number within its group.

import pandas as pd
import numpy as np

dict_test = {
    'col1': [1, 1,
             2, 2, 2,
             3, 3, 3, 3, 3,
             4, 4, 4,
             5, 5, 5, 5],
    'col2': [1000, 2000,
             100, 300, 500,
             200, 300, np.nan, 150, 180,
             580, np.nan, 10,
             100, 80, 55, 10]
}

df_test = pd.DataFrame(dict_test)
print(df_test)

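# Rank col2 within each col1 group, largest value first.
# method='first' breaks ties by order of appearance, so every row in a group
# gets a distinct number, like row_number().
# na_option='bottom' gives NaN values the last ranks in their group.
df_test['row_number'] = df_test.groupby(by=['col1'])['col2'].rank(ascending=False,
                                                                  method='first',
                                                                  pct=False,
                                                                  na_option='bottom')

# Sort by group and then by row number so the numbering is easy to read.
# ignore_index=False keeps the original index labels.
df_test = df_test.sort_values(by=['col1', 'row_number'],
                              ascending=True,
                              ignore_index=False,
                              inplace=False)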
print(df_test)



-- Result
    col1    col2
0      1  1000.0
1      1  2000.0
2      2   100.0
3      2   300.0
4      2   500.0
5      3   200.0
6      3   300.0
7      3     NaN
8      3   150.0
9      3   180.0
10     4   580.0
11     4     NaN
12     4    10.0
13     5   100.0
14     5    80.0
15     5    55.0
16     5    10.0


    col1    col2  row_number
1      1  2000.0         1.0
0      1  1000.0         2.0
4      2   500.0         1.0
3      2   300.0         2.0
2      2   100.0         3.0
6      3   300.0         1.0
5      3   200.0         2.0
9      3   180.0         3.0
8      3   150.0         4.0
7      3     NaN         5.0
10     4   580.0         1.0
12     4    10.0         2.0
11     4     NaN         3.0
13     5   100.0         1.0
14     5    80.0         2.0
15     5    55.0         3.0
16     5    10.0         4.0
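
What makes the result behave like row_number() rather than rank() or dense_rank() is the method='first' argument. The following is a minimal sketch of that difference, not part of the original example: the DataFrame df_ties and the column names grp, val, rank_min, and rank_dense are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a tie inside group 'a' (not from the example above).
df_ties = pd.DataFrame({
    'grp': ['a', 'a', 'a', 'b', 'b'],
    'val': [10, 10, 5, 7, 3],
})

grouped = df_ties.groupby('grp')['val']

df_ties = df_ties.assign(
    # method='first': ties are numbered in order of appearance (row_number-like).
    row_number=grouped.rank(ascending=False, method='first').astype(int),
    # method='min': tied rows share the lowest rank and a gap follows (rank-like).
    rank_min=grouped.rank(ascending=False, method='min').astype(int),
    # method='dense': tied rows share a rank and no gap follows (dense_rank-like).
    rank_dense=grouped.rank(ascending=False, method='dense').astype(int),
)

print(df_ties)
```

In group 'a', the two rows with val 10 receive row numbers 1 and 2, while rank_min and rank_dense both assign 1 to each of them. The astype(int) call is safe here only because the sample has no NaN in val.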

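As an alternative to rank, the same kind of numbering can be sketched with sort_values followed by groupby(...).cumcount(), which returns integers directly. This is a different technique from the one used in this post, and the DataFrame df_alt below is hypothetical sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (not from the example above), including one NaN.
df_alt = pd.DataFrame({
    'col1': [1, 1, 2, 2, 2],
    'col2': [1000, 2000, 100, np.nan, 500],
})

# Sort so that, inside each col1 group, larger col2 values come first
# and NaN values go last.
df_sorted = df_alt.sort_values(by=['col1', 'col2'],
                               ascending=[True, False],
                               na_position='last')

# cumcount() numbers the rows of each group from 0 in their current order,
# so adding 1 yields an integer row number.
df_sorted['row_number'] = df_sorted.groupby('col1').cumcount() + 1

print(df_sorted)
```

Because cumcount() only counts positions, the numbering depends entirely on the preceding sort, so the sort keys and na_position need to match the ordering you want.
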
References

https://cosmosproject.tistory.com/770

Python Pandas : rank (ranking, DataFrame ranking, DataFrame rank)

The pandas rank() method assigns rankings based on a particular column. Syntax: rank(ascending={True, False}, method={'min', 'max', 'dense', 'first', 'average'}, pct={True, False}, na_option={'keep', 'top', 'bottom'}, numeric_only={True, False})

https://cosmosproject.tistory.com/12

Python Pandas : DataFrame.groupby

groupby groups together the rows that share the same value in a given column and applies an operation (sum, mean, max, min, and so on) to each group of rows.
