Using a GroupBy object and resample in pandas

I would like to use groupby and resample on a DataFrame to get yearly counts of a field. Suppose I have a DataFrame structured as follows:

import pandas as pd

df = pd.DataFrame({'year': {0: '2017', 1: '2018', 2: '2016', 3: '2018'}, 'month': {0: '1', 1: '2', 2: '3', 3: '4'}, 'day': {0: '1', 1: '1', 2: '1', 3: '3'}})
df['Date'] = pd.to_datetime(df)
# Sorry, there is probably an easier way to set up the df
df['B'] = [1, 2, 3, 1]
df['C'] = [2, 3, 4, 1]
# .ix was removed in pandas 1.0; select the columns by label instead
df = df[['Date', 'B', 'C']]

df.groupby('B').resample('A', on='Date')

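How can I make the last line group by column B and still resample by year, month, and so on? In the end, I am looking for the count of C per year for each value of B. If possible, I would like to keep my index along the way. Thanks.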

2 Answers

You can group by column B together with the year taken from the Date column (df['Date'].dt.year):

df.groupby([df['Date'].dt.year, 'B']).C.count().reset_index()

    Date    B   C
0   2016    3   1
1   2017    1   1
2   2018    1   1
3   2018    2   1
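The question also asks about keeping the original index, which the aggregated result above does not do. One possible sketch, assuming a per-row count is acceptable, is to use `transform` instead of an aggregation so the result aligns with the original rows. The column name `C_count` is made up for illustration.

```python
import pandas as pd

# Same data as in the question, built directly from date strings
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-02-01", "2016-03-01", "2018-04-03"]),
    "B": [1, 2, 3, 1],
    "C": [2, 3, 4, 1],
})

# transform("count") returns one value per original row, so the
# original index (0..3) is preserved. "C_count" is a hypothetical name.
df["C_count"] = df.groupby([df["Date"].dt.year, "B"])["C"].transform("count")

print(df)
```

Each row then carries the count of its own (year, B) group, at the cost of repeating that count on every row of the group.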

Option 2: use pd.Grouper

df.groupby([pd.Grouper(key = 'Date', freq='A'), 'B']).C.count().reset_index()

    Date        B   C
0   2016-12-31  3   1
1   2017-12-31  1   1
2   2018-12-31  1   1
3   2018-12-31  2   1
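Since the question mentions grouping by year or by month, here is a small sketch of the same `pd.Grouper` approach at both frequencies. It passes offset objects (`pd.offsets.YearEnd()`, `pd.offsets.MonthEnd()`) rather than string aliases, because pandas 2.2 deprecated the `'A'` alias in favour of `'YE'`; the offset objects are my substitution and are meant to behave the same across versions. The variable names `yearly` and `monthly` are illustrative.

```python
import pandas as pd

# Same data as in the question, built directly from date strings
df = pd.DataFrame({
    "Date": pd.to_datetime(["2017-01-01", "2018-02-01", "2016-03-01", "2018-04-03"]),
    "B": [1, 2, 3, 1],
    "C": [2, 3, 4, 1],
})

# Yearly bins (labelled by year end), then by B
yearly = (
    df.groupby([pd.Grouper(key="Date", freq=pd.offsets.YearEnd()), "B"])["C"]
    .count()
    .reset_index()
)

# Monthly bins (labelled by month end), then by B
monthly = (
    df.groupby([pd.Grouper(key="Date", freq=pd.offsets.MonthEnd()), "B"])["C"]
    .count()
    .reset_index()
)

print(yearly)
print(monthly)
```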

Edit: a roundabout way of using resample with groupby, although I don't see why you would use it:

df.set_index('Date').groupby('B').resample('A').C.count().reset_index()
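One practical difference between the two approaches, as far as I understand it: `resample` within each group emits every bin between that group's first and last date, so empty years show up with a count of 0, whereas `pd.Grouper` combined with another key only reports combinations that actually occur. The sketch below uses a small made-up frame with a gap year to illustrate this; the data and variable names are not from the question.

```python
import pandas as pd

# Hypothetical data: group B=1 has rows in 2016 and 2018, none in 2017
df = pd.DataFrame({
    "Date": pd.to_datetime(["2016-01-01", "2018-04-03"]),
    "B": [1, 1],
    "C": [2, 1],
})

year_end = pd.offsets.YearEnd()

# groupby + resample: the empty 2017 bin is included with count 0
via_resample = df.set_index("Date").groupby("B").resample(year_end)["C"].count()

# groupby + Grouper: only the observed (B, year) pairs are returned
via_grouper = df.groupby(["B", pd.Grouper(key="Date", freq=year_end)])["C"].count()

print(via_resample)
print(via_grouper)
```

If the zero-count years are wanted, the resample form may actually be the more convenient one; otherwise the Grouper form gives the more compact result.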

Answered on 2024-04-26T03:28:32+08:00

You can use resample, but it is not recommended:

df.groupby('B').apply(lambda x : x.resample('A', on='Date').C.count())
Out[761]: 
B  Date      
1  2017-12-31    1
   2018-12-31    1
2  2018-12-31    1
3  2016-12-31    1
Name: C, dtype: int64

Answered on 2024-04-26T03:28:32+08:00
