print (df1.mul(price_per_hour, axis=0))
              2              3              4              5              6  \
0  16610.398379  301417.960837  281971.640783            NaN            NaN
1           NaN            NaN            NaN  173932.926829  384146.341463

               7
0            NaN
1  141920.731707
# check: each row sums back to its original price
print (df1.mul(price_per_hour, axis=0).sum(axis=1))
0 600000.0
1 700000.0
dtype: float64
You can also calculate the prices by days by changing freq='h' to freq='D', but I think it is less accurate:
def f(x):
    # count the days of each row's range that fall in each calendar month
    rng = pd.date_range(x.start, x.end, freq='D')
    return rng.to_series().groupby([rng.month]).size()

df1 = df.apply(f, axis=1)
print (df1)
     2     3     4     5     6     7
0  2.0  31.0  29.0   NaN   NaN   NaN
1  NaN   NaN   NaN  14.0  30.0  11.0
# the name is kept from the hourly version; with freq='D' this is the price per day
price_per_hour = df.price / df1.sum(axis=1)
print (price_per_hour)
0 9677.419355
1 12727.272727
dtype: float64
print (df1.mul(price_per_hour, axis=0))
             2         3             4              5              6         7
0  19354.83871  300000.0  280645.16129            NaN            NaN       NaN
1          NaN       NaN           NaN  178181.818182  381818.181818  140000.0
print (df1.mul(price_per_hour, axis=0).sum(axis=1))
0 600000.0
1 700000.0
dtype: float64
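To see how much the two frequencies differ, here is a minimal self-contained sketch that runs both allocations side by side. The input frame is hypothetical: the start and end timestamps (including the year and the times of day) are my assumption, picked only because they are consistent with the counts and amounts printed above. The helper name `allocate` is also mine and is not part of the original code.

```python
import pandas as pd

# Hypothetical input: these timestamps are an assumption, chosen to be
# consistent with the outputs shown above; the real data may differ.
df = pd.DataFrame({
    "start": pd.to_datetime(["2017-02-27 07:00", "2017-05-18 10:00"]),
    "end": pd.to_datetime(["2017-04-29 23:00", "2017-07-12 01:00"]),
    "price": [600000, 700000],
})


def allocate(frame, freq):
    """Split each row's price across calendar months.

    The split is proportional to the number of `freq` periods
    (hours or days) of the row's range that fall in each month.
    """
    def count_periods(row):
        rng = pd.date_range(row["start"], row["end"], freq=freq)
        # size of each month group = number of periods in that month
        return rng.to_series().groupby(rng.month).size()

    counts = frame.apply(count_periods, axis=1)
    # price of a single period (hour or day) for each row
    unit_price = frame["price"] / counts.sum(axis=1)
    return counts.mul(unit_price, axis=0)


by_hour = allocate(df, "h")
by_day = allocate(df, "D")

# Both allocations sum back to the original price per row,
# but the per-month amounts differ when a range starts or ends mid-day.
print((by_day - by_hour).round(2))
print(by_day.sum(axis=1))
```

The day-based version treats a partial first or last day as a whole day, which is where the differences between the two tables come from.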
1 Answer
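You need the number of hours in each date range, row by row. So use DataFrame.apply with a custom function that builds a date_range, groups it by month and aggregates with size. Then divide price by the total number of hours to get price_per_hour, and finally multiply each month's hour count by it with mul. You can also calculate the prices by days by changing freq='h' to freq='D', but I think it is less accurate. Another solution reshapes the data with melt, groupby and resample; it also needs a groupby by month and a size aggregation: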