How do I fill the last row of each group in Pandas?

I have a DataFrame df in which the last row of each group (grouped by STK_ID) is NaN:

>>> print(df)
                   sales  opr_pft  net_pft
STK_ID RPT_Date                           
002138 20130331   2.0703   0.3373   0.2829
       20130630      NaN      NaN      NaN
       20130930   7.4993   1.2248   1.1630
       20140122      NaN      NaN      NaN
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122      NaN      NaN      NaN
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122      NaN      NaN      NaN

Now I want to fill the last row of each group with the values from the row before it. The result should look like this:

                   sales  opr_pft  net_pft
STK_ID RPT_Date                           
002138 20130331   2.0703   0.3373   0.2829
       20130630      NaN      NaN      NaN    (this row is not filled)
       20130930   7.4993   1.2248   1.1630
       20140122   7.4993   1.2248   1.1630
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122  37.9673   9.2088   6.6463
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122  53.0501  16.8081  11.8605

I am almost there: df.groupby(level=0).apply(lambda grp: grp.fillna(method='ffill')) produces the following:

                   sales  opr_pft  net_pft
STK_ID RPT_Date                           
002138 20130331   2.0703   0.3373   0.2829
       20130630   2.0703   0.3373   0.2829
       20130930   7.4993   1.2248   1.1630
       20140122   7.4993   1.2248   1.1630
600004 20130331  11.8429   3.0816   2.1637
       20130630  24.6232   6.2152   4.5135
       20130930  37.9673   9.2088   6.6463
       20140122  37.9673   9.2088   6.6463
600809 20130331  27.9517   9.9426   7.5182
       20130630  40.6460  13.9414   9.8572
       20130930  53.0501  16.8081  11.8605
       20140122  53.0501  16.8081  11.8605

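That is not what I want, because it forward-fills every NaN row within a group (for example, the 20130630 row of 002138), not just the last one. So how do I fill only the last row of each group in Pandas?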

1 Answer

  • 5

    You can apply a different function to each group:

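    def f(g):
        g = g.copy()  # work on a copy rather than mutating the group in place
        last = len(g) - 1
        g.iloc[last, :] = g.iloc[last - 1, :]  # copy the previous row into the last row
        return g

    # group_keys=False keeps the original (STK_ID, RPT_Date) index on newer pandas versions.
    print(df.groupby(level=0, group_keys=False).apply(f))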
    

    Output:

                       sales  opr_pft  net_pft
    STK_ID RPT_Date                           
    2138   20130331   2.0703   0.3373   0.2829
           20130630      NaN      NaN      NaN
           20130930   7.4993   1.2248   1.1630
           20140122   7.4993   1.2248   1.1630
    600004 20130331  11.8429   3.0816   2.1637
           20130630  24.6232   6.2152   4.5135
           20130930  37.9673   9.2088   6.6463
           20140122  37.9673   9.2088   6.6463
    600809 20130331  27.9517   9.9426   7.5182
           20130630  40.6460  13.9414   9.8572
           20130930  53.0501  16.8081  11.8605
           20140122  53.0501  16.8081  11.8605
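
As an alternative to a per-group function, the following is a minimal sketch of a vectorized variant. It is not part of the answer above: it rebuilds a two-group sample of the question's data (the `index`, `is_last`, `filled`, and `result` names are only illustrative), forward-fills within each group as in the question's attempt, and then keeps the filled values only on each group's last row. Note one difference from `f`: this takes the last valid value in the group, whereas `f` copies the previous row even when that row is NaN.

```python
import numpy as np
import pandas as pd

# Two-group sample shaped like the frame in the question (assumed data subset).
index = pd.MultiIndex.from_product(
    [["002138", "600004"], [20130331, 20130630, 20130930, 20140122]],
    names=["STK_ID", "RPT_Date"],
)
df = pd.DataFrame(
    {
        "sales": [2.0703, np.nan, 7.4993, np.nan, 11.8429, 24.6232, 37.9673, np.nan],
        "opr_pft": [0.3373, np.nan, 1.2248, np.nan, 3.0816, 6.2152, 9.2088, np.nan],
        "net_pft": [0.2829, np.nan, 1.1630, np.nan, 2.1637, 4.5135, 6.6463, np.nan],
    },
    index=index,
)

grouped = df.groupby(level="STK_ID")

# Counting backwards inside each group, the last row gets position 0.
is_last = grouped.cumcount(ascending=False) == 0

# Forward-fill within each group, then copy the filled values onto the
# last rows only; every other row keeps its original content.
filled = grouped.ffill()
result = df.copy()
result.loc[is_last] = filled.loc[is_last]

print(result)
```

Because the mask selects only the final row of each group, the NaN row at 20130630 for 002138 is left untouched.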
    
