
Pandas - Groupby with conditional formula

   Survived  SibSp  Parch
0         0      1      0
1         1      1      0
2         1      0      0
3         1      1      0
4         0      0      1

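Given the DataFrame above, is there an elegant way to groupby with a condition? I want to split the data into two groups based on the following conditions: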

(df['SibSp'] > 0) | (df['Parch'] > 0) =   New Group -"Has Family"
 (df['SibSp'] == 0) & (df['Parch'] == 0) = New Group - "No Family"

and then take the mean of each group, so that the final output looks like this:

             SurvivedMean
Has Family   Mean
No Family    Mean

Can this be done with groupby, or do I have to append a new column using the conditional statements above?

Thanks!

3 Answers

  • 7

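    You can define the conditions in a list and use the group_by_condition function below to build a filtered list for each condition. You can then unpack the result into one variable per group. Note that df here is a plain list of dicts, not a DataFrame: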

    df = [
      {"Survived": 0, "SibSp": 1, "Parch": 0},
      {"Survived": 1, "SibSp": 1, "Parch": 0},
      {"Survived": 1, "SibSp": 0, "Parch": 0}]
    
    conditions = [
      lambda x: (x['SibSp'] > 0) or (x['Parch'] > 0),  # has family
      lambda x: (x['SibSp'] == 0) and (x['Parch'] == 0)  # no family
    ]
    
    def group_by_condition(l, conditions):
        return [[item for item in l if condition(item)] for condition in conditions]
    
    [has_family, no_family] = group_by_condition(df, conditions)
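
The lists above only split the rows; the question also asks for the mean of Survived per group. A minimal sketch of that last step, using only the standard library and the same three sample rows (the names rows and survived_mean are illustrative, not part of the answer):

```python
from statistics import mean

# Same sample rows as in the answer above (a list of dicts, not a DataFrame).
rows = [
    {"Survived": 0, "SibSp": 1, "Parch": 0},
    {"Survived": 1, "SibSp": 1, "Parch": 0},
    {"Survived": 1, "SibSp": 0, "Parch": 0},
]

# Split the rows with the two conditions from the question.
has_family = [r for r in rows if r["SibSp"] > 0 or r["Parch"] > 0]
no_family = [r for r in rows if r["SibSp"] == 0 and r["Parch"] == 0]

# Average the Survived field within each group.
survived_mean = {
    "Has Family": mean(r["Survived"] for r in has_family),
    "No Family": mean(r["Survived"] for r in no_family),
}
print(survived_mean)
```

Note that statistics.mean raises StatisticsError for an empty group, so a real data set may need a guard for groups with no rows.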
    
  • 1

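    A simple way to group is to use the sum of those two columns. If either of them is positive, the sum will be at least 1 (assuming neither count is negative). And groupby accepts any array as long as its length matches the DataFrame's, so you do not need to add a new column.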

    import numpy as np

    family = np.where((df['SibSp'] + df['Parch']) >= 1 , 'Has Family', 'No Family')
    df.groupby(family)['Survived'].mean()
    Out: 
    Has Family    0.5
    No Family     1.0
    Name: Survived, dtype: float64
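
Because groupby accepts any key of matching length, a boolean mask mapped to labels works as well. The following is a self-contained sketch on the five rows from the question, assuming pandas is installed; the names has_family, labels and survived_mean are illustrative:

```python
import pandas as pd

# The five sample rows from the question.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0],
    "SibSp": [1, 1, 0, 1, 0],
    "Parch": [0, 0, 0, 0, 1],
})

# Boolean mask: True where the passenger has any sibling, spouse, parent or child.
has_family = (df["SibSp"] > 0) | (df["Parch"] > 0)

# Turn True/False into readable group labels without adding a column to df.
labels = has_family.map({True: "Has Family", False: "No Family"})

# Group by the external label Series and average Survived per group.
survived_mean = df.groupby(labels)["Survived"].mean()
print(survived_mean)
```

The original df is left unchanged, since the labels live in a separate Series that is only used as the grouping key.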
    
  • 1

    If the values in the SibSp and Parch columns are never less than 0, a single condition is enough:

    m1 = (df['SibSp'] > 0) | (df['Parch'] > 0)
    
    # store the result under a new name so the original df is not overwritten
    result = df.groupby(np.where(m1, 'Has Family', 'No Family'))['Survived'].mean()
    print (result)
    Has Family    0.5
    No Family     1.0
    Name: Survived, dtype: float64
    

    If negative values are possible, use both conditions so that rows matching neither fall into a third group:

    m1 = (df['SibSp'] > 0) | (df['Parch'] > 0)
    m2 = (df['SibSp'] == 0) & (df['Parch'] == 0)
    a = np.where(m1, 'Has Family', 
        np.where(m2, 'No Family', 'Not'))
    
    # store the result under a new name so the original df is not overwritten
    result = df.groupby(a)['Survived'].mean()
    print (result)
    Has Family    0.5
    No Family     1.0
    Name: Survived, dtype: float64
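
With the question's data no row matches neither condition, so the 'Not' group never appears in the output above. The sketch below adds one made-up row with a negative SibSp (an assumption, purely for illustration) to show the third bucket, and uses np.select as a flatter alternative to nested np.where calls:

```python
import numpy as np
import pandas as pd

# The question's five rows plus one hypothetical row with a negative SibSp.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0, 1],
    "SibSp": [1, 1, 0, 1, 0, -1],  # the last value is made up for this example
    "Parch": [0, 0, 0, 0, 1, 0],
})

m1 = (df["SibSp"] > 0) | (df["Parch"] > 0)
m2 = (df["SibSp"] == 0) & (df["Parch"] == 0)

# np.select picks the label of the first matching condition, else the default.
labels = np.select([m1, m2], ["Has Family", "No Family"], default="Not")

result = df.groupby(labels)["Survived"].mean()
print(result)
```

np.select scales more readably than nested np.where when further groups are added, since each new group is one more entry in each list.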
    
