How to use pandas? - Java 学习之路

I often get confused by pandas slicing operations. For example:

import pandas as pd
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
    'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])

def get_stats(group):
    return {'min': group.min(), 'max': group.max(), 'count': group.count(), 'mean': group.mean()}
bins = [0, 25, 50, 75, 100]
group_names = ['Low', 'Okay', 'Good', 'Great']
df['categories'] = pd.cut(df['postTestScore'], bins, labels=group_names)
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
des.at['Good','mean']

I get:

TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hash
TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
in ()
----> 1 des.at['Good','mean']

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1867
   1868         key = self._convert_key(key)
-> 1869         return self.obj._get_value(*key, takeable=self._takeable)
   1870
   1871     def __setitem__(self, key, value):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
   1983
   1984         try:
-> 1985             return engine.get_value(series._values, index)
   1986         except (TypeError, ValueError):
   1987

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: 'Good'

How can I make this work?

Thanks in advance.

2 Answers

The problem is in this line:
```
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
```
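After the groupby on 'postTestScore' you get a "Series" rather than a "DataFrame", as shown below.

Now, when you try to access a scalar by label with ".at" on the DataFrame des, the label belongs to that Series, so ".at" does not recognize it.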
```
des.at['Good','mean']
```
Just try printing des and you will see the resulting Series.
```
            count   max   mean   min
categories
Low           2.0  25.0  25.00  25.0
Okay          0.0   NaN    NaN   NaN
Good          8.0  70.0  63.75  57.0
Great         2.0  94.0  94.00  94.0
```
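One way to check what each step actually returns is to print its type. The following is only a minimal sketch under stated assumptions: it rebuilds the scores from the question, and it swaps the question's `apply(get_stats).unstack()` for `agg`, so the intermediate objects are not the same as in the question. The names `scores`, `cats`, `per_group_mean` and `des_alt` are made up for illustration.

```python
import pandas as pd

# Scores copied from the question; `scores` and `cats` are illustrative names.
scores = pd.Series([25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70],
                   name='postTestScore')
cats = pd.cut(scores, [0, 25, 50, 75, 100],
              labels=['Low', 'Okay', 'Good', 'Great']).rename('categories')

# observed=False keeps the empty 'Okay' category in the result.
grouped = scores.groupby(cats, observed=False)

per_group_mean = grouped.mean()                         # a single statistic per group
des_alt = grouped.agg(['count', 'max', 'mean', 'min'])  # several statistics per group

# Inspect the object types and the type of the row index.
print(type(per_group_mean).__name__)
print(type(des_alt).__name__)
print(type(des_alt.index).__name__)
```

Printing the types this way shows whether you are holding a Series or a DataFrame, and what kind of index the rows carry, before choosing an accessor such as ".at".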
Answered 2024-04-29T12:29:58+08:00

It doesn't work because of the categorical index:

des.index
# Out[322]: CategoricalIndex(['Low', 'Okay', 'Good', 'Great'], categories=['Low', 'Okay', 'Good', 'Great'], ordered=True, name='categories', dtype='category')

Try changing it:

des.index = des.index.tolist()
des.at['Good','mean']
# Out[326]: 63.75
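As a self-contained sketch of the same idea, the snippet below builds a small stand-in for des by hand (the name `des_demo` and the two columns shown are assumptions for illustration, with values copied from the printed table) and reads the value two ways: with ".loc" on the categorical index, and with ".at" after replacing the index with plain labels. Whether ".at" fails on a CategoricalIndex depends on the pandas version; the traceback in the question comes from the asker's installation, and other versions may behave differently.

```python
import pandas as pd

# Hand-built stand-in for `des` (hypothetical name `des_demo`),
# using the index shown in des.index and two columns from the printed table.
idx = pd.CategoricalIndex(['Low', 'Okay', 'Good', 'Great'],
                          categories=['Low', 'Okay', 'Good', 'Great'],
                          ordered=True, name='categories')
des_demo = pd.DataFrame({'count': [2.0, 0.0, 8.0, 2.0],
                         'mean': [25.0, float('nan'), 63.75, 94.0]},
                        index=idx)

# Option 1: label-based lookup with .loc, leaving the categorical index as is.
via_loc = des_demo.loc['Good', 'mean']

# Option 2: work on a copy whose index holds plain labels, then use .at.
plain = des_demo.copy()
plain.index = plain.index.tolist()
via_at = plain.at['Good', 'mean']

print(via_loc, via_at)
```

Working on a copy keeps the categorical ordering available on the original frame, which can matter if you later sort the rows by category.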

Answered 2024-04-29T12:29:58+08:00
