pandas.DataFrame() with mixed data types and strange .fillna() behavior

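I have a DataFrame with two dtypes: object (expected to hold strings) and datetime (expected to hold datetimes). I don't understand the behavior described below, or why it affects my fillna().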

Calling .fillna() with inplace set, after .astype(str), changes the data so that it is reported as int64.

Calling .fillna() without it does nothing.

I know that pandas/NumPy dtypes differ from native Python types, but is this the correct behavior, or am I getting something badly wrong?

Sample:

import random
import numpy as np
import pandas as pd
sample = pd.DataFrame({'A': [random.choice(['aabb',np.nan,'bbcc','ccdd']) for x in range(15)],
                       'B': [random.choice(['2019-11-30','2020-06-30','2018-12-31','2019-03-31']) for x in range(15)]})
sample.loc[:, 'B'] = pd.to_datetime(sample['B'])

for col in sample.select_dtypes(include='object').columns.tolist():
    sample.loc[:, col].astype(str).apply(lambda x: str(x).strip().lower()).fillna('NULL')

for col in sample.columns:
    print(sample[col].value_counts().head(15))
    print('\n')

The output contains neither 'NULL' nor 'nan'. I added .replace('nan', 'NULL'), but still nothing changed. Can you tell me what I should be looking for? Many thanks.

1 Answer

1
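The problem here is that the missing values are converted to strings, so fillna has nothing left to fill. The solution is to use the pandas methods Series.str.strip and Series.str.lower, which handle missing values well: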
```
for col in sample.select_dtypes(include='object').columns:
    sample[col] = sample[col].str.strip().str.lower().fillna('NULL')
```
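
A small sketch of the difference, using a made-up three-element Series (the values and the names `s`, `as_text`, and `cleaned` are illustrative, not from the question). It mirrors the question's `apply(lambda x: str(x)...)` step: Python's `str()` turns NaN into the text `'nan'`, which `fillna` no longer treats as missing, whereas the `.str` accessor leaves NaN in place.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: two strings and one missing value.
s = pd.Series(['  AaBb ', np.nan, 'CCdd'], dtype=object)

# str(x) converts NaN to the literal text 'nan', so fillna finds
# no missing values to replace.
as_text = s.apply(lambda x: str(x).strip().lower()).fillna('NULL')
print(as_text.tolist())   # ['aabb', 'nan', 'ccdd']

# The .str accessor skips missing values and keeps them as NaN,
# so the following fillna can replace them.
cleaned = s.str.strip().str.lower().fillna('NULL')
print(cleaned.tolist())   # ['aabb', 'NULL', 'ccdd']
```

Note also that the loop in the question never assigns its result back to `sample`, so even a chain that produced 'NULL' (or `.replace('nan', 'NULL')`) would leave the DataFrame unchanged. The loop above assigns with `sample[col] = ...`.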
Answered 2024-05-14T21:13:00+08:00
