
Pandas plot sorts binned values incorrectly on the chart


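I am using Pandas to plot a DataFrame that contains three columns: Interest, Gender, and Experience Points.

I want to bin the Experience Points into specific ranges and then group the DataFrame by the binned values, Interest, and Gender. After that, I want to plot the Interest counts for one specific Gender (for example, Male).

With the code below I get the plot I want, but Pandas sorts the binned values on the x-axis incorrectly (see the attached image for what I mean).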

enter image description here

Note that when I print the grouped data, the binned values are in the correct order, but in the chart they are in the wrong order.

Experience Points  Interest  Gender
(0, 8]             Bike      Female     9
                             Male       5
                   Hike      Female     6
                             Male      10
                   Swim      Female     7
                             Male       7
(8, 16]            Bike      Female     8
                             Male       3
                   Hike      Female     4
                             Male       7
                   Swim      Female    10
                             Male       4
(16, 24]           Bike      Female     4
                             Male       6
                   Hike      Female    10
...

My code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import random

matplotlib.style.use('ggplot')


interest = ['Swim','Bike','Hike']
gender = ['Male','Female']
experience_points = np.arange(0,200)

df = pd.DataFrame({'Interest':[random.choice(interest) for x in range(1000)],
                   'Gender':[random.choice(gender) for x in range(1000)],
                   'Experience Points':[random.choice(experience_points) for x in range(1000)]})

bins = np.arange(0,136,8)
exp_binned = pd.cut(df['Experience Points'],np.append(bins,df['Experience Points'].max()+1))

exp_distribution = df.groupby([exp_binned,'Interest','Gender']).size()

# The printed Series shows the binned values in the correct order
print(exp_distribution)

# The plotted data shows the binned values in the wrong order
exp_distribution.unstack(['Gender','Interest'])['Male'].plot(kind='bar')

plt.show()

Troubleshooting Steps Tried:

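Using plot(kind='bar', sort_columns=True) does not fix the problem.

Grouping by the binned values alone and then plotting DOES fix the problem, but then I can no longer group by Interest or Gender. For example, the following works: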

exp_distribution = df.groupby([exp_binned]).size()
exp_distribution.plot(kind='bar')

1 Answer

  • 3

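    unstack() messes up the order, so the index order has to be restored. You may want to file a bug report for this.

    Workaround:

    # Select the rows in the original bin order, then the 'Male' columns
    exp_distribution.unstack(['Gender','Interest']).loc[exp_distribution.index.get_level_values(0).unique(),
                                                        'Male'].plot(kind='bar')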
    

    enter image description here
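The idea of restoring the bin order after unstack() can also be sketched without explicit label indexing. This is an alternative to the workaround above, not the answerer's exact method. It assumes the bins come from pd.cut, so the row labels are intervals that sort by their left edge. The data set, the random seed, and the names `male`, `scrambled`, and `restored` are made up for illustration, and the shuffle only simulates the scrambled row order, since newer Pandas versions may not reproduce it.

```python
import numpy as np
import pandas as pd

# Synthetic data shaped like the question's DataFrame (illustrative only).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    'Interest': rng.choice(['Swim', 'Bike', 'Hike'], size=n),
    'Gender': rng.choice(['Male', 'Female'], size=n),
    'Experience Points': rng.integers(0, 200, size=n),
})

# Same binning as in the question: width-8 bins plus one catch-all last bin.
edges = np.append(np.arange(0, 136, 8), df['Experience Points'].max() + 1)
exp_binned = pd.cut(df['Experience Points'], edges)

counts = df.groupby([exp_binned, 'Interest', 'Gender'], observed=False).size()
male = counts.unstack(['Gender', 'Interest'])['Male']

# Simulate the problem: rows of the unstacked table in a scrambled order.
scrambled = male.sample(frac=1, random_state=0)

# Restore the bin order by sorting on the interval index.
restored = scrambled.sort_index()
print(restored.head())
```

Calling `restored.plot(kind='bar')` would then draw the bars in bin order. Sorting relies on the labels being intervals (or an ordered categorical of intervals); if the bins had been converted to plain strings, a label such as '(16, 24]' would sort before '(8, 16]' and the order would have to be given explicitly instead.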
