Pandas Series not plotting to a time series chart

I have a set of house price data (House Price Data). When I use a subset of the data in a NumPy array, I can plot it in this nice time series chart:

Desired chart BUT using Numpy Array

However, when I use the same data in a Pandas Series, the chart comes out lumpy, like this:

The lumpy chart using a Pandas Series

How can I create a smooth time series line plot (like the first image) using a Pandas Series?

Here is what I am doing to get the nice time series plot (using a NumPy array), after importing numpy as np, pandas as pd and matplotlib.pyplot as plt:

data = pd.read_csv('HPI.csv', index_col='Date', parse_dates=True)  # pull in the csv file, make the Date column the index and parse the dates
brixton = data[data['RegionName'] == 'Lambeth']  # pull out a subset for the region Lambeth
prices = brixton['AveragePrice'].values  # create a numpy array of the average price values
plt.plot(prices)  # plot
plt.show()  # show

And here is what I am doing to get the lumpy one, using a Pandas Series:

data = pd.read_csv('HPI.csv', index_col='Date', parse_dates=True)
brixton = data[data['RegionName'] == 'Lambeth']
prices_panda = brixton['AveragePrice'] 
plt.plot(prices_panda)
plt.show()

How can I get the second chart to display as a proper, smooth time series?

*** This is my first StackOverflow question so please shout if I have left anything out or not been clear ***

Any help much appreciated.

2 Answers

The dates in your file are in day/month/year format. For pandas to interpret this format correctly, you can pass the option dayfirst=True to the read_csv call.

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('data/UK-HPI-full-file-2017-08.csv', 
                   index_col='Date', parse_dates=True, dayfirst=True)
brixton = data[data['RegionName'] == 'Lambeth']
prices_panda = brixton['AveragePrice'] 
plt.plot(prices_panda)
plt.show()
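To see the effect of dayfirst=True without the full file, here is a minimal sketch on a three-row in-memory sample. The rows and prices are made up for illustration; only the column names and the day/month/year layout follow the question.

```python
import io
import pandas as pd

# Hypothetical sample in the same day/month/year layout as the question's file.
csv_text = """Date,RegionName,AveragePrice
01/01/2017,Lambeth,500000
01/02/2017,Lambeth,505000
01/03/2017,Lambeth,510000
"""

# Default parsing treats the first number as the month.
default = pd.read_csv(io.StringIO(csv_text), index_col='Date', parse_dates=True)

# dayfirst=True treats the first number as the day.
day_first = pd.read_csv(io.StringIO(csv_text), index_col='Date',
                        parse_dates=True, dayfirst=True)

default_months = list(default.index.month)
day_first_months = list(day_first.index.month)

print(default_months)    # months seen by the default parser
print(day_first_months)  # months seen with dayfirst=True
```

With the default parser every row lands in January, which is what bunches the points together on the plot; with dayfirst=True the rows fall in consecutive months.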


Answered 2024-04-28T08:22:31+08:00


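When you pass parse_dates=True, pandas reads the dates using its default convention (month-day-year). Your data is formatted according to the UK convention (day-month-year). So instead of a data point on the first of every month, your chart shows data points for the first 12 days of January and a flat line for the rest of each year. You need to rebuild the dates, for example: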

data.index = pd.to_datetime({'year': data.index.year, 'month': data.index.day, 'day': data.index.month})
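The swap above can be checked on a few hypothetical date strings, alongside an alternative that parses the raw strings with an explicit format. This is a sketch, not part of the original answer: the sample values are invented, and the swap is only safe when every misparsed date has a day of 12 or less, as is the case when all dates fall on the first of the month.

```python
import pandas as pd

# Hypothetical day/month/year strings, all on the first of the month.
raw = ['01/01/2017', '01/02/2017', '01/03/2017']

# Month-first parsing reproduces the problem: every date lands in January.
wrong = pd.to_datetime(raw, format='%m/%d/%Y')

# Swap day and month back, as in the answer above.
swapped = pd.to_datetime({'year': wrong.year,
                          'month': wrong.day,
                          'day': wrong.month})

# Alternative: parse the original strings with an explicit day-first format.
explicit = pd.to_datetime(raw, format='%d/%m/%Y')

print(list(wrong.month))     # months from the month-first parse
print(list(explicit.month))  # months from the explicit day-first parse
```

If the CSV is read without parse_dates, the same explicit format could be applied to the string index with pd.to_datetime(data.index, format='%d/%m/%Y'), which avoids relying on a default date convention.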

Answered 2024-04-28T08:22:31+08:00
