Pandas: merging two DataFrames on date yields an all-NaN column

Unfortunately, the examples from similar questions that I have gone through did not help. I have two DataFrames that I need to combine.

df1

.       Date            HIGH	        LOW		OPEN		CLOSE
0	2013-01-04	10734.23	10602.24	10604.50	10688.11
1	2013-01-07	10743.69	10589.70	10743.69	10599.01
2	2013-01-08	10602.12	10463.43	10544.21	10508.06
3	2013-01-09	10620.70	10398.61	10405.67	10578.57
4	2013-01-10	10686.12	10619.65	10635.11	10652.64
5	2013-01-11	10830.43	10748.06	10786.14	10801.57
6	2013-01-15	10952.31	10851.66	10914.65	10879.08
7	2013-01-16	10806.41	10591.30	10806.41	10600.44

df2

.        Date          sentiment
0	2013-01-01	    -0.027282
1	2013-01-02	    0.063613
2	2013-01-03	    0.091363
3	2013-01-04	    0.092818
4	2013-01-05	    -0.019002
5	2013-01-06	    -0.033752
6	2013-01-07	    0.060038
7	2013-01-08	    0.081649
8	2013-01-09	    -0.031924
9	2013-01-10	    0.109111
10	2013-01-11	  -0.057070
11	2013-01-12	  -0.052431
12	2013-01-13	  0.014726
13	2013-01-14	  0.047232
14	2013-01-15	  0.060790
15	2013-01-16	  -0.067828
16	2013-01-17	  -0.035174

Code used:

    merged_left = pd.merge(left=df1, right=df2, how='left', left_on='Date', right_on='Date')

As a result, I lose everything in the sentiment data, as shown below:

.         Date		HIGH		LOW		OPEN		CLOSE		sentiment
0	2013-01-04	10734.23	10602.24	10604.50	10688.11	NaN
1	2013-01-07	10743.69	10589.70	10743.69	10599.01	NaN
2	2013-01-08	10602.12	10463.43	10544.21	10508.06	NaN
3	2013-01-09	10620.70	10398.61	10405.67	10578.57	NaN
4	2013-01-10	10686.12	10619.65	10635.11	10652.64	NaN
5	2013-01-11	10830.43	10748.06	10786.14	10801.57	NaN
6	2013-01-15	10952.31	10851.66	10914.65	10879.08	NaN
7	2013-01-16	10806.41	10591.30	10806.41	10600.44	NaN

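It should look like the following. df2 is the larger DataFrame, with 2157 rows, and many of its dates are not in df1 (1447 rows). Those dates are not needed; basically I only want the sentiment data for the dates that exist in df1.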

.       Date		HIGH		LOW		OPEN		CLOSE		sentiment
0	2013-01-04	10734.23	10602.24	10604.50	10688.11	0.092818
1	2013-01-07	10743.69	10589.70	10743.69	10599.01	0.060038
2	2013-01-08	10602.12	10463.43	10544.21	10508.06	0.081649
3	2013-01-09	10620.70	10398.61	10405.67	10578.57	-0.031924
4	2013-01-10	10686.12	10619.65	10635.11	10652.64	0.109111
5	2013-01-11	10830.43	10748.06	10786.14	10801.57	-0.057070
6	2013-01-15	10952.31	10851.66	10914.65	10879.08	0.060790
7	2013-01-16	10806.41	10591.30	10806.41	10600.44	-0.067828

Any help would be greatly appreciated... I have been stuck on this problem all weekend.

1 Answer

  • 0

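    The problem is that both `Date` columns need to be datetimes before merging. If the key values in the two frames do not compare equal (for example strings in one frame and date objects in the other), no rows match and `sentiment` comes back as NaN. Also, `pd.merge` performs an inner join by default, so `how='inner'` can be omitted; an inner join keeps only the dates present in both frames.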

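    import pandas as pd
    # Convert both key columns to datetimes so that equal dates compare equal
    df1['Date'] = pd.to_datetime(df1['Date'])
    df2['Date'] = pd.to_datetime(df2['Date'])
    merged_left = pd.merge(df1, df2, on='Date')  # how='inner' is the default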
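
To see the effect end to end, here is a minimal sketch using a few rows from the question. The question does not show the actual dtypes, so the setup is an assumption: `df1['Date']` is taken to hold plain strings and `df2['Date']` to hold `datetime.date` objects, which is one plausible way to end up with an all-NaN `sentiment` column. The names `broken` and `fixed` are illustrative only, and only the `CLOSE` column is kept for brevity.

```python
import datetime as dt

import pandas as pd

# A few price rows from the question. ASSUMPTION: dates are stored as strings.
df1 = pd.DataFrame({
    "Date": pd.Series(["2013-01-04", "2013-01-07", "2013-01-08"], dtype="object"),
    "CLOSE": [10688.11, 10599.01, 10508.06],
})

# Sentiment rows for 2013-01-03 .. 2013-01-08 from the question.
# ASSUMPTION: dates are stored as datetime.date objects (dtype "object").
df2 = pd.DataFrame({
    "Date": [dt.date(2013, 1, day) for day in range(3, 9)],
    "sentiment": [0.091363, 0.092818, -0.019002, -0.033752, 0.060038, 0.081649],
})

# A string never compares equal to a date object, so the left join finds
# no partner rows and fills sentiment with NaN.
# indicator=True adds a "_merge" column that shows where each row came from.
broken = pd.merge(df1, df2, how="left", on="Date", indicator=True)
print(broken)

# Convert both key columns to datetimes, then merge again.
df1["Date"] = pd.to_datetime(df1["Date"])
df2["Date"] = pd.to_datetime(df2["Date"])
fixed = pd.merge(df1, df2, how="left", on="Date")
print(fixed)
```

Both `how="left"` and the default inner join return the matched rows once the keys agree. The left join additionally keeps any `df1` dates that have no sentiment value, which can be useful for spotting gaps.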
    
