Unfortunately, I have gone through several examples from similar questions to no avail. I have two DataFrames that I need to combine.
df1
. DATE HIGH LOW OPEN CLOSE
0 2013-01-04 10734.23 10602.24 10604.50 10688.11
1 2013-01-07 10743.69 10589.70 10743.69 10599.01
2 2013-01-08 10602.12 10463.43 10544.21 10508.06
3 2013-01-09 10620.70 10398.61 10405.67 10578.57
4 2013-01-10 10686.12 10619.65 10635.11 10652.64
5 2013-01-11 10830.43 10748.06 10786.14 10801.57
6 2013-01-15 10952.31 10851.66 10914.65 10879.08
7 2013-01-16 10806.41 10591.30 10806.41 10600.44
df2
. Date sentiment
0 2013-01-01 -0.027282
1 2013-01-02 0.063613
2 2013-01-03 0.091363
3 2013-01-04 0.092818
4 2013-01-05 -0.019002
5 2013-01-06 -0.033752
6 2013-01-07 0.060038
7 2013-01-08 0.081649
8 2013-01-09 -0.031924
9 2013-01-10 0.109111
10 2013-01-11 -0.057070
11 2013-01-12 -0.052431
12 2013-01-13 0.014726
13 2013-01-14 0.047232
14 2013-01-15 0.060790
15 2013-01-16 -0.067828
16 2013-01-17 -0.035174
Code used: merged_left = pd.merge(left=df1, right=df2, how='left', left_on='Date', right_on='Date')
As a result, I lose everything in the sentiment data, as shown below:
. Date HIGH LOW OPEN CLOSE sentiment
0 2013-01-04 10734.23 10602.24 10604.50 10688.11 NaN
1 2013-01-07 10743.69 10589.70 10743.69 10599.01 NaN
2 2013-01-08 10602.12 10463.43 10544.21 10508.06 NaN
3 2013-01-09 10620.70 10398.61 10405.67 10578.57 NaN
4 2013-01-10 10686.12 10619.65 10635.11 10652.64 NaN
5 2013-01-11 10830.43 10748.06 10786.14 10801.57 NaN
6 2013-01-15 10952.31 10851.66 10914.65 10879.08 NaN
7 2013-01-16 10806.41 10591.30 10806.41 10600.44 NaN
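It should look like the output below. df2 is the larger DataFrame, with 2157 rows, and many of its dates are not in df1 (1447 rows). Those dates are not needed; basically I only want the sentiment data for the dates that exist in df1: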
. Date HIGH LOW OPEN CLOSE sentiment
0 2013-01-04 10734.23 10602.24 10604.50 10688.11 0.092818
1 2013-01-07 10743.69 10589.70 10743.69 10599.01 0.060038
2 2013-01-08 10602.12 10463.43 10544.21 10508.06 0.081649
3 2013-01-09 10620.70 10398.61 10405.67 10578.57 -0.031924
4 2013-01-10 10686.12 10619.65 10635.11 10652.64 0.109111
5 2013-01-11 10830.43 10748.06 10786.14 10801.57 -0.057070
6 2013-01-15 10952.31 10851.66 10914.65 10879.08 0.060790
7 2013-01-16 10806.41 10591.30 10806.41 10600.44 -0.067828
Any help would be greatly appreciated. I have been stuck on this problem all weekend.
1 Answer
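The problem is that the Date columns in both DataFrames need to be datetimes. You also want an inner join here; since that is the default for pd.merge, how='inner' can be omitted.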
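A minimal sketch of this fix, assuming the key column is named `Date` in both frames (the df1 listing above shows `DATE`, so the name may need adjusting) and that at least one of the two columns holds strings rather than datetimes. The small frames below are cut-down stand-ins built from values in the question, not the real data:

```python
import pandas as pd

# Cut-down stand-ins for the question's frames.
# Assumption: df1 already holds datetimes while df2 holds plain strings.
df1 = pd.DataFrame({
    "Date": pd.to_datetime(["2013-01-04", "2013-01-07", "2013-01-08"]),
    "CLOSE": [10688.11, 10599.01, 10508.06],
})
df2 = pd.DataFrame({
    "Date": ["2013-01-03", "2013-01-04", "2013-01-05",
             "2013-01-06", "2013-01-07", "2013-01-08"],
    "sentiment": [0.091363, 0.092818, -0.019002,
                  -0.033752, 0.060038, 0.081649],
})

# Convert both key columns to datetimes so the keys compare equal.
# Calling pd.to_datetime on a column that is already datetime is harmless.
df1["Date"] = pd.to_datetime(df1["Date"])
df2["Date"] = pd.to_datetime(df2["Date"])

# Default how='inner': keep only the dates present in both frames.
merged = pd.merge(df1, df2, on="Date")
print(merged)
```

If rows of df1 that have no match in df2 should be kept (with NaN sentiment), pass how='left' instead. Printing df1['Date'].dtype and df2['Date'].dtype before merging is a quick way to see whether the conversion is needed.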