Merging Multiple Pandas Datasets With Non-unique Index
I have several similarly structured pandas dataframes stored in a dictionary. I access a dataframe in the following way:
ex_dict[df1]
date df1price1 df1price2
10-20-2015
Solution 1:
You can use concat followed by groupby('date') to flatten the result.
In [22]: pd.concat([df1,df2,df3]).groupby('date').max()
Out[22]:
            df1price1  df1price2  df2price1  df2price2  df3price1  df3price2
date
10-20-2015        100        150        110        140        100        150
10-21-2015         90        100         90        110        NaN        NaN
10-22-2015        100        140        NaN        NaN         90        100
10-23-2015        NaN        NaN        110        120         80        130
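A minimal, self-contained sketch of the concat-then-groupby approach. The three frames are reconstructed from the output shown above, so treat the exact values as an assumption about the original data rather than the asker's real dataframes.

```python
import pandas as pd

# Sample frames reconstructed from the output above (assumed data).
df1 = pd.DataFrame({"date": ["10-20-2015", "10-21-2015", "10-22-2015"],
                    "df1price1": [100, 90, 100],
                    "df1price2": [150, 100, 140]})
df2 = pd.DataFrame({"date": ["10-20-2015", "10-21-2015", "10-23-2015"],
                    "df2price1": [110, 90, 110],
                    "df2price2": [140, 110, 120]})
df3 = pd.DataFrame({"date": ["10-20-2015", "10-22-2015", "10-23-2015"],
                    "df3price1": [100, 90, 80],
                    "df3price2": [150, 100, 130]})

# Stacking vertically leaves NaN wherever a frame lacks a column.
stacked = pd.concat([df1, df2, df3])

# Grouping by date collapses the stacked rows; max() skips NaN,
# so each cell keeps the single non-null value for that date.
flat = stacked.groupby("date").max()
print(flat)
```

Note that max() keeps the largest value if the same date occurs more than once with different prices in the same column. If that is not what you want, another aggregation such as first() may be a better fit.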
Edit: As BrenBarn points out in the comments, you can use concat(axis=1) if you set the join column as the index of your dataframes:
df1.index = df1.date
df2.index = df2.date
df3.index = df3.date
In [44]: pd.concat([df1,df2,df3],axis=1)
Out[44]:
                  date  df1price1  df1price2        date  df2price1  \
10-20-2015  10-20-2015        100        150  10-20-2015        110
10-21-2015  10-21-2015         90        100  10-21-2015         90
10-22-2015  10-22-2015        100        140         NaN        NaN
10-23-2015         NaN        NaN        NaN  10-23-2015        110

            df2price2        date  df3price1  df3price2
10-20-2015        140  10-20-2015        100        150
10-21-2015        110         NaN        NaN        NaN
10-22-2015        NaN  10-22-2015         90        100
10-23-2015        120  10-23-2015         80        130
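The output above repeats the date column once per frame, because assigning df.index = df.date keeps the original column. A possible variation, sketched here with sample data reconstructed from the outputs in this answer (an assumption, not the asker's real data), is to use set_index('date'), which moves the column into the index instead of copying it. This sketch also assumes each frame has at most one row per date.

```python
import pandas as pd

# Sample frames reconstructed from the outputs in this answer (assumed data).
df1 = pd.DataFrame({"date": ["10-20-2015", "10-21-2015", "10-22-2015"],
                    "df1price1": [100, 90, 100],
                    "df1price2": [150, 100, 140]})
df2 = pd.DataFrame({"date": ["10-20-2015", "10-21-2015", "10-23-2015"],
                    "df2price1": [110, 90, 110],
                    "df2price2": [140, 110, 120]})
df3 = pd.DataFrame({"date": ["10-20-2015", "10-22-2015", "10-23-2015"],
                    "df3price1": [100, 90, 80],
                    "df3price2": [150, 100, 130]})

# set_index drops 'date' from the columns, so it is not duplicated.
indexed = [df.set_index("date") for df in (df1, df2, df3)]

# axis=1 aligns rows on the index (outer join by default).
wide = pd.concat(indexed, axis=1).sort_index()
print(wide)
```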
Solution 2:
You could chain multiple merge calls on the date column:
df1.merge(df2, on='date', how='outer').merge(df3, on='date', how='outer').set_index('date')
In [107]: df1.merge(df2, on='date', how='outer').merge(df3, on='date', how='outer').set_index('date')
Out[107]:
            df1price1  df1price2  df2price1  df2price2  df3price1  df3price2
date
10-20-2015        100        150        110        140        100        150
10-21-2015         90        100         90        110        NaN        NaN
10-22-2015        100        140        NaN        NaN         90        100
10-23-2015        NaN        NaN        110        120         80        130
Some explanation: first you merge df1 and df2 on the date column with an outer join. You then merge the resulting dataframe with df3 using the same arguments. Finally, you set date as the index of the resulting dataframe. If your dataframes have the date column as their index, you could first call reset_index on each of them and then merge on the name of the column containing the date.
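Since the question keeps the dataframes in a dictionary, the chained merge can be generalized to any number of frames with functools.reduce. The sketch below is one way to do it under some assumptions: the dictionary keys ('df1', 'df2', 'df3') and the sample values are hypothetical, and the frames are built with date as their index to show the reset_index step mentioned above.

```python
from functools import reduce

import pandas as pd


def make_frame(name, dates, price1, price2):
    """Build a hypothetical sample frame indexed by date."""
    return pd.DataFrame({name + "price1": price1, name + "price2": price2},
                        index=pd.Index(dates, name="date"))


# Hypothetical dictionary of date-indexed frames (assumed data).
ex_dict = {
    "df1": make_frame("df1", ["10-20-2015", "10-21-2015", "10-22-2015"],
                      [100, 90, 100], [150, 100, 140]),
    "df2": make_frame("df2", ["10-20-2015", "10-21-2015", "10-23-2015"],
                      [110, 90, 110], [140, 110, 120]),
    "df3": make_frame("df3", ["10-20-2015", "10-22-2015", "10-23-2015"],
                      [100, 90, 80], [150, 100, 130]),
}

# Turn the date index back into an ordinary column so merge can use it.
frames = [df.reset_index() for df in ex_dict.values()]

# Fold the list pairwise: ((df1 merge df2) merge df3), all outer joins.
merged = reduce(
    lambda left, right: left.merge(right, on="date", how="outer"),
    frames,
).set_index("date")
print(merged)
```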