
Pandas - Merge Two Data Frames, Create New Column, Append Values To Array

I am looking to merge two data frames on the ID they share, creating a new column in which any values from a specified column are appended to an array in the new dataframe column.

Solution 1:

Try this: group df2 by ID, collect each group's Tag values into a list, and left-merge the result onto df1:

In [35]: pd.merge(df1, df2.groupby('ID').Tag.apply(list).reset_index(), on='ID', how='left')
Out[35]:
   ID  X1  X2  X3         Tag
0   2   1   1   2       [Two]
1   1   0   1   1  [One, Two]
2   0   2   1   2         NaN
3   1   0   2   2  [One, Two]
4   0   0   2   2         NaN
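
For anyone who wants to run this end to end, here is a minimal sketch. The contents of df1 and df2 are assumptions, reconstructed so that they reproduce the output above; the original question's data is not shown here.

```python
import pandas as pd

# Assumed inputs: reconstructed to match the output shown above,
# not taken from the original question.
df1 = pd.DataFrame({
    "ID": [2, 1, 0, 1, 0],
    "X1": [1, 0, 2, 0, 0],
    "X2": [1, 1, 1, 2, 2],
    "X3": [2, 1, 2, 2, 2],
})
df2 = pd.DataFrame({"ID": [1, 1, 2], "Tag": ["One", "Two", "Two"]})

# One row per ID, with all of that ID's tags collected into a list.
tag_lists = df2.groupby("ID")["Tag"].apply(list).reset_index()

# A left merge keeps every row of df1; IDs missing from df2 get NaN.
merged = pd.merge(df1, tag_lists, on="ID", how="left")
print(merged)
```

The reset_index() call turns the grouped Series back into a two-column frame (ID, Tag), which is what gives merge() an ID column to join on.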

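Alternatively, you can use the map() method, which looks up each ID in the grouped Series and writes the result straight into a new column: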

In [38]: df1['Merged_Tags'] = df1.ID.map(df2.groupby('ID').Tag.apply(list))

In [39]: df1
Out[39]:
   ID  X1  X2  X3 Merged_Tags
0   2   1   1   2       [Two]
1   1   0   1   1  [One, Two]
2   0   2   1   2         NaN
3   1   0   2   2  [One, Two]
4   0   0   2   2         NaN
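
With map(), IDs that have no tags end up as NaN rather than as an empty list. If you would rather have a list in every row, a sketch along these lines might help. The sample data is again an assumption chosen to match the output above, and the empty-list fill is an optional extra step that is not part of the answer itself.

```python
import pandas as pd

# Assumed inputs, chosen to match the output shown above.
df1 = pd.DataFrame({
    "ID": [2, 1, 0, 1, 0],
    "X1": [1, 0, 2, 0, 0],
    "X2": [1, 1, 1, 2, 2],
    "X3": [2, 1, 2, 2, 2],
})
df2 = pd.DataFrame({"ID": [1, 1, 2], "Tag": ["One", "Two", "Two"]})

# Series indexed by ID whose values are lists of tags.
tag_lists = df2.groupby("ID")["Tag"].apply(list)

# map() looks each ID up in the index of tag_lists; unknown IDs give NaN.
mapped = df1["ID"].map(tag_lists)

# Optional: replace the NaN placeholders with empty lists.
df1["Merged_Tags"] = mapped.apply(lambda v: v if isinstance(v, list) else [])
print(df1)
```

A lambda is used for the fill because fillna() does not accept a list as the fill value.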

Solution 2:

>>> df1.join(df2.groupby('ID').Tag.apply(lambda group: list(group)), on='ID')

   ID  X1  X2  X3         Tag
0   1   1   0   2  [One, Two]
1   0   1   0   1         NaN
2   0   1   2   2         NaN
3   1   2   2   0  [One, Two]
4   2   1   0   0       [Two]
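
The join() version works because the grouped result is a Series named Tag and indexed by ID, so on='ID' matches df1's ID column against that index. Below is a small sketch of the same idea that also renames the Series to control the name of the new column. The sample data is assumed, picked to match the output above, and Merged_Tags is just an illustrative column name.

```python
import pandas as pd

# Assumed inputs, chosen to match the output shown above.
df1 = pd.DataFrame({
    "ID": [1, 0, 0, 1, 2],
    "X1": [1, 1, 1, 2, 1],
    "X2": [0, 0, 2, 2, 0],
    "X3": [2, 1, 2, 0, 0],
})
df2 = pd.DataFrame({"ID": [1, 1, 2], "Tag": ["One", "Two", "Two"]})

# Group tags per ID, then rename the Series so the joined
# column is called Merged_Tags instead of Tag.
tag_lists = df2.groupby("ID")["Tag"].apply(list).rename("Merged_Tags")

# join() matches df1's ID column against the index of tag_lists
# and defaults to a left join, so every row of df1 is kept.
result = df1.join(tag_lists, on="ID")
print(result)
```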
