
Pandas - Update/merge 2 Dataframes Based On Multiple Matching Column Values

I have 2 dataframes, left_df and right_df, which both have 20 columns with identical names and dtypes. right_df also has 2 additional columns with unique values on every row. I want

Solution 1:

Worked it out thanks to this post and the Pandas documentation:

First, what I need is a .merge. I set the suffixes so that only the overlapping columns coming from right_df (the old values I'm replacing) get '_r' appended, while the columns from left_df keep their names:

merged_df = pd.merge(left_df, right_df, on=['col_0', 'col_1'], suffixes=(None, '_r'))

Because pd.merge defaults to an inner join, this yields a new dataframe containing only the rows whose col_0 and col_1 values match in both dataframes, and each row carries both the new and the old ('_r') columns. Then I drop the "old" columns by using a regex filter on the '_r' suffix:

merged_df.drop(list(merged_df.filter(regex='_r$')), axis=1, inplace=True)
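
The two steps above can be tried end to end on a small example. This is only a sketch: the sample data, the single extra column uid, and the helper name columns_after_merge are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data: left_df holds the new values, right_df holds the
# old values plus one extra column (uid) that exists only on the right.
left_df = pd.DataFrame({
    "col_0": [0, 1, 2, 3],
    "col_1": ["A", "B", "C", "D"],
    "col_2": ["new", "new", "new", "new"],
})
right_df = pd.DataFrame({
    "col_0": [0, 1, 2, 4],
    "col_1": ["A", "B", "C", "E"],
    "col_2": ["old", "old", "old", "old"],
    "uid": ["uid_0", "uid_1", "uid_2", "uid_4"],
})

# Inner join on both key columns. Overlapping non-key columns from right_df
# get the '_r' suffix; the ones from left_df keep their names.
merged_df = pd.merge(left_df, right_df, on=["col_0", "col_1"], suffixes=(None, "_r"))
columns_after_merge = list(merged_df.columns)

# Drop every column whose name ends in '_r' (the old values).
merged_df = merged_df.drop(columns=merged_df.filter(regex="_r$").columns)

print(columns_after_merge)
print(merged_df)
```

Only the three key pairs present in both frames survive the join; the row (3, D) from left_df and the row (4, E) from right_df are dropped.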

This yields a dataframe with only the "modified" rows and no unmodified rows, which is close enough for what I need.

   col_0 col_1 col_2 col_3 col_4 col_5 col_6 col_7  col_8  col_9
0      0     A   new   new   new   new   new   new  uid_0  uid_a
1      1     B   new   new   new   new   new   new  uid_1  uid_b
2      2     C   new   new   new   new   new   new  uid_2  uid_c
3      4     E   new   new   new   new   new   new  uid_4  uid_e
4      5     F   new   new   new   new   new   new  uid_5  uid_f
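
If the rows of left_df that have no match in right_df should be kept as well, one possible variation (not part of the answer above) is a left join. The sketch below uses made-up sample data and the hypothetical name all_rows_df; the indicator=True flag adds a _merge column that records whether each row found a match.

```python
import pandas as pd

# Hypothetical sample data; (3, D) exists only in left_df, (4, E) only in right_df.
left_df = pd.DataFrame({
    "col_0": [0, 1, 3],
    "col_1": ["A", "B", "D"],
    "col_2": ["new", "new", "new"],
})
right_df = pd.DataFrame({
    "col_0": [0, 1, 4],
    "col_1": ["A", "B", "E"],
    "col_2": ["old", "old", "old"],
    "uid": ["uid_0", "uid_1", "uid_4"],
})

# how="left" keeps every row of left_df; unmatched rows get NaN in the
# columns that come only from right_df.
all_rows_df = pd.merge(
    left_df,
    right_df,
    on=["col_0", "col_1"],
    how="left",
    suffixes=(None, "_r"),
    indicator=True,  # adds a '_merge' column: 'both' or 'left_only'
)

# Drop the old values, as in the inner-join version.
all_rows_df = all_rows_df.drop(columns=all_rows_df.filter(regex="_r$").columns)

print(all_rows_df)
```

Rows marked left_only in _merge are the ones without a counterpart in right_df, so their uid is missing.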

Solution 2:

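If the rows of both dataframes are already in the same order and share the same index, you can append the last two columns of right_df by position. Note that pd.concat with axis=1 aligns rows on the index, not on the values in col_0 and col_1:

new_df = pd.concat([left_df, right_df.iloc[:, -2:]], axis=1)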
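
A small sketch of what positional concatenation does, using made-up data in which right_df lists the same keys in a different order. The column names uid_x and uid_y and the variables empty_slice and last_two are hypothetical. It also shows that an iloc slice whose start lies after its stop, such as -1:-3, selects no columns, while -2: selects the last two.

```python
import pandas as pd

# Hypothetical data: same key pairs in both frames, but in a different row order.
left_df = pd.DataFrame({"col_0": [0, 1], "col_1": ["A", "B"]})
right_df = pd.DataFrame({
    "col_0": [1, 0],
    "col_1": ["B", "A"],
    "uid_x": ["uid_1", "uid_0"],
    "uid_y": ["uid_b", "uid_a"],
})

empty_slice = right_df.iloc[:, -1:-3]  # start is after stop: no columns selected
last_two = right_df.iloc[:, -2:]       # the last two columns

# axis=1 pairs rows by index label, so row 0 of left_df is joined with
# row 0 of right_df regardless of the values in col_0 and col_1.
new_df = pd.concat([left_df, last_two], axis=1)

print(empty_slice.shape)
print(new_df)
```

Here the row with key (0, A) ends up next to uid_1 and uid_b, which belong to (1, B), so this approach is only safe when the row order is known to match.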
