Pandas - Update/merge 2 Dataframes Based On Multiple Matching Column Values
I have 2 dataframes, left_df and right_df, which both have 20 columns with identical names and dtypes. right_df also has 2 additional columns with unique values on every row. I want
Solution 1:
Worked it out thanks to this post and the Pandas documentation:
First, it's a .merge I need, and I specify the suffixes so that '_r' is appended only to the overlapping columns coming from right_df, i.e. the old values I'm updating:
merged_df = pd.merge(left_df, right_df, on=['col_0', 'col_1'], suffixes=(None, '_r'))
This yields a new dataframe with rows containing both the new and old columns, only for rows where the values in the columns on=['col_0', 'col_1'] match in both dataframes. Then I drop the "old" columns by using a regex filter on the text '_r':
merged_df.drop(list(merged_df.filter(regex='_r$')), axis=1, inplace=True)
This yields a dataframe with only the "modified" rows and no unmodified rows, which is close enough for what I need.
   col_0 col_1 col_2 col_3 col_4 col_5 col_6 col_7  col_8  col_9
0      0     A   new   new   new   new   new   new  uid_0  uid_a
1      1     B   new   new   new   new   new   new  uid_1  uid_b
2      2     C   new   new   new   new   new   new  uid_2  uid_c
3      4     E   new   new   new   new   new   new  uid_4  uid_e
4      5     F   new   new   new   new   new   new  uid_5  uid_f
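For anyone who wants to try this end to end, here is a minimal sketch of the merge-then-drop steps on made-up data. The sample frames are assumptions for illustration only: 4 shared columns instead of 20, and the two extra right_df columns are called col_4 and col_5 here. The last few lines show one possible variation, not part of the original answer, that also keeps the unmodified rows by using DataFrame.update on the key columns (this assumes the key pairs are unique in both frames).

```python
import pandas as pd

keys = ["col_0", "col_1"]

# Hypothetical sample data: left_df holds the new values, right_df holds the
# old values plus two extra columns with unique ids (col_4 and col_5 here).
left_df = pd.DataFrame({
    "col_0": [0, 1, 2, 4, 5],
    "col_1": ["A", "B", "C", "E", "F"],
    "col_2": ["new"] * 5,
    "col_3": ["new"] * 5,
})
right_df = pd.DataFrame({
    "col_0": [0, 1, 2, 3, 4, 5],
    "col_1": ["A", "B", "C", "D", "E", "F"],
    "col_2": ["old"] * 6,
    "col_3": ["old"] * 6,
    "col_4": [f"uid_{i}" for i in range(6)],
    "col_5": [f"uid_{c}" for c in "abcdef"],
})

# Inner merge on the key columns; overlapping right_df columns get '_r'.
merged_df = pd.merge(left_df, right_df, on=keys, suffixes=(None, "_r"))

# Drop the old values, i.e. every column whose name ends in '_r'.
merged_df = merged_df.drop(columns=merged_df.filter(regex="_r$").columns)
print(merged_df)

# Variation (an assumption, not the original answer): keep unmatched rows too.
# update() aligns on the index and overwrites only where left_df has a value.
updated_df = right_df.set_index(keys)
updated_df.update(left_df.set_index(keys))
updated_df = updated_df.reset_index()
print(updated_df)
```

In this sketch merged_df contains only the five matched rows, while updated_df keeps all six rows of right_df and leaves the unmatched one with its old values.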
Solution 2:
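Try this (it takes the last 2 columns of right_df and lines rows up by index, not by matching col_0/col_1, so both dataframes need the same rows in the same order):
new_df = pd.concat([left_df, right_df.iloc[:, -2:]], axis=1)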
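A small sketch of how the positional slice and the concat behave, on made-up frames (the column names uid_x and uid_y are hypothetical). It illustrates two points: a slice like -1:-3 selects no columns because its start lies after its stop, whereas -2: selects the last two, and pd.concat with axis=1 pairs rows by index label only.

```python
import pandas as pd

# Hypothetical frames with the same rows in the same order.
left_df = pd.DataFrame({
    "col_0": [0, 1, 2],
    "col_1": ["A", "B", "C"],
    "col_2": ["new"] * 3,
})
right_df = pd.DataFrame({
    "col_0": [0, 1, 2],
    "col_1": ["A", "B", "C"],
    "col_2": ["old"] * 3,
    "uid_x": ["uid_0", "uid_1", "uid_2"],
    "uid_y": ["uid_a", "uid_b", "uid_c"],
})

# Start (-1) comes after stop (-3), so this slice contains no columns.
empty_slice = right_df.iloc[:, -1:-3]
print(empty_slice.shape)

# The last two columns of right_df.
last_two = right_df.iloc[:, -2:]

# Column-wise concat: rows are paired by index label, not by key values.
new_df = pd.concat([left_df, last_two], axis=1)
print(new_df)
```

If the two frames differ in row count or order, a key-based merge is probably the safer choice, since concat would silently pair unrelated rows or fill gaps with NaN.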