Merging List Of Dfs With Alternating Columns Output Using Pandas
I have the following code:
import pandas as pd
rep1 = pd.DataFrame.from_items([('Probe', ['x', 'y', 'z']), ('Gene', ['foo', 'bar', 'qux']), ('RP1',[1.00,23.22,11.12]),('RP1',['A'
Solution 1:
You could dedupe the column names. Here's a kind of hacky way:
In [11]: list(rep1.columns[0:2]) + [rep1.columns[2] + "_value"] + [rep1.columns[2] + "_letter"]
Out[11]: ['Probe', 'Gene', 'RP1_value', 'RP1_letter']
In [12]: for rep in tmp:
.....:     rep.columns = list(rep.columns[0:2]) + [rep.columns[2] + "_value"] + [rep.columns[2] + "_letter"]
In [13]: reduce(pd.merge,tmp)
Out[13]:
Probe Gene RP1_value RP1_letter RP2_value RP2_letter RP3_value RP3_letter
0 x foo 1.00 A 3.33 G 99.99 M
1 y bar 23.22 B 77.22 I 98.29 P
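For anyone on a current pandas, here is a rough sketch of the same renaming step as a reusable function. `DataFrame.from_items` was removed in pandas 1.0, so the frame is built with the `columns=` argument instead. The helper name `dedupe_columns` and its parameters are made up for illustration, and the sample values are rebuilt from the output above because the question's code is cut off.

```python
import pandas as pd

# Sample frame rebuilt from the output shown above (assumption: the original
# definition is truncated). Passing columns= allows the duplicate 'RP1' name.
rep1 = pd.DataFrame(
    [["x", "foo", 1.00, "A"], ["y", "bar", 23.22, "B"], ["z", "qux", 11.12, "C"]],
    columns=["Probe", "Gene", "RP1", "RP1"],
)


def dedupe_columns(df, n_keys=2, suffixes=("_value", "_letter")):
    """Hypothetical helper: suffix the columns after the first n_keys, by position."""
    keys = list(df.columns[:n_keys])
    rest = [f"{name}{suffix}" for name, suffix in zip(df.columns[n_keys:], suffixes)]
    out = df.copy()  # leave the caller's frame untouched
    out.columns = keys + rest
    return out


renamed = dedupe_columns(rep1)
print(list(renamed.columns))
```

Returning a copy rather than assigning to `rep.columns` in a loop keeps the original frames intact, which helps if you want to re-run the cell.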
pd.merge defaults to an inner join, which drops any probe that is missing from one of the frames. To keep those rows (filled with NaN), specify an outer merge:
In [21]: reduce(lambda x, y: pd.merge(x, y, how='outer'), tmp)
Out[21]:
Probe Gene RP1_value RP1_letter RP2_value RP2_letter RP3_value RP3_letter
0 x foo 1.00 A 3.33 G 99.99 M
1 y bar 23.22 B 77.22 I 98.29 P
2 z qux 11.12 C 18.12 K NaN NaN
3 k kux NaN NaN NaN NaN 8.10 J
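On Python 3, `reduce` is no longer a builtin and has to be imported from `functools`. Below is a minimal sketch of the outer merge under that assumption. The three frames are rebuilt from the merged output above with the column names already de-duplicated, and `merge_all` and `KEYS` are illustrative names only. Passing `on=` explicitly is a choice made here so the join keys do not depend on which column names happen to overlap.

```python
from functools import reduce

import pandas as pd

# Frames rebuilt from the merged output above (assumption), with unique column names.
rep1 = pd.DataFrame({"Probe": ["x", "y", "z"], "Gene": ["foo", "bar", "qux"],
                     "RP1_value": [1.00, 23.22, 11.12], "RP1_letter": ["A", "B", "C"]})
rep2 = pd.DataFrame({"Probe": ["x", "y", "z"], "Gene": ["foo", "bar", "qux"],
                     "RP2_value": [3.33, 77.22, 18.12], "RP2_letter": ["G", "I", "K"]})
rep3 = pd.DataFrame({"Probe": ["x", "y", "k"], "Gene": ["foo", "bar", "kux"],
                     "RP3_value": [99.99, 98.29, 8.10], "RP3_letter": ["M", "P", "J"]})
tmp = [rep1, rep2, rep3]

KEYS = ["Probe", "Gene"]  # join keys shared by every frame


def merge_all(frames, how="outer"):
    """Fold a list of frames into one by merging pairwise on KEYS."""
    return reduce(lambda left, right: pd.merge(left, right, on=KEYS, how=how), frames)


merged = merge_all(tmp)                # keeps every probe, NaN where a frame lacks it
inner = merge_all(tmp, how="inner")    # only probes present in all three frames
print(merged)
```

Row order in the outer result can differ between pandas versions, so sort on `Probe` afterwards if the order matters.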