How To Create A New Column For Transposed Data
I'm trying to transpose rows into new columns in a pandas DataFrame, where Visit ID is the unique identifier. I tried df.pivot and df.melt, but df.melt seems to do the opposite.
Solution 1:
You can use merge:
out = pd.merge(df[df['Primary or Secondary'] == 'Primary'],
df[df['Primary or Secondary'] == 'Secondary'],
on='Visit ID', suffixes=('', '2'))
The rest is just reformatting:
out = out[['Visit ID', 'DX Code', 'DX Code2', 'Insurance', 'Insurance2']] \
.rename(columns={'Insurance': 'Primary', 'Insurance2': 'Secondary'})
>>> out
   Visit ID  DX Code  DX Code2   Primary Secondary
0         1      123       234     Aetna  Affinity
1         2      789       456  Medicare       VNS
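For anyone who wants to run this end to end, here is a minimal self-contained sketch of the merge approach. The question's input frame isn't shown, so the sample data below is an assumption: it reuses the four rows from the second answer, with the column names the merge call expects.

```python
import pandas as pd

# Assumed sample input (not shown in the question): one row per
# (visit, insurance) pair, flagged as Primary or Secondary.
df = pd.DataFrame({
    "Visit ID": [1, 1, 2, 2],
    "DX Code": [123, 234, 456, 789],
    "Insurance": ["Aetna", "Affinity", "VNS", "Medicare"],
    "Primary or Secondary": ["Primary", "Secondary", "Secondary", "Primary"],
})

# Split into primary and secondary rows, then join them back on the visit.
# Overlapping columns from the secondary side get the suffix "2".
primary = df[df["Primary or Secondary"] == "Primary"]
secondary = df[df["Primary or Secondary"] == "Secondary"]
out = pd.merge(primary, secondary, on="Visit ID", suffixes=("", "2"))

# Keep only the columns of interest and give the insurance columns clearer names.
out = out[["Visit ID", "DX Code", "DX Code2", "Insurance", "Insurance2"]].rename(
    columns={"Insurance": "Primary", "Insurance2": "Secondary"}
)
print(out)
```

Note that this is an inner join, so a visit with only a primary (or only a secondary) row would be dropped; passing how='left' to merge would keep primary-only visits.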
Solution 2:
You can use datar, which uses pandas as its backend but implements dplyr-like syntax:
>>> from datar.all import c, f, tribble, tibble, rep, paste0, pivot_wider
>>>
>>> df = tribble(
... f.Visit_ID, f.DX_Code, f.Insurance, f.Primary_or_Secondary,
... 1, 123, "Aetna", "Primary",
... 1, 234, "Affinity", "Secondary",
... 2, 456, "VNS", "Secondary",
... 2, 789, "Medicare", "Primary",
... )
>>> df
Visit_ID DX_Code Insurance Primary_or_Secondary
<int64> <int64> <object> <object>
0 1 123 Aetna Primary
1 1 234 Affinity Secondary
2 2 456 VNS Secondary
3 2 789 Medicare Primary
>>> # Create a new df with names and values
>>> df2 = tibble(
... Visit_ID=rep(df.Visit_ID, 2),
... name=c(paste0("DX Code", rep(c("", "2"), 2)), df.Primary_or_Secondary),
... value=c(df.DX_Code, df.Insurance)
... )
>>>
>>> df2
Visit_ID name value
<int64> <object> <object>
0 1 DX Code 123
1 1 DX Code2 234
2 2 DX Code 456
3 2 DX Code2 789
4 1 Primary Aetna
5 1 Secondary Affinity
6 2 Secondary VNS
7 2 Primary Medicare
>>> df2 >> pivot_wider()
Visit_ID DX Code DX Code2 Primary Secondary
<int64> <object> <object> <object> <object>
0 1 123 234 Aetna Affinity
1 2 456 789 Medicare VNS
Disclaimer: I am the author of the datar package.
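Since the question mentions df.pivot, here is a rough pandas-only sketch of the same long-to-wide reshape. The sample data and the final column names are assumptions, chosen to match the layout produced in Solution 1; it also assumes each visit has at most one Primary and one Secondary row, since pivot raises an error on duplicate index/column pairs.

```python
import pandas as pd

# Assumed sample input, same rows as in the examples above.
df = pd.DataFrame({
    "Visit ID": [1, 1, 2, 2],
    "DX Code": [123, 234, 456, 789],
    "Insurance": ["Aetna", "Affinity", "VNS", "Medicare"],
    "Primary or Secondary": ["Primary", "Secondary", "Secondary", "Primary"],
})

# Pivot both value columns at once; the result has MultiIndex columns
# such as ("DX Code", "Primary") and ("Insurance", "Secondary").
wide = df.pivot(
    index="Visit ID",
    columns="Primary or Secondary",
    values=["DX Code", "Insurance"],
)

# Flatten the MultiIndex into the desired single-level names
# (this mapping is illustrative, not part of the original answers).
names = {
    ("DX Code", "Primary"): "DX Code",
    ("DX Code", "Secondary"): "DX Code2",
    ("Insurance", "Primary"): "Primary",
    ("Insurance", "Secondary"): "Secondary",
}
wide.columns = [names[col] for col in wide.columns]
wide = wide.reset_index()
print(wide)
```

Because the two value columns have different types, the pivoted columns come back with object dtype; cast the DX Code columns back to integers if that matters downstream.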