
How To Create A New Column For Transposed Data

I'm trying to transpose rows into new columns in a pandas DataFrame, so that each Visit ID (the unique identifier) ends up on a single row. I tried df.pivot and df.melt, but df.melt seems to do the opposite.

Solution 1:

You can use merge to join the Primary rows to the Secondary rows on Visit ID:

out = pd.merge(df[df['Primary or Secondary'] == 'Primary'],
               df[df['Primary or Secondary'] == 'Secondary'],
               on='Visit ID', suffixes=('', '2'))

The rest is just reformatting:

out = out[['Visit ID', 'DX Code', 'DX Code2', 'Insurance', 'Insurance2']] \
          .rename(columns={'Insurance': 'Primary', 'Insurance2': 'Secondary'})
>>> out
   Visit ID  DX Code  DX Code2   Primary Secondary
0         1      123       234     Aetna  Affinity
1         2      789       456  Medicare       VNS
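To try the merge approach end to end, here is a minimal self-contained sketch. The input frame is an assumption: the question's data is not shown here, so it is rebuilt from the sample table in Solution 2, using the space-separated column names that Solution 1 expects.

```python
import pandas as pd

# Assumed sample data, rebuilt from the table shown in Solution 2.
df = pd.DataFrame({
    "Visit ID": [1, 1, 2, 2],
    "DX Code": [123, 234, 456, 789],
    "Insurance": ["Aetna", "Affinity", "VNS", "Medicare"],
    "Primary or Secondary": ["Primary", "Secondary", "Secondary", "Primary"],
})

# Split the rows by role, then join them side by side on the visit key.
primary = df[df["Primary or Secondary"] == "Primary"]
secondary = df[df["Primary or Secondary"] == "Secondary"]
out = pd.merge(primary, secondary, on="Visit ID", suffixes=("", "2"))

# Keep only the columns of interest and give the insurance columns clearer names.
out = (
    out[["Visit ID", "DX Code", "DX Code2", "Insurance", "Insurance2"]]
    .rename(columns={"Insurance": "Primary", "Insurance2": "Secondary"})
)
print(out)
```

Note that merge defaults to an inner join, so a visit that has only a Primary row or only a Secondary row would be dropped. Passing how="left" keeps every Primary row and leaves the Secondary columns empty where no match exists.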

Solution 2:

You can use datar, which uses pandas as its backend but implements a dplyr-like syntax:

>>> from datar.all import c, f, tribble, tibble, rep, paste0, pivot_wider
>>> 
>>> df = tribble(
...     f.Visit_ID, f.DX_Code, f.Insurance, f.Primary_or_Secondary,
...     1,          123,       "Aetna",     "Primary",
...     1,          234,       "Affinity",  "Secondary",
...     2,          456,       "VNS",       "Secondary",
...     2,          789,       "Medicare",  "Primary",
... )
>>> df
   Visit_ID  DX_Code Insurance Primary_or_Secondary
    <int64>  <int64>  <object>             <object>
0         1      123     Aetna              Primary
1         1      234  Affinity            Secondary
2         2      456       VNS            Secondary
3         2      789  Medicare              Primary

>>> # Create a new df with names and values
>>> df2 = tibble(
...     Visit_ID=rep(df.Visit_ID, 2),
...     name=c(paste0("DX Code", rep(c("", "2"), 2)), df.Primary_or_Secondary),
...     value=c(df.DX_Code, df.Insurance)
... )
>>> 
>>> df2
   Visit_ID       name     value
    <int64>   <object>  <object>
0         1    DX Code       123
1         1   DX Code2       234
2         2    DX Code       456
3         2   DX Code2       789
4         1    Primary     Aetna
5         1  Secondary  Affinity
6         2  Secondary       VNS
7         2    Primary  Medicare
>>> df2 >> pivot_wider()
   Visit_ID  DX Code DX Code2   Primary Secondary
    <int64> <object> <object>  <object>  <object>
0         1      123      234     Aetna  Affinity
1         2      456      789  Medicare       VNS

Disclaimer: I am the author of the datar package.
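For readers without datar installed, the final long-to-wide step can also be sketched in plain pandas with DataFrame.pivot. This is an illustration, not part of the answer above: the long frame is copied by hand from the df2 output, and the names long_df and wide are hypothetical.

```python
import pandas as pd

# Long-format frame copied from the df2 output above: one name/value pair per row.
long_df = pd.DataFrame({
    "Visit_ID": [1, 1, 2, 2, 1, 1, 2, 2],
    "name": ["DX Code", "DX Code2", "DX Code", "DX Code2",
             "Primary", "Secondary", "Secondary", "Primary"],
    "value": [123, 234, 456, 789, "Aetna", "Affinity", "VNS", "Medicare"],
})

# pivot spreads each distinct "name" into its own column, one row per Visit_ID.
wide = (
    long_df.pivot(index="Visit_ID", columns="name", values="value")
    .rename_axis(columns=None)  # drop the leftover "name" label on the column axis
    .reset_index()
)
print(wide)
```

As in the datar example, the DX codes here are numbered by row order within each visit rather than by the Primary/Secondary flag, so for visit 2 the DX Code and DX Code2 values come out swapped relative to Solution 1.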

