Replace Nan In One Column With The Value From Another Column In Pandas: What's Wrong With My Code
I have a DataFrame like the one below. I need to replace each NaN in column a with the corresponding value from column b in the same row. df = pd.DataFrame({'a': [1,2,3,4,np.nan, np.nan, 5
Solution 1:
You need to pass axis=1 so that the function is applied to each row, and you'll also have to test for missing values with pd.isnull(row['a']):
In [6]: df.apply(lambda row: row['b'] if pd.isnull(row['a']) else row['a'], axis=1)
Out[6]:
0 1.0
1 2.0
2 3.0
3 4.0
4 8.0
5 9.0
6 5.0
dtype: float64
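The answer does not spell out why pd.isnull is needed. One likely reason, assuming the original code compared the value against np.nan directly, is that NaN never compares equal to anything, including itself. A minimal sketch of that behaviour (the variable names here are illustrative only):

```python
import numpy as np
import pandas as pd

nan = float("nan")

# NaN is never equal to anything, not even to itself,
# so an equality test can never detect it.
equal_to_itself = (nan == nan)

# pd.isnull detects missing values reliably, for scalars...
detected = pd.isnull(nan)

# ...and element-wise for a whole Series.
s = pd.Series([1.0, np.nan])
mask_eq = (s == np.nan).tolist()      # equality finds nothing
mask_isnull = s.isnull().tolist()     # isnull flags the second element
```

Here mask_eq contains no True entries at all, while mask_isnull marks only the missing element.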
That said, you shouldn't be using .apply in the first place. Use fillna instead:
In [9]: df.a.fillna(df.b)
Out[9]:
0 1.0
1 2.0
2 3.0
3 4.0
4 8.0
5 9.0
6 5.0
Name: a, dtype: float64
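Note that fillna returns a new Series and leaves df untouched, so the result has to be assigned back if column a itself should change. A short sketch using the same sample data as in this thread:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, np.nan, np.nan, 5],
                   'b': [4, 5, 6, 7, 8, 9, 1]})

# fillna aligns df['b'] on the index and fills only the missing
# positions of df['a']; assigning back updates the column.
df['a'] = df['a'].fillna(df['b'])

result = df['a'].tolist()
remaining_nans = int(df['a'].isnull().sum())
```

After the assignment, column a holds no NaN values and column b is unchanged.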
More generally, for any predicate, use pd.Series.where, which keeps the original value where the condition is True and takes the replacement elsewhere:
In [32]: df.a.where(pd.notnull, df.b)
Out[32]:
0 1.0
1 2.0
2 3.0
3 4.0
4 8.0
5 9.0
6 5.0
Name: a, dtype: float64
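To illustrate the "any predicate" point, here is a sketch with a different condition. The threshold of 2 is an arbitrary choice for the example and not part of the original question: values of a are kept only where they exceed 2, and every other position, including the NaN rows, takes its value from b.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, np.nan, np.nan, 5],
                   'b': [4, 5, 6, 7, 8, 9, 1]})

# Boolean mask: True where the value of a should be kept.
# Comparisons with NaN evaluate to False, so NaN rows fall through to b.
keep = df['a'] > 2

# where keeps a where the mask is True and uses b elsewhere.
picked = df['a'].where(keep, df['b'])

result = picked.tolist()
```

The first two rows (1 and 2) and the two NaN rows are replaced by the values from b, while 3, 4 and 5 are kept.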
Solution 2:
You must pass axis=1 to operate on rows. This code works for me:
import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1, 2, 3, 4, np.nan, np.nan, 5],
                   'b': [4, 5, 6, 7, 8, 9, 1]})
# axis=1 passes each row to the lambda; take b only where a is missing
df['a'] = df.apply(lambda row: row['b'] if pd.isnull(row['a']) else row['a'], axis=1)
df