
Iterrows Performance

I'm working with Python 2.7 and pandas (version 0.18.1) data frames. I have to modify a column in the data frame based on several other columns in the same data frame. For that I have writt

Solution 1:

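Assuming your empty cells are NaN values, this gives you the first positive, non-NaN value of each row for the group of columns you are interested in. df[df>0] turns every non-positive value into NaN, and bfill(axis=1) then pulls the first remaining value of each row into the first column: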

df[df>0][columns1].bfill(axis=1).iloc[:,0]

0     NaN
1     NaN
2     NaN
3     NaN
4     NaN
5    20.0
6     NaN
7    20.0
8     NaN
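To see what each step of that expression does, here is a minimal sketch on a small made-up frame. The column names a1, a2, a3 and the values are assumptions for illustration only, not the data from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: blanks are NaN, and one cell holds a non-positive value.
df = pd.DataFrame({
    "a1": [np.nan, 20.0, np.nan],
    "a2": [np.nan, np.nan, 0.0],
    "a3": [np.nan, 35.0, 7.0],
})
columns1 = ["a1", "a2", "a3"]  # hypothetical column group

# df[df > 0] keeps positive values and replaces everything else with NaN.
masked = df[df > 0][columns1]

# bfill(axis=1) copies each value leftwards into the NaN cells before it,
# so column 0 ends up holding the first non-NaN value of every row.
first_valid = masked.bfill(axis=1).iloc[:, 0]
print(first_valid)
```

Row 0 has no usable value and stays NaN, row 1 yields 20.0 (the leftmost value), and row 2 yields 7.0 because the 0.0 in a2 was masked out first.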

Thus, this will give you the abs(a-b) you're searching for:

res = (df[df>0][columns1].bfill(axis=1).iloc[:,0] - df[df>0][columns2].bfill(axis=1).iloc[:,0]).abs()
res

0        NaN
1        NaN
2        NaN
3        NaN
4        NaN
5    22977.5
6        NaN
7        NaN
8        NaN
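The same computation can be wrapped in a small helper so the two column groups are handled identically. This is only a sketch: the helper name first_positive, the column names, and the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two column groups (a* and b*).
df = pd.DataFrame({
    "a1": [np.nan, 20.0, np.nan],
    "a2": [np.nan, np.nan, 50.0],
    "b1": [np.nan, 5.5, np.nan],
    "b2": [10.0, 8.0, np.nan],
})
columns1 = ["a1", "a2"]
columns2 = ["b1", "b2"]


def first_positive(frame, cols):
    """Return the first positive, non-NaN value per row across cols."""
    return frame[frame > 0][cols].bfill(axis=1).iloc[:, 0]


# Subtraction aligns on the row index; NaN on either side gives NaN.
res = (first_positive(df, columns1) - first_positive(df, columns2)).abs()
print(res)
```

Only row 1 has a value in both groups, so it is the only row with a result (|20.0 - 5.5| = 14.5). Rows where either group is entirely empty remain NaN, which matches the pattern in the output above.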

You can either combine it with your initialized discount column:

res.combine_first(df.discount)

or fill the blanks:

res.fillna(0)
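The two options treat the NaN rows differently. A short sketch with made-up values (the discount series here is a hypothetical stand-in for the existing column):

```python
import numpy as np
import pandas as pd

res = pd.Series([np.nan, 14.5, np.nan])
discount = pd.Series([1.0, 2.0, np.nan])  # hypothetical existing column

# combine_first keeps res where it has a value and falls back to
# discount only where res is NaN.
combined = res.combine_first(discount)

# fillna(0) ignores the existing column and writes 0 into every gap.
filled = res.fillna(0)

print(combined.tolist())
print(filled.tolist())
```

Use combine_first when the existing discount values should survive for rows that could not be computed, and fillna(0) when those rows should simply be zero.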
