
How To Compute A New Column Based On The Values Of Other Columns In Pandas - Python

Let's say my data frame contains these data:

>>> df = pd.DataFrame({'a':['l1','l2','l1','l2','l1','l2'], 'b':['1','2','2','1','2','2']})

Solution 1:

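The timings below were measured on a random test frame of one million rows:

import numpy
import pandas as pd

df = pd.DataFrame({'a': numpy.random.choice(['l1', 'l2'], 1000000),
                   'b': numpy.random.choice(['1', '2'], 1000000)})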

A fast solution, assuming each column holds only two distinct values:

%timeit df['c'] = ((df.a == 'l1') == (df.b == '1')).astype(int)

10 loops, best of 3: 178 ms per loop
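Why this works: comparing the two boolean masks with == is true when both are true (l1 with 1) or both are false. With only two possible values per column, "both false" can only mean l2 with 2, so the result is exactly the match flag. Here is a minimal sketch on the six-row frame from the question; the l3 row is a hypothetical label, added only to show where the two-value assumption breaks down:

```python
import pandas as pd

df = pd.DataFrame({'a': ['l1', 'l2', 'l1', 'l2', 'l1', 'l2'],
                   'b': ['1', '2', '2', '1', '2', '2']})

# True where the two masks agree: both True (l1/1) or both False (l2/2).
df['c'] = ((df.a == 'l1') == (df.b == '1')).astype(int)
c_values = df['c'].tolist()

# Hypothetical third label: 'l3' with '2' makes both masks False,
# so the row is flagged as a match even though it is not one.
df3 = pd.DataFrame({'a': ['l3'], 'b': ['2']})
c_wrong = ((df3.a == 'l1') == (df3.b == '1')).astype(int).tolist()
```

If the columns may contain other values, one of the more explicit solutions below is probably the safer choice.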

@Viktor Kerkes:

%timeit df['c'] = (df.a.str[-1] == df.b).astype(int)

1 loops, best of 3: 412 ms per loop

@user1470788:

%timeit df['c'] = (((df['a'] == 'l1')&(df['b']=='1'))|((df['a'] == 'l2')&(df['b']=='2'))).astype(int)

1 loops, best of 3: 363 ms per loop

@herrfz:

%timeit df['c'] = (df.a.apply(lambda x: x[1:])==df.b).astype(int)

1 loops, best of 3: 387 ms per loop

Solution 2:

You can also use the string methods of the .str accessor.

df['c'] = (df.a.str[-1] == df.b).astype(int)
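To make the intermediate step visible, here is a small sketch on the frame from the question. The name last_char is only illustrative; indexing through .str takes the last character of every label in one vectorized call:

```python
import pandas as pd

df = pd.DataFrame({'a': ['l1', 'l2', 'l1', 'l2', 'l1', 'l2'],
                   'b': ['1', '2', '2', '1', '2', '2']})

# .str[-1] picks the last character of each string: 'l1' -> '1', 'l2' -> '2'.
last_char = df.a.str[-1]

# Element-wise comparison gives a boolean Series; astype(int) turns it into 0/1.
df['c'] = (last_char == df.b).astype(int)
```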

Solution 3:

df['c'] = (df.a.apply(lambda x: x[1:])==df.b).astype(int)
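This one strips the leading l from each label with a plain Python function called once per row, then compares what is left to b. A minimal sketch on the frame from the question follows; the l10 row is a hypothetical label, included only to show that slicing from position 1 keeps a multi-digit suffix intact:

```python
import pandas as pd

df = pd.DataFrame({'a': ['l1', 'l2', 'l1', 'l2', 'l1', 'l2'],
                   'b': ['1', '2', '2', '1', '2', '2']})

# x[1:] drops the first character of each label: 'l1' -> '1'.
suffix = df.a.apply(lambda x: x[1:])
df['c'] = (suffix == df.b).astype(int)

# Hypothetical wider label: 'l10'[1:] is '10', which matches b == '10'.
wide = pd.DataFrame({'a': ['l10'], 'b': ['10']})
wide_c = (wide.a.apply(lambda x: x[1:]) == wide.b).astype(int).tolist()
```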

Solution 4:

You can just use logical operators. I'm not sure why you're using strings of 1 and 2 rather than ints, but here's a solution. The astype(int) at the end converts the result from booleans to 0s and 1s.

df['c'] = (((df['a'] == 'l1')&(df['b']=='1'))|((df['a'] == 'l2')&(df['b']=='2'))).astype(int)
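The one-liner is easier to follow when split into named masks. A sketch on the frame from the question (match_l1 and match_l2 are illustrative names, not part of the original answer). Note that & and | bind more tightly than == in Python, which is why every comparison needs its own parentheses:

```python
import pandas as pd

df = pd.DataFrame({'a': ['l1', 'l2', 'l1', 'l2', 'l1', 'l2'],
                   'b': ['1', '2', '2', '1', '2', '2']})

# One boolean mask per matching pair; parentheses are required around
# each comparison because & has higher precedence than ==.
match_l1 = (df['a'] == 'l1') & (df['b'] == '1')
match_l2 = (df['a'] == 'l2') & (df['b'] == '2')

# A row matches if either pair matches; astype(int) maps True/False to 1/0.
df['c'] = (match_l1 | match_l2).astype(int)
```

Unlike the two-value shortcut, this form lists each matching pair explicitly, so a row with any other combination simply gets 0.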
