Creating A New Column Depending On The Equality Of Two Other Columns
I want to compare the values of two columns and create a new column, bin_crnn, that is 1 if they are equal and 0 if not. # coding: utf-8 import pandas as pd df = pd.read_csv('fi
Solution 1:
You need to cast the boolean mask to int with astype:
df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)
Sample:
df = pd.DataFrame({'crnn_pred':[1,2,5], 'manual_raw_value':[1,8,5]})
print(df)
   crnn_pred  manual_raw_value
0          1                 1
1          2                 8
2          5                 5
print(df['crnn_pred']==df['manual_raw_value'])
0     True
1    False
2     True
dtype: bool
df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)
print(df)
   crnn_pred  manual_raw_value  bin_crnn
0          1                 1         1
1          2                 8         0
2          5                 5         1
You get an error because comparing two columns does not return a scalar but a Series (array) of True and False values.
So you need all or any to reduce it to a scalar True or False.
I think this answer explains it better.
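To illustrate the point about scalars, here is a minimal sketch on the sample data from above. The names mask, error_message, all_equal and any_equal are only illustrative, and the exact wording of the error message may vary between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})

# Comparing two columns yields a boolean Series, not a single True/False
mask = df['crnn_pred'] == df['manual_raw_value']

# Using the Series directly in an if statement raises a ValueError
error_message = None
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)

# all() and any() reduce the Series to a single boolean
all_equal = bool(mask.all())   # True only if every row matches
any_equal = bool(mask.any())   # True if at least one row matches
```

On this data not every row matches but some do, so all_equal and any_equal differ. Neither gives a per-row result, which is why the astype approach is used for the new column.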
Solution 2:
One fast approach is to use np.where.
import numpy as np
df['test'] = np.where(df['crnn_pred']==df['manual_raw_value'], 1, 0)
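As a quick check, the sketch below runs np.where on the sample data from Solution 1 and compares it with the astype(int) result. The column names bin_where and bin_astype are made up for this illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})
mask = df['crnn_pred'] == df['manual_raw_value']

# np.where picks 1 where the mask is True and 0 elsewhere
df['bin_where'] = np.where(mask, 1, 0)

# Casting the mask to int should give the same values
df['bin_astype'] = mask.astype(int)

same = bool((df['bin_where'] == df['bin_astype']).all())
```

np.where becomes more useful when the two output values are something other than 1 and 0, for example strings.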
Solution 3:
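There is no need for a loop or an if statement; just set the new column using a boolean mask.
# Use a single .loc call so the column is created and no chained assignment occurs
df.loc[df['crnn_pred']==df['manual_raw_value'], 'bin_crnn'] = 1
df['bin_crnn'] = df['bin_crnn'].fillna(0).astype(int)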
Solution 4:
Another quick way, using only pandas and not NumPy, is
df['columns_are_equal'] = df.apply(lambda x: int(x['column_a'] == x['column_b']), axis=1)
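Applied to the column names from the question, this could look like the sketch below. It assumes the same sample data as in Solution 1.

```python
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})

# axis=1 passes each row to the lambda; int() turns True/False into 1/0
df['bin_crnn'] = df.apply(
    lambda row: int(row['crnn_pred'] == row['manual_raw_value']),
    axis=1,
)
```

Note that apply with axis=1 calls the function once per row, so on large frames it is likely to be slower than the vectorized comparison in Solution 1.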
Solution 5:
You are comparing two columns; try this:
bin_crnn = []
for index, row in df.iterrows():
    if row['crnn_pred'] == row['manual_raw_value']:
        bin_crnn.append(1)
    else:
        bin_crnn.append(0)
df['bin_crnn'] = bin_crnn