
Creating A New Column Depending On The Equality Of Two Other Columns

I want to compare the values of two columns and create a new column, bin_crnn, that is 1 if they are equal and 0 if not.

# coding: utf-8
import pandas as pd
df = pd.read_csv('fi

Solution 1:

You need to cast the boolean mask to int with astype:

df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)

Sample:

df = pd.DataFrame({'crnn_pred':[1,2,5], 'manual_raw_value':[1,8,5]})
print (df)
   crnn_pred  manual_raw_value
0          1                 1
1          2                 8
2          5                 5

print (df['crnn_pred']==df['manual_raw_value'])
0     True
1    False
2     True
dtype: bool

df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)
print (df)
   crnn_pred  manual_raw_value  bin_crnn
0          1                 1         1
1          2                 8         0
2          5                 5         1

You get an error because comparing two columns does not return a scalar, but a Series (array) of True and False values.

So you need all or any to reduce it to a scalar True or False.

I think this answer explains it better.
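To make the error concrete, here is a minimal sketch on the sample data above. It assumes the original code used the comparison inside a plain if statement; the question text is truncated, so that is a guess. The names mask, raised, all_equal and any_equal are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})
mask = df['crnn_pred'] == df['manual_raw_value']

# Using a multi-element Series where a single bool is expected
# raises ValueError, because its truth value is ambiguous.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# all() and any() reduce the mask to one scalar value
all_equal = bool(mask.all())  # True only if every row matches
any_equal = bool(mask.any())  # True if at least one row matches
```

Neither all nor any gives a per-row result, which is why the astype cast is the right tool for building the new column.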

Solution 2:

One fast approach is to use np.where.

import numpy as np
df['test'] = np.where(df['crnn_pred']==df['manual_raw_value'], 1, 0)
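As a quick check, the sketch below runs np.where on the sample data from Solution 1 and compares it with the astype approach. The column name test follows this answer; expected, same and result are illustrative names only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})

# np.where(condition, value_if_true, value_if_false) works element-wise
df['test'] = np.where(df['crnn_pred'] == df['manual_raw_value'], 1, 0)

# Casting the boolean mask to int should give the same column
expected = (df['crnn_pred'] == df['manual_raw_value']).astype(int)
same = bool((df['test'] == expected).all())
result = df['test'].tolist()
```

np.where is the more flexible of the two when the labels are something other than 1 and 0, for example strings.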

Solution 3:

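There is no need for a loop or an if statement; just set the new column using a boolean mask.

# df.loc with a row mask and a column label creates the column if it is missing
df.loc[df['crnn_pred']==df['manual_raw_value'], 'bin_crnn'] = 1
# rows that did not match are NaN at this point, so fill them with 0
df['bin_crnn'] = df['bin_crnn'].fillna(0).astype(int)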
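To see why the fillna step is needed, here is a minimal sketch on the sample data from Solution 1. It assigns through df.loc with a row mask and a column label, which is an assumption about how best to write the assignment rather than the answerer's exact code. The names mask, missing_before_fill and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})
mask = df['crnn_pred'] == df['manual_raw_value']

# Assigning through .loc creates bin_crnn; rows outside the mask get NaN
df.loc[mask, 'bin_crnn'] = 1
missing_before_fill = df['bin_crnn'].isna().tolist()

# Replace the NaN rows with 0 and convert the column to integers
df['bin_crnn'] = df['bin_crnn'].fillna(0).astype(int)
result = df['bin_crnn'].tolist()
```

Only the unmatched row is missing after the first assignment, so filling with 0 completes the 1/0 column.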

Solution 4:

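Another way that uses only pandas, without NumPy, is apply with axis=1. It calls the lambda once per row, so it is slower than the vectorized solutions on large frames:

df['columns_are_equal'] = df.apply(lambda x: int(x['column_a'] == x['column_b']), axis=1)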

Solution 5:

You are comparing two columns row by row, so try this:

bin_crnn = []
for index, row in df.iterrows():
    if row['crnn_pred'] == row['manual_raw_value']:
        bin_crnn.append(1)
    else:
        bin_crnn.append(0)
df['bin_crnn'] = bin_crnn
