Pandas: Get String Value With Most Occurrence In Group

I have the following DataFrame:

item response
1    A
1    A
1    B
2    A
2    A

I want to add a column with the most given respon

Solution 1:

There is pd.Series.mode:

df.groupby('item').response.transform(pd.Series.mode)
Out[28]: 
0    A
1    A
2    A
3    C
4    C
Name: response, dtype: object
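
When two responses tie within a group, `Series.mode` returns every tied value, so the group no longer maps to a single label. The following is a minimal sketch, assuming the first mode in sorted order is an acceptable tie-break. The helper name `first_mode`, the column name `top_response`, and the extra tied group (item 3) are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data with an added tie in item 3 (hypothetical, for illustration).
df = pd.DataFrame({
    "item": [1, 1, 1, 2, 2, 3, 3],
    "response": ["A", "A", "B", "C", "C", "X", "Y"],
})

def first_mode(s):
    """Return one mode per group: the first of the sorted modes, or NaN if none."""
    m = s.mode()  # modes come back sorted; NaN values are dropped by default
    return m.iat[0] if not m.empty else np.nan

# The scalar returned for each group is broadcast to every row of that group.
df["top_response"] = df.groupby("item")["response"].transform(first_mode)
print(df)
```

Returning a scalar from the helper keeps the result aligned with the original rows regardless of how many modes a group has.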

Solution 2:

Use value_counts, which sorts by frequency in descending order, and return the first index value:

df["responseCount"] = (df.groupby("item")["response"]
                        .transform(lambda x: x.value_counts().index[0]))

print (df)
   item response responseCount
0     1        A             A
1     1        A             A
2     1        B             A
3     2        C             C
4     2        C             C

Or collections.Counter.most_common:

from collections import Counter

df["responseCount"] = (df.groupby("item")["response"]
                         .transform(lambda x: Counter(x).most_common(1)[0][0]))

print (df)
   item response responseCount
0     1        A             A
1     1        A             A
2     1        B             A
3     2        C             C
4     2        C             C

EDIT:

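The problem occurs when one or more groups contain only NaNs: value_counts drops NaN, so the result is empty and index[0] fails. The solution is to handle the empty case with if-else: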

print (df)
   item response
0     1        A
1     1        A
2     2      NaN
3     2      NaN
4     3      NaN

import numpy as np

def f(x):
    s = x.value_counts()
    print (s)
    # Output of print (s) for the three groups:
    # A    2
    # Name: 1, dtype: int64
    # Series([], Name: 2, dtype: int64)
    # Series([], Name: 3, dtype: int64)

    # Equivalent: return np.nan if s.empty else s.index[0]
    return np.nan if len(s) == 0 else s.index[0]

df["responseCount"] = df.groupby("item")["response"].transform(f)
print (df)
   item response responseCount
0     1        A             A
1     1        A             A
2     2      NaN           NaN
3     2      NaN           NaN
4     3      NaN           NaN
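
The same NaN-safe lookup can also be computed once per group and then mapped back onto the rows, instead of going through transform. This is a sketch of that alternative, not the original answer's method. The helper name `most_common` is hypothetical, and the DataFrame is rebuilt here from the sample shown above.

```python
import numpy as np
import pandas as pd

# Rebuild the sample from the EDIT: items 2 and 3 contain only NaN.
df = pd.DataFrame({
    "item": [1, 1, 2, 2, 3],
    "response": ["A", "A", np.nan, np.nan, np.nan],
})

def most_common(s):
    """Most frequent non-NaN value in s, or NaN when the group has none."""
    counts = s.value_counts()  # NaN is dropped by default
    return counts.index[0] if len(counts) else np.nan

# One value per item, then broadcast back to the rows by item key.
top = df.groupby("item")["response"].agg(most_common)
df["responseCount"] = df["item"].map(top)
print(df)
```

Keeping the per-group result in `top` is convenient when the most common response per item is also needed on its own.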

Solution 3:

You can use statistics.mode from the standard library:

from statistics import mode

df['mode'] = df.groupby('item')['response'].transform(mode)

print(df)

   item response mode
0     1        A    A
1     1        A    A
2     1        B    A
3     2        C    C
4     2        C    C
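
Note that statistics.mode raises StatisticsError on empty input, and on Python versions before 3.8 it also raises when several values tie. Below is a minimal sketch of a wrapper that falls back to a default in those cases. The name `safe_mode` and its `default` parameter are assumptions for illustration, not part of the statistics module.

```python
from statistics import StatisticsError, mode

def safe_mode(values, default=None):
    """Return statistics.mode(values), or `default` if mode() raises."""
    try:
        return mode(values)
    except StatisticsError:
        # Raised for empty data (and for ties on Python < 3.8).
        return default

print(safe_mode(["A", "A", "B"]))
print(safe_mode([]))
```

A wrapper like this can be passed to transform in place of mode, for example `df.groupby('item')['response'].transform(safe_mode)`.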
