Find Most Frequent Observation In Group
DataFrame:
B = pd.DataFrame({'b':['II','II','II','II','II','I','I','I'], 'MOST_FREQUENT':['1', '2', '2', '1', '1','1','2','2']})
I need to get the most frequent observation in each group.
Solution 1:
You can use apply with mode:
print (B.groupby('b')['MOST_FREQUENT'].apply(lambda x: x.mode())
.reset_index(level=1, drop=True).reset_index())
b MOST_FREQUENT
0 I 2
1 II 1
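One caveat with mode is that it returns every tied value, so a group with a tie produces more than one row. The sketch below uses a hypothetical frame T (not from the question) with a tie in group I, and keeps only the first mode per group. Picking the first one is an assumption about what a tie should resolve to; mode returns its values sorted, so this keeps the smallest.

```python
import pandas as pd

# Hypothetical data: in group 'I' the values '1' and '2' each appear twice.
T = pd.DataFrame({'b': ['I', 'I', 'I', 'I', 'II'],
                  'MOST_FREQUENT': ['1', '2', '2', '1', '1']})

# mode() returns all tied values, so group 'I' contributes two rows here.
all_modes = T.groupby('b')['MOST_FREQUENT'].apply(lambda x: x.mode())
print(all_modes)

# Taking .iloc[0] keeps exactly one value per group (the first of the sorted modes).
first_mode = (T.groupby('b')['MOST_FREQUENT']
                .agg(lambda x: x.mode().iloc[0])
                .reset_index())
print(first_mode)
```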
Another solution is to call value_counts on each group and return the first index value, because value_counts sorts by count in descending order:
print (B.groupby('b')['MOST_FREQUENT'].apply(lambda x: x.value_counts().index[0])
.reset_index())
b MOST_FREQUENT
0 I 2
1 II 1
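The same idea can probably be written without a Python-level lambda by calling value_counts directly on the grouped column and keeping the first row of each group. This is a sketch rather than a tested replacement; it relies on SeriesGroupBy.value_counts sorting counts in descending order within each group, which is its default behaviour.

```python
import pandas as pd

B = pd.DataFrame({'b': ['II', 'II', 'II', 'II', 'II', 'I', 'I', 'I'],
                  'MOST_FREQUENT': ['1', '2', '2', '1', '1', '1', '2', '2']})

# Counts per (b, MOST_FREQUENT) pair, sorted descending within each group of b.
counts = B.groupby('b')['MOST_FREQUENT'].value_counts()

# The first row of each group is its most frequent value.
top = counts.groupby(level=0).head(1)

# The index holds (b, MOST_FREQUENT) tuples; rebuild a flat frame from them.
result = pd.DataFrame(top.index.tolist(), columns=['b', 'MOST_FREQUENT'])
print(result)
```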
EDIT: You can also use Counter.most_common:
from collections import Counter
print (B.groupby(['b']).agg(lambda x: Counter(x).most_common(1)[0][0]).reset_index())
b MOST_FREQUENT
0 I 2
1 II 1
Solution 2:
Trying to squeeze a little more performance out of pandas, we can use groupby with size to get the counts, then use idxmax to find the index values of the largest sub-groups. Because the counts are indexed by (MOST_FREQUENT, b), these index values contain the values we're looking for.
s = B.groupby(['MOST_FREQUENT', 'b']).size()
pd.DataFrame(
s.groupby(level='b').idxmax().values.tolist(),
columns=s.index.names
)
MOST_FREQUENT b
0 2 I
1 1 II
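To make the intermediate steps visible, here is a self-contained sketch of the same approach with each stage kept in its own variable. The names winners and result are just illustrative, and the final column reordering is an addition to match the output layout of Solution 1.

```python
import pandas as pd

B = pd.DataFrame({'b': ['II', 'II', 'II', 'II', 'II', 'I', 'I', 'I'],
                  'MOST_FREQUENT': ['1', '2', '2', '1', '1', '1', '2', '2']})

# Size of every (MOST_FREQUENT, b) sub-group, indexed by a two-level MultiIndex.
s = B.groupby(['MOST_FREQUENT', 'b']).size()

# For each b, idxmax returns the full index label of the largest sub-group,
# i.e. a (MOST_FREQUENT, b) tuple.
winners = s.groupby(level='b').idxmax()

# Turn the tuples into columns, then put b first.
result = pd.DataFrame(winners.tolist(), columns=s.index.names)[['b', 'MOST_FREQUENT']]
print(result)
```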
naive timing
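A rough way to compare the approaches above is timeit from the standard library. The helper names (via_mode and so on) and the repeat count of 20 are arbitrary choices for this sketch. On a frame this small the numbers mostly reflect fixed overhead, so for a meaningful comparison it would be better to time on data of realistic size.

```python
import timeit
from collections import Counter

import pandas as pd

B = pd.DataFrame({'b': ['II', 'II', 'II', 'II', 'II', 'I', 'I', 'I'],
                  'MOST_FREQUENT': ['1', '2', '2', '1', '1', '1', '2', '2']})

def via_mode(df):
    return df.groupby('b')['MOST_FREQUENT'].agg(lambda x: x.mode().iloc[0])

def via_value_counts(df):
    return df.groupby('b')['MOST_FREQUENT'].agg(lambda x: x.value_counts().index[0])

def via_counter(df):
    return df.groupby('b')['MOST_FREQUENT'].agg(lambda x: Counter(x).most_common(1)[0][0])

def via_size_idxmax(df):
    s = df.groupby(['MOST_FREQUENT', 'b']).size()
    return s.groupby(level='b').idxmax()

candidates = {
    'mode': via_mode,
    'value_counts': via_value_counts,
    'Counter': via_counter,
    'size + idxmax': via_size_idxmax,
}

# Total seconds for 20 calls of each approach on the same frame.
timings = {name: timeit.timeit(lambda f=f: f(B), number=20)
           for name, f in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.4f} s for 20 runs")
```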