Get Lowest Value After Groupby - Pandas
Solution 1:
Use DataFrameGroupBy.idxmin to get the index label of the minimal Distance in each group, then select those rows with loc:
df1 = df.loc[df.groupby('City', sort=False)['Distance'].idxmin()]
print (df1)
City Distance
0 London 5
1 Paris 1
3 NY 2
Detail:
print (df.groupby('City', sort=False)['Distance'].idxmin())
City
London 0
Paris 1
NY 3
Name: Distance, dtype: int64
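The two steps above can be sketched end to end as follows. The question's original frame is not shown in this thread, so the sample data here is an assumption chosen to be consistent with the printed output; only `groupby`, `idxmin` and `loc` come from the answer itself.

```python
import pandas as pd

# Hypothetical sample data; the original question's frame is not shown here.
df = pd.DataFrame({
    "City": ["London", "Paris", "Paris", "NY"],
    "Distance": [5, 1, 7, 2],
})

# Step 1: per group, get the index label of the row with the smallest Distance.
# sort=False keeps the groups in order of first appearance.
min_idx = df.groupby("City", sort=False)["Distance"].idxmin()

# Step 2: use those labels to pull the full rows out of the original frame.
lowest = df.loc[min_idx]
print(lowest)
```

If a city has several rows tied for the minimum, idxmin returns the first of them, so exactly one row per city comes back.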
Solution 2:
Sometimes groupby is unnecessary; try drop_duplicates after sorting:
df.sort_values('Distance').drop_duplicates('City')
Out[377]:
City Distance
0 London 5
1 Paris 1
3 NY 2
Solution 3:
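You can use min on the grouped column. Note that this returns only the minimal Distance per city as a Series, not the full rows:
>>> df.groupby(['City'], sort=False)['Distance'].min()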
City
London 5
Paris 1
NY 2
Name: Distance, dtype: int64
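A minimal sketch of what this approach returns, again on assumed sample data (the original frame is not shown in the thread). It illustrates that the result is a Series indexed by City, and that `reset_index` turns it back into a two-column DataFrame if that is what is needed.

```python
import pandas as pd

# Hypothetical sample data; the original question's frame is not shown here.
df = pd.DataFrame({
    "City": ["London", "Paris", "Paris", "NY"],
    "Distance": [5, 1, 7, 2],
})

# min() on the grouped column yields a Series: index = City, values = minimum.
per_city_min = df.groupby("City", sort=False)["Distance"].min()

# reset_index moves City out of the index into an ordinary column.
as_frame = per_city_min.reset_index()
print(as_frame)
```

Any other columns of the original frame are not carried along, which is the main difference from the idxmin approach.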
Solution 4:
My opinion is that @jezrael offers the most idiomatic approach within a groupby. I've offered the same solution myself on other answers. However, here are some other alternatives.
Option 1
Use pd.DataFrame.nsmallest within an apply
This offers clean logic even if the API is a bit clumsy. I think this version of nsmallest should be available on the groupby object, but as of pandas 0.20.3 it is not, so we use it within the general-purpose apply method. Make sure to pass group_keys=False in the call to groupby in order to avoid an awkward additional index level.
df.groupby('City', group_keys=False).apply(
    lambda d: d.nsmallest(1, columns='Distance'))
City Distance
0 London 5
3 NY 2
1 Paris 1
Option 2
Was taken by @Wen so I deleted.
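As a cross-check, the sort-then-deduplicate approach from Solution 2 and the idxmin approach from Solution 1 should select the same rows. The sketch below compares them on assumed sample data; the `kind="stable"` argument and the final `sort_index` are additions made here so that ties and row order behave predictably, not part of the original answers.

```python
import pandas as pd

# Hypothetical sample data; the original question's frame is not shown here.
df = pd.DataFrame({
    "City": ["London", "Paris", "Paris", "NY"],
    "Distance": [5, 1, 7, 2],
})

# Solution 1: index labels of the per-city minimum, then row selection.
via_idxmin = df.loc[df.groupby("City", sort=False)["Distance"].idxmin()]

# Solution 2: sort ascending, keep the first row seen for each city,
# then restore the original row order for comparison.
via_dedup = (
    df.sort_values("Distance", kind="stable")
      .drop_duplicates("City")
      .sort_index()
)

# Both frames contain the same rows.
print(via_idxmin.sort_index().equals(via_dedup))
```

Without the `sort_index` call, the deduplicated frame comes back ordered by Distance rather than by original position.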