Get Lowest Value After Groupby - Pandas
Solution 1:
You need DataFrameGroupBy.idxmin to get the index of the minimal Distance per group, and then select those rows with loc:
df1 = df.loc[df.groupby('City', sort=False)['Distance'].idxmin()]
print (df1)
City Distance
0 London 5
1 Paris 1
3 NY 2
Detail:
print (df.groupby('City', sort=False)['Distance'].idxmin())
City
London 0
Paris 1
NY 3
Name: Distance, dtype: int64
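For anyone who wants to run this end to end, here is a minimal sketch. The sample DataFrame is an assumption (the original question's data is not shown here); its values were chosen only so that the per-city minima match the output above.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    'City': ['London', 'Paris', 'Paris', 'NY', 'NY', 'London'],
    'Distance': [5, 1, 7, 2, 6, 9],
})

# Index label of the smallest Distance within each City
# (sort=False keeps the cities in order of first appearance).
idx = df.groupby('City', sort=False)['Distance'].idxmin()

# Select the full rows for those index labels.
df1 = df.loc[idx]
print(df1)
```

Because loc selects whole rows, any other columns in the frame come along with the minimal Distance.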
Solution 2:
Sometimes groupby is unnecessary; try sorting by Distance and then calling drop_duplicates:
df.sort_values('Distance').drop_duplicates('City')
Out[377]:
City Distance
1 Paris 1
3 NY 2
0 London 5
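A runnable sketch of the same idea, using a made-up DataFrame (an assumption, not the question's data). drop_duplicates keeps the first row per City by default, which is the smallest Distance after an ascending sort. The trailing sort_index() is an optional addition of mine for when the original row order matters.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    'City': ['London', 'Paris', 'Paris', 'NY', 'NY', 'London'],
    'Distance': [5, 1, 7, 2, 6, 9],
})

# Sort ascending, then keep only the first (smallest) row for each City.
lowest = df.sort_values('Distance').drop_duplicates('City')

# Optional: put the surviving rows back in their original order.
lowest_in_order = lowest.sort_index()
print(lowest_in_order)
```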
Solution 3:
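You can use min on the grouped column. Note that this returns only the minimal Distance per City as a Series, not the full rows:
>>> df.groupby(['City'], sort=False)['Distance'].min()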
City
London 5
Paris 1
NY 2
Name: Distance, dtype: int64
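If a two-column DataFrame is preferred over a Series, one possible follow-up is reset_index(). This is a sketch on a made-up DataFrame (an assumption, not the question's data).

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    'City': ['London', 'Paris', 'Paris', 'NY', 'NY', 'London'],
    'Distance': [5, 1, 7, 2, 6, 9],
})

# Series of minima, indexed by City.
mins = df.groupby(['City'], sort=False)['Distance'].min()

# Turn the City index back into an ordinary column.
mins_table = mins.reset_index()
print(mins_table)
```

Unlike the idxmin approach, this drops the original row labels and any other columns.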
Solution 4:
My opinion is that @jezrael offers the most idiomatic approach within a groupby. I've offered the same solution myself on other answers. However, here are some other alternatives.
Option 1
Use pd.DataFrame.nsmallest within an apply
This offers clean logic even if the API is a bit clumsy. I think this version of nsmallest should be available on the groupby object, but as of pandas 0.20.3 it is not, so we use it within the general-purpose apply method. Make sure to pass group_keys=False in the call to groupby in order to avoid an awkward additional index level.
df.groupby('City', group_keys=False).apply(
    lambda d: d.nsmallest(1, columns='Distance'))
City Distance
0 London 5
3 NY 2
1 Paris 1
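For comparison, here is a sketch of a different technique that gets the n smallest rows per group without apply: sort first, then take head(n) from each group. This is not the nsmallest call shown above, only an alternative with a similar result, and the sample DataFrame is an assumption rather than the question's data.

```python
import pandas as pd

# Hypothetical sample data, not taken from the original question.
df = pd.DataFrame({
    'City': ['London', 'Paris', 'Paris', 'NY', 'NY', 'London'],
    'Distance': [5, 1, 7, 2, 6, 9],
})

n = 1  # number of smallest rows to keep per City

# After an ascending sort, the first n rows of each group are its n smallest.
smallest = df.sort_values('Distance').groupby('City').head(n)
print(smallest)
```

The result keeps the original index labels, so no extra index level needs to be suppressed.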
Option 2 was taken by @Wen, so I deleted it.