
Groupby With Sub-ranges In Pandas

I am researching a soccer dataset:

   LEAGUE  HOME  DRAW  AWAY  WINNER  PREDICTED  PROFIT
0       2  3.25  3.25  2.10       0          2   -10.0
1      14  1.50  3.

Solution 1:

Maybe you need pd.cut, which assigns each HOME value to a sub-range (bin):

import numpy as np
import pandas as pd

# 20 bin edges from 0 to 4.75 in steps of 0.25
bins = np.linspace(0, 5, 20, endpoint=False)
print(bins)
[ 0.    0.25  0.5   0.75  1.    1.25  1.5   1.75  2.    2.25  2.5   2.75
  3.    3.25  3.5   3.75  4.    4.25  4.5   4.75]
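
To see what pd.cut does with these bin edges, here is a minimal sketch. The HOME values in it are made up for illustration and are not the asker's data. It suggests that each interval is closed on the right, and that a value beyond the last edge (4.75) gets no bin at all.

```python
import numpy as np
import pandas as pd

# Same bin edges as above: 0, 0.25, ..., 4.75
bins = np.linspace(0, 5, 20, endpoint=False)

# Hypothetical HOME odds, chosen only for illustration
home = pd.Series([3.25, 2.25, 1.50, 4.90])

# Each value is mapped to the half-open interval (left, right] that contains it;
# values outside every interval become NaN
binned = pd.cut(home, bins)
print(binned)
```

Because the intervals are right-closed, 3.25 falls into the bin ending at 3.25 rather than the one starting there, while 4.90 lies past the last edge and is left unbinned.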

# group by league and by HOME bin, then sum every column
print(df.groupby([df.LEAGUE, pd.cut(df.HOME, bins)]).sum())
                    HOME  DRAW  AWAY  WINNER  PREDICTED  PROFIT
LEAGUE HOME                                                    
2      (0, 0.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (0.25, 0.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.5, 0.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.75, 1]     NaN   NaN   NaN     NaN        NaN     NaN
       (1, 1.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (1.25, 1.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.5, 1.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.75, 2]     NaN   NaN   NaN     NaN        NaN     NaN
       (2, 2.25]    2.25  3.30  3.20       2          0   -10.0
       (2.25, 2.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.5, 2.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.75, 3]     NaN   NaN   NaN     NaN        NaN     NaN
       (3, 3.25]    3.25  3.25  2.10       0          2   -10.0
       (3.25, 3.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.5, 3.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.75, 4]     NaN   NaN   NaN     NaN        NaN     NaN
       (4, 4.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (4.25, 4.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (4.5, 4.75]   NaN   NaN   NaN     NaN        NaN     NaN
11     (0, 0.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (0.25, 0.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.5, 0.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.75, 1]     NaN   NaN   NaN     NaN        NaN     NaN
       (1, 1.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (1.25, 1.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.5, 1.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.75, 2]     NaN   NaN   NaN     NaN        NaN     NaN
       (2, 2.25]    2.25  3.00  2.88       0          0    12.5
       (2.25, 2.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.5, 2.75]   NaN   NaN   NaN     NaN        NaN     NaN
...                  ...   ...   ...     ...        ...     ...
14     (2, 2.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (2.25, 2.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.5, 2.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.75, 3]     NaN   NaN   NaN     NaN        NaN     NaN
       (3, 3.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (3.25, 3.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.5, 3.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.75, 4]     NaN   NaN   NaN     NaN        NaN     NaN
       (4, 4.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (4.25, 4.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (4.5, 4.75]   NaN   NaN   NaN     NaN        NaN     NaN
17     (0, 0.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (0.25, 0.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.5, 0.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (0.75, 1]     NaN   NaN   NaN     NaN        NaN     NaN
       (1, 1.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (1.25, 1.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.5, 1.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (1.75, 2]     NaN   NaN   NaN     NaN        NaN     NaN
       (2, 2.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (2.25, 2.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.5, 2.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (2.75, 3]     NaN   NaN   NaN     NaN        NaN     NaN
       (3, 3.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (3.25, 3.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.5, 3.75]   NaN   NaN   NaN     NaN        NaN     NaN
       (3.75, 4]     NaN   NaN   NaN     NaN        NaN     NaN
       (4, 4.25]     NaN   NaN   NaN     NaN        NaN     NaN
       (4.25, 4.5]   NaN   NaN   NaN     NaN        NaN     NaN
       (4.5, 4.75]   NaN   NaN   NaN     NaN        NaN     NaN

[76 rows x 6 columns]
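
Most rows in that result are empty bins. One possible way to keep only the league and bin combinations that actually occur is the observed=True argument of groupby. The sketch below assumes a reasonably recent pandas and uses a small sample frame (a subset of the columns, with values mirroring the non-empty rows visible in this answer), so treat it as illustrative rather than as the asker's full data.

```python
import numpy as np
import pandas as pd

# Small sample frame for illustration only (subset of the original columns)
df = pd.DataFrame({
    "LEAGUE": [2, 2, 11, 14],
    "HOME": [3.25, 2.25, 2.25, 1.50],
    "DRAW": [3.25, 3.30, 3.00, 3.50],
    "AWAY": [2.10, 3.20, 2.88, 6.00],
    "PROFIT": [-10.0, -10.0, 12.5, 5.0],
})

bins = np.linspace(0, 5, 20, endpoint=False)
home_bin = pd.cut(df["HOME"], bins)

# observed=True drops categories (bins) that contain no rows
summed = (
    df.groupby(["LEAGUE", home_bin], observed=True)[["DRAW", "AWAY", "PROFIT"]]
      .sum()
)
print(summed)
```

With this sample the result has one row per occupied (LEAGUE, HOME bin) pair instead of one row per possible bin.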

EDIT:

You can use agg to apply a different aggregation to each column:

# string names select the built-in pandas aggregations for each column
print(df.groupby([df.LEAGUE, pd.cut(df.HOME, bins)]).agg({'HOME': 'min',
                                                          'DRAW': 'min',
                                                          'AWAY': 'min',
                                                          'WINNER': 'count',
                                                          'PREDICTED': 'count',
                                                          'PROFIT': 'sum'}))

                    DRAW  PROFIT  AWAY  WINNER  PREDICTED  HOME
LEAGUE HOME                                                    
2      (2, 2.25]    3.30   -10.0  3.20       1          1  2.25
       (3, 3.25]    3.25   -10.0  2.10       1          1  3.25
11     (2, 2.25]    3.00    12.5  2.88       1          1  2.25
14     (1.25, 1.5]  3.50     5.0  6.00       1          1  1.50
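
As a variation on the dict form, newer pandas versions also support named aggregation, which lets you choose the output column names and their order. This is only a sketch: the output names (min_home, min_draw, games, total_profit) are invented here, and the sample frame is illustrative, with the last WINNER value being a placeholder.

```python
import numpy as np
import pandas as pd

# Illustrative sample frame; the WINNER value for league 14 is a placeholder
df = pd.DataFrame({
    "LEAGUE": [2, 2, 11, 14],
    "HOME": [3.25, 2.25, 2.25, 1.50],
    "DRAW": [3.25, 3.30, 3.00, 3.50],
    "AWAY": [2.10, 3.20, 2.88, 6.00],
    "WINNER": [0, 2, 0, 1],
    "PROFIT": [-10.0, -10.0, 12.5, 5.0],
})

bins = np.linspace(0, 5, 20, endpoint=False)

# Each keyword is an output column: (input column, aggregation name)
stats = (
    df.groupby(["LEAGUE", pd.cut(df["HOME"], bins)], observed=True)
      .agg(
          min_home=("HOME", "min"),
          min_draw=("DRAW", "min"),
          games=("WINNER", "count"),
          total_profit=("PROFIT", "sum"),
      )
)
print(stats)
```

Here games counts the non-null WINNER entries in each group, so it acts as a row count per league and HOME bin.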
