
Groupby Certain Number Of Rows Pandas

I have a dataframe with, let's say, 2 columns: dates and doubles

2017-05-01 2.5
2017-05-02 3.5
... ...
2017-05-17 0.2
2017-05-18 2.5

Now I would like to do a g

Solution 1:

I guess you are looking for resample. Consider this dataframe:

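import numpy as np
import pandas as pd

rng = pd.date_range('2017-05-01', periods=18, freq='D')
num = np.random.randint(5, size=18)  # random integers in [0, 5)
df = pd.DataFrame({'date': rng, 'val': num})

# Sum 'val' over consecutive 6-day windows of the 'date' column
df.resample('6D', on='date').sum().reset_index()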

This returns something like the following (the exact sums depend on the random values):

        date  val
0 2017-05-01   14
1 2017-05-07   11
2 2017-05-13   16
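For a reproducible check, here is a minimal sketch with fixed values in place of the random ones (the values 1..18 and the dropped row are assumptions made for illustration). It also shows a caveat: resample bins by calendar time, so it only corresponds to "every 6 rows" when there is exactly one row per day.

```python
import numpy as np
import pandas as pd

# Illustrative data: one row per day, values 1..18 (not from the original post)
rng = pd.date_range('2017-05-01', periods=18, freq='D')
df = pd.DataFrame({'date': rng, 'val': np.arange(1, 19)})

# Sum over consecutive 6-day windows starting at the first date
by_time = df.resample('6D', on='date').sum().reset_index()
print(by_time['val'].tolist())        # [21, 57, 93]

# Drop the row for 2017-05-03 (val 3): the windows are still 6 calendar days,
# so the first window now covers only 5 rows
gappy = df.drop(index=2).reset_index(drop=True)
by_time_gappy = gappy.resample('6D', on='date').sum().reset_index()
print(by_time_gappy['val'].tolist())  # [18, 57, 93]
```

If the goal is a fixed number of rows per group regardless of the dates, grouping on row positions is the safer choice.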

Solution 2:

This is an alternative solution that groups on the row positions: integer-dividing np.arange(len(df)) by 6 gives every 6 consecutive rows the same group label.
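As a quick illustration of how the group labels come out, here is a small sketch. The row count of 14 is an arbitrary assumption, chosen to show that the last group is simply shorter when the length is not a multiple of 6.

```python
import numpy as np

n_rows = 14   # hypothetical number of rows
block = 6     # rows per group

# Integer division maps positions 0..5 -> 0, 6..11 -> 1, 12..13 -> 2
labels = np.arange(n_rows) // block
print(labels)  # [0 0 0 0 0 0 1 1 1 1 1 1 2 2]
```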

With two columns, using agg:

# Assumes columns named 'date' and 'value':
# take the first date and the sum of 'value' in each block of 6 rows
df.groupby(np.arange(len(df)) // 6).agg({'date': 'first', 'value': 'sum'})

With more columns, you can use first (or last) for the date and sum for the other columns:

group = df.groupby(np.arange(len(df))//6)
pd.concat((group['date'].first(), 
           group[[c for c in df.columns if c != 'date']].sum()), axis=1)
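The row-position approach can be sketched end to end as follows. The frame and the column names 'value' and 'other' are hypothetical, and building one aggregation dict is just one possible way to get the same result as the concat above.

```python
import numpy as np
import pandas as pd

# Illustrative frame: 'value' and 'other' are hypothetical column names
df = pd.DataFrame({
    'date': pd.date_range('2017-05-01', periods=18, freq='D'),
    'value': np.arange(1, 19),
    'other': np.ones(18),
})

# One group label per block of 6 consecutive rows
key = np.arange(len(df)) // 6

# First date of each block, sum of every other column
agg_spec = {c: ('first' if c == 'date' else 'sum') for c in df.columns}
result = df.groupby(key).agg(agg_spec)
print(result)
```

Because the key depends only on row order, the frame should be sorted by date before grouping if the blocks are meant to be chronological.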
