
Applying Custom Function While Grouping Returns NaN

Given a dict, performances, storing Series of the kind:

2015-02-28           NaN
2015-03-02    100.000000
2015-03-03     98.997117
2015-03-04     98.909215
2015-03-05     99.909979
201

Solution 1:

If each month has at least one non-NaN value, use first_valid_index:

print (df.b.groupby(df.index.month).apply(lambda x: x[x.first_valid_index()]))
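
To see why the one-liner needs at least one valid value per month, here is a minimal sketch on a hypothetical toy Series (the names s and all_nan are illustrative, not from the original data). first_valid_index returns the label of the first non-NaN entry, and None when there is no such entry:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, for illustration only.
s = pd.Series(
    [np.nan, 100.0, 98.997117],
    index=pd.to_datetime(["2015-02-28", "2015-03-02", "2015-03-03"]),
)

label = s.first_valid_index()  # label of the first non-NaN entry
print(label)                   # 2015-03-02 00:00:00
print(s[label])                # 100.0

# With only NaN values there is no valid label to index with.
all_nan = pd.Series([np.nan, np.nan])
print(all_nan.first_valid_index())  # None
```

Because an all-NaN group yields None instead of a label, the lambda above has nothing sensible to look up for that group.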

A more general solution, which returns NaN if all values in a month are NaN:

def f(x):
    if x.first_valid_index() is None:
        return np.nan
    else:
        return x[x.first_valid_index()]

print (df.b.groupby(df.index.month).apply(f))

2      NaN
3    100.0
Name: b, dtype: float64

If you want to group by year and month, use to_period:

print(df.b.groupby(df.index.to_period('M')).apply(f))

2015-02      NaN
2015-03    100.0
Freq: M, Name: b, dtype: float64
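
The difference between the two grouping keys only shows up when the data spans more than one year. The following sketch uses hypothetical two-year data and a helper named first_valid (both are assumptions for illustration) to show that grouping by index.month merges the same month of different years, while to_period('M') keeps them apart:

```python
import numpy as np
import pandas as pd

# Hypothetical data spanning two years, for illustration only.
idx = pd.to_datetime(["2015-03-02", "2015-03-03", "2016-03-01", "2016-03-02"])
s = pd.Series([100.0, 98.9, np.nan, 105.0], index=idx)

def first_valid(x):
    """Return the first non-NaN value of x, or NaN if there is none."""
    i = x.first_valid_index()
    return np.nan if i is None else x[i]

# Month number only: March 2015 and March 2016 fall into one group.
by_month = s.groupby(s.index.month).apply(first_valid)

# Year-month period: each March is its own group.
by_period = s.groupby(s.index.to_period("M")).apply(first_valid)

print(len(by_month))   # 1
print(len(by_period))  # 2
```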

Sample:

import pandas as pd
import numpy as np

df = pd.DataFrame({'b': pd.Series({
    pd.Timestamp('2015-07-19 00:00:00'): 102.67248199999999,
    pd.Timestamp('2015-04-05 00:00:00'): np.nan,
    pd.Timestamp('2015-02-25 00:00:00'): np.nan,
    pd.Timestamp('2015-04-09 00:00:00'): 100.50277199999999,
    pd.Timestamp('2015-06-18 00:00:00'): 102.436339,
    pd.Timestamp('2015-06-16 00:00:00'): 102.669184,
    pd.Timestamp('2015-04-10 00:00:00'): 101.68531400000001,
    pd.Timestamp('2015-05-12 00:00:00'): 102.42723700000001,
    pd.Timestamp('2015-07-20 00:00:00'): 102.23838600000001,
    pd.Timestamp('2015-06-17 00:00:00'): np.nan,
    pd.Timestamp('2015-08-23 00:00:00'): 101.460082,
    pd.Timestamp('2015-03-03 00:00:00'): 98.997117000000003,
    pd.Timestamp('2015-03-02 00:00:00'): 100.0,
    pd.Timestamp('2015-05-11 00:00:00'): 102.518433,
    pd.Timestamp('2015-03-04 00:00:00'): 98.909215000000003,
    pd.Timestamp('2015-05-13 00:00:00'): 103.424257,
    pd.Timestamp('2015-04-06 00:00:00'): np.nan,
})})
# Sort by date so that the "first" value in each group is the earliest one
# (a Series built from a dict may keep the dict's insertion order).
df = df.sort_index()
print(df)

                     b
2015-02-25         NaN
2015-03-02  100.000000
2015-03-03   98.997117
2015-03-04   98.909215
2015-04-05         NaN
2015-04-06         NaN
2015-04-09  100.502772
2015-04-10  101.685314
2015-05-11  102.518433
2015-05-12  102.427237
2015-05-13  103.424257
2015-06-16  102.669184
2015-06-17         NaN
2015-06-18  102.436339
2015-07-19  102.672482
2015-07-20  102.238386
2015-08-23  101.460082

def f(x):
    if x.first_valid_index() is None:
        return np.nan
    else:
        return x[x.first_valid_index()]

print(df.b.groupby(df.index.to_period('M')).apply(f))

2015-02           NaN
2015-03    100.000000
2015-04    100.502772
2015-05    102.518433
2015-06    102.669184
2015-07    102.672482
2015-08    101.460082
Freq: M, Name: b, dtype: float64
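
As a possible alternative to the custom function (this is a different technique from the answer above, not the author's method), pandas' built-in GroupBy.first() skips NaN by default and should give NaN for an all-NaN group. A minimal sketch on a hypothetical subset of the sample dates:

```python
import numpy as np
import pandas as pd

# Hypothetical subset of the sample data, already in date order.
idx = pd.to_datetime(
    ["2015-02-25", "2015-03-02", "2015-03-03", "2015-04-05", "2015-04-09"]
)
s = pd.Series([np.nan, 100.0, 98.997117, np.nan, 100.502772], index=idx)

# first() returns the first non-NaN value in each group.
result = s.groupby(s.index.to_period("M")).first()
print(result)
```

As with the custom function, "first" here means first in row order, so the data should be sorted by date beforehand.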
