Applying Custom Function While Grouping Returns NaN
Given a dict, performances, storing Series of kind: 2015-02-28 NaN 2015-03-02 100.000000 2015-03-03 98.997117 2015-03-04 98.909215 2015-03-05 99.909979 201
Solution 1:
If each month has at least one non-NaN value, use first_valid_index:
print (df.b.groupby(df.index.month).apply(lambda x: x[x.first_valid_index()]))
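The one-liner above relies on every group containing a valid value. A minimal check of why that matters, using made-up values: `first_valid_index()` returns `None` for a Series with no valid entries, so there is no label to index with.

```python
import numpy as np
import pandas as pd

# Illustrative data only: one Series with a valid value, one without.
mixed = pd.Series([np.nan, 100.0, 98.5])
all_nan = pd.Series([np.nan, np.nan])

# Label of the first non-NaN entry in the mixed Series.
mixed_label = mixed.first_valid_index()

# No valid entry exists here, so pandas returns None instead of a label.
all_nan_label = all_nan.first_valid_index()

print(mixed_label, all_nan_label)
```

Any month made up entirely of NaN values therefore needs an explicit check before indexing.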
A more general solution, which returns NaN if all values in a month are NaN:
def f(x):
    if x.first_valid_index() is None:
        return np.nan
    else:
        return x[x.first_valid_index()]
print (df.b.groupby(df.index.month).apply(f))
2 NaN
3 100.0
Name: b, dtype: float64
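As a possible alternative to the custom function, the built-in `GroupBy.first()` also picks the first non-null value in each group and gives NaN for a group with none. The sketch below uses a small made-up Series (not the data from the question) and a helper named `first_valid`, which is just the function above under a different name, to compare the two approaches.

```python
import numpy as np
import pandas as pd

# Illustrative data: February is all NaN, March starts with a NaN.
idx = pd.to_datetime(
    ["2015-02-27", "2015-02-28", "2015-03-01", "2015-03-02", "2015-03-03"]
)
s = pd.Series([np.nan, np.nan, np.nan, 100.0, 98.5], index=idx, name="b")


def first_valid(x):
    """Return the first non-NaN value of x, or NaN if there is none."""
    label = x.first_valid_index()
    return np.nan if label is None else x[label]


# Custom function applied per calendar month.
via_apply = s.groupby(s.index.month).apply(first_valid)

# Built-in aggregation: first non-null value per group.
via_first = s.groupby(s.index.month).first()

print(via_apply)
print(via_first)
```

Both results should hold NaN for month 2 and 100.0 for month 3. The built-in version avoids a Python-level function call per group, which may matter on larger data.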
To group by year and month, use to_period:
print (df.b.groupby(df.index.to_period('M')).apply(f))
2015-02 NaN
2015-03 100.0
Freq: M, Name: b, dtype: float64
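The difference between the two groupings only shows up once the data spans more than one year. A small sketch with made-up values: grouping by `index.month` merges March 2015 and March 2016 into a single group, while `to_period('M')` keeps them apart. The helper `first_valid` is the same logic as `f` above.

```python
import numpy as np
import pandas as pd

# Illustrative data covering March of two different years.
idx = pd.to_datetime(["2015-03-02", "2015-03-03", "2016-03-01", "2016-03-02"])
s = pd.Series([100.0, 98.5, np.nan, 105.0], index=idx, name="b")


def first_valid(x):
    """Return the first non-NaN value of x, or NaN if there is none."""
    label = x.first_valid_index()
    return np.nan if label is None else x[label]


# Month number only: both years fall into group 3.
by_month = s.groupby(s.index.month).apply(first_valid)

# Year-month periods: one group per calendar month of each year.
by_period = s.groupby(s.index.to_period("M")).apply(first_valid)

print(by_month)
print(by_period)
```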
Sample:
import pandas as pd
import numpy as np
df = pd.DataFrame({'b': pd.Series({ pd.Timestamp('2015-07-19 00:00:00'): 102.67248199999999, pd.Timestamp('2015-04-05 00:00:00'): np.nan, pd.Timestamp('2015-02-25 00:00:00'): np.nan, pd.Timestamp('2015-04-09 00:00:00'): 100.50277199999999, pd.Timestamp('2015-06-18 00:00:00'): 102.436339, pd.Timestamp('2015-06-16 00:00:00'): 102.669184, pd.Timestamp('2015-04-10 00:00:00'): 101.68531400000001, pd.Timestamp('2015-05-12 00:00:00'): 102.42723700000001, pd.Timestamp('2015-07-20 00:00:00'): 102.23838600000001, pd.Timestamp('2015-06-17 00:00:00'): np.nan, pd.Timestamp('2015-08-23 00:00:00'): 101.460082, pd.Timestamp('2015-03-03 00:00:00'): 98.997117000000003, pd.Timestamp('2015-03-02 00:00:00'): 100.0, pd.Timestamp('2015-05-11 00:00:00'): 102.518433, pd.Timestamp('2015-03-04 00:00:00'): 98.909215000000003, pd.Timestamp('2015-05-13 00:00:00'): 103.424257, pd.Timestamp('2015-04-06 00:00:00'): np.nan})})
print (df)
b
2015-02-25 NaN
2015-03-02 100.000000
2015-03-03 98.997117
2015-03-04 98.909215
2015-04-05 NaN
2015-04-06 NaN
2015-04-09 100.502772
2015-04-10 101.685314
2015-05-11 102.518433
2015-05-12 102.427237
2015-05-13 103.424257
2015-06-16 102.669184
2015-06-17 NaN
2015-06-18 102.436339
2015-07-19 102.672482
2015-07-20 102.238386
2015-08-23 101.460082
def f(x):
    if x.first_valid_index() is None:
        return np.nan
    else:
        return x[x.first_valid_index()]
print (df.b.groupby(df.index.to_period('M')).apply(f))
2015-02 NaN
2015-03 100.000000
2015-04 100.502772
2015-05 102.518433
2015-06 102.669184
2015-07 102.672482
2015-08 101.460082
Freq: M, Name: b, dtype: float64