Python Pandas Group By – Using Multiple Functions in a Group By with Pandas

aggregategroup-bypandaspython

My data has ages, and also payments per month.

I'm trying to aggregate summing the payments, but without summing the ages (averaging would work).

Is it possible to use different functions for different columns?

Best Answer

You can pass a dictionary to agg with column names as keys and the functions you want as values.

import pandas as pd
import numpy as np

# Create some randomised data
N = 20
date_range = pd.date_range('01/01/2015', periods=N, freq='W')
df = pd.DataFrame({'ages':np.arange(N), 'payments':np.arange(N)*10}, index=date_range)

print(df.head())
#             ages  payments
# 2015-01-04     0         0
# 2015-01-11     1        10
# 2015-01-18     2        20
# 2015-01-25     3        30
# 2015-02-01     4        40

# Apply np.mean to the ages column and np.sum to the payments.
agg_funcs = {'ages':np.mean, 'payments':np.sum}

# Groupby each individual month and then apply the funcs in agg_funcs
grouped = df.groupby(df.index.to_period('M')).agg(agg_funcs)

print(grouped)
#          ages  payments
# 2015-01   1.5        60
# 2015-02   5.5       220
# 2015-03  10.0       500
# 2015-04  14.5       580
# 2015-05  18.0       540
Related Question