Python Pandas – Groupby Apply or Aggregate with Custom Function with Multiple Inputs

I want to apply a custom function to a pandas groupby object.

This works when the custom function takes a single input, the grouped values.

I have a DataFrame like this:

a     b     c      value
a1    b1    c1      v1
a2    b2    c2      v2
a3    b3    c3      v3

Working version:

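import numpy as np

def cpk(a):
    # Flatten the group's values into a 1-D array
    arr = np.asarray(a)
    arr = arr.ravel()
    sigma = np.std(arr)  # population standard deviation (ddof=0)
    m = np.mean(arr)

    # Distance from the mean to the hard-coded limits (150 and 50), in units of 3*sigma
    Cpu = float(150 - m) / (3*sigma)
    Cpl = float(m - 50) / (3*sigma)
    Cpk = np.min([Cpu, Cpl])
    return Cpk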


df_cpk = df_result.groupby(['a','b','c'])['value'].agg(cpk).reset_index()

As the code above shows, the grouped 'value' column is passed automatically as the input to cpk.

What I want to know is how to apply the function below:

def cpk2(a,lsl,usl):
    arr = np.asarray(a)
    arr = arr.ravel()
    sigma = np.std(arr)
    m = np.mean(arr)

    Cpu = float(usl - m) / (3*sigma)
    Cpl = float(m - lsl) / (3*sigma)
    Cpk = np.min([Cpu, Cpl])
    return Cpk

# df_cpk = df_result.groupby(['a','b','c'])['value'].agg(cpk2(?,?,?)).reset_index()

Here the function takes multiple inputs, only one of which is the grouped values.
Is there a simple way to do this?

Best Answer

Since the two other inputs are constants, you can simply use a lambda expression:

df_cpk = df_result.groupby(['a','b','c'])['value'].agg(lambda x: cpk2(x, 50, 150)).reset_index()
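
For completeness, here is a small self-contained sketch of the lambda approach, next to functools.partial as an alternative way to fix the constant arguments. The sample values in df_result are made up purely for illustration, and the 50/150 limits are the ones from the question; treat this as one possible way to do it rather than the only one.

```python
from functools import partial

import numpy as np
import pandas as pd


def cpk2(a, lsl, usl):
    # Same calculation as in the question, with the limits as parameters
    arr = np.asarray(a).ravel()
    sigma = np.std(arr)
    m = np.mean(arr)
    Cpu = float(usl - m) / (3 * sigma)
    Cpl = float(m - lsl) / (3 * sigma)
    return np.min([Cpu, Cpl])


# Hypothetical sample data (not from the original post)
df_result = pd.DataFrame({
    "a": ["a1", "a1", "a1", "a2", "a2", "a2"],
    "b": ["b1", "b1", "b1", "b2", "b2", "b2"],
    "c": ["c1", "c1", "c1", "c2", "c2", "c2"],
    "value": [90.0, 100.0, 110.0, 70.0, 80.0, 90.0],
})

grouped = df_result.groupby(["a", "b", "c"])["value"]

# Option 1: wrap the call in a lambda, as in the answer above
via_lambda = grouped.agg(lambda x: cpk2(x, 50, 150)).reset_index()

# Option 2: bind the constants up front with functools.partial
via_partial = grouped.agg(partial(cpk2, lsl=50, usl=150)).reset_index()

print(via_lambda)
print(via_partial)
```

Both options call cpk2 once per group with that group's 'value' entries as the first argument, so they should give the same result; partial can be handy when the same limits are reused in several places.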