There are two major differences between the transform and apply groupby methods.
- apply implicitly passes all the columns for each group as a DataFrame to the custom function, while transform passes each column for each group as a Series to the custom function.
- The custom function passed to apply can return a scalar, a Series, or a DataFrame (or a NumPy array or even a list). The custom function passed to transform must return a sequence (a one-dimensional Series, array, or list) the same length as the group.
So, transform works on just one Series at a time, and apply works on the entire DataFrame at once.
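The two differences above can be sketched as follows. The data frame is hypothetical (made up for illustration, not taken from the quoted answer), and the value columns are selected explicitly so that the grouping column is not handed to the function; treat this as a minimal sketch rather than the canonical usage.

```python
import pandas as pd

# Hypothetical data for illustration: two states, two rows each.
df = pd.DataFrame({
    "State": ["Florida", "Florida", "Texas", "Texas"],
    "a": [1, 3, 4, 5],
    "b": [3, 11, 6, 10],
})
# Select the value columns explicitly so the grouping column is left out.
grouped = df.groupby("State")[["a", "b"]]

# apply: the function receives each group's DataFrame, so it can combine
# columns and return a scalar. The result has one entry per group.
weighted = grouped.apply(lambda g: (g["a"] * g["b"]).sum())

# transform: the function is written for one column at a time and returns
# a sequence as long as the group. The result has the input's shape.
demeaned = grouped.transform(lambda s: s - s.mean())

print(weighted)
print(demeaned)
```

Here apply collapses each group to a single number, while transform keeps all four rows and both columns.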
Source: https://stackoverflow.com/questions/27517425/apply-vs-transform-on-a-group-object#
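The transform function:
1. Operates on only one Series at a time; a function that computes column 'a' minus column 'b' raises an exception.
2. Must return a one-dimensional sequence with the same length (number of rows) as the group.
3. May also return a single scalar, as in .transform(sum); the scalar is then broadcast to every row of the group.
The apply function:
1. Unlike transform, which handles one Series at a time, apply operates on each group's entire DataFrame.
2. apply implicitly passes all the columns of each group, as a DataFrame, to the custom function.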
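Example:
A function that returns a single scalar also works with transform.
The outputs of transform and apply take different forms, though: transform returns as many rows as the input data, whereas apply aggregates to one row per group.
In this case, the apply output conveys the information more clearly.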
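A minimal sketch of that comparison, on hypothetical data. The string alias "sum" is used here in place of the built-in sum mentioned earlier; that substitution is my choice, not something the source prescribes.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "State": ["Florida", "Florida", "Texas", "Texas"],
    "a": [1, 3, 4, 5],
    "b": [3, 11, 6, 10],
})
grouped = df.groupby("State")[["a", "b"]]

# transform: each group's sum is repeated on every row of that group,
# so the output has as many rows as the input.
broadcast = grouped.transform("sum")

# apply: each group collapses to one row of sums.
aggregated = grouped.apply(lambda g: g.sum())

print(broadcast)
print(aggregated)
```

The broadcast form is convenient when the group total has to sit next to each original row (for example, to compute each row's share of the total); the aggregated form is easier to read as a summary.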
The other difference is that transform must return a one-dimensional sequence the same size as the group. In this particular instance, each group has two rows, so transform must return a sequence of two rows. If it does not, an error is raised.
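The length requirement can be sketched on hypothetical data as follows. A single column is selected so that the function unambiguously receives a Series (that selection is my choice), and the error is caught broadly because its type and message may vary between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration: each group has two rows.
df = pd.DataFrame({
    "State": ["Florida", "Florida", "Texas", "Texas"],
    "a": [1, 3, 4, 5],
    "b": [3, 11, 6, 10],
})
col = df.groupby("State")["a"]

# Right length: two values come back for each two-row group.
reversed_in_group = col.transform(lambda s: s.to_numpy()[::-1])

# Wrong length: three values for a two-row group.
try:
    col.transform(lambda s: np.array([1, 2, 3]))
    error_name = None
except Exception as exc:  # exact exception type is an assumption
    error_name = type(exc).__name__

print(reversed_in_group)
print("wrong length raised:", error_name)
```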
Example 2:
The function passed to transform must return a number, a row, or an object with the same shape as its argument. If it returns a number, that number is assigned to every element in the group; if it returns a row, the row is broadcast to all the rows in the group.
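As a hedged illustration of the row case, the sketch below returns the first entry of whatever it is given, so each group's first row is repeated across the whole group. The data is hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "State": ["Florida", "Florida", "Texas", "Texas"],
    "a": [1, 3, 4, 5],
    "b": [3, 11, 6, 10],
})
grouped = df.groupby("State")[["a", "b"]]

# x.iloc[0] is the first value of a column (or the first row of a group);
# transform broadcasts it to every row of that group.
first_rows = grouped.transform(lambda x: x.iloc[0])

print(first_rows)
```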
Reference: https://stackoverflow.com/questions/27517425/apply-vs-transform-on-a-group-object#