Apply and Transform With Groupby in Pandas

Fariba Laiq, February 15, 2024


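groupby() is a powerful pandas method that splits data into groups according to some criterion, so that we can run calculations on each group and analyze the data more effectively.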
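To make the splitting step concrete, here is a minimal sketch that groups a small in-memory DataFrame and walks over the resulting groups. The student names and marks are made up for illustration and are not taken from the article's CSV file.

```python
import pandas as pd

# Hypothetical sample data; the names and marks are made up for illustration.
students = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "John", "Mina"],
        "Department": ["CS", "CS", "EE", "EE"],
        "Marks": [85, 90, 70, 80],
    }
)

# groupby() only records how the rows should be split; nothing is computed yet.
grouped = students.groupby("Department")

# Iterating yields (group key, sub-DataFrame) pairs.
group_sizes = {}
for department, group in grouped:
    group_sizes[department] = len(group)
    print(department)
    print(group)
```

Each sub-DataFrame keeps the original row labels, which is what allows results to be aligned back to the source rows later.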

Difference Between `apply()` and `transform()` in Python

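apply() and transform() are two methods commonly used together with groupby(). They differ in what they receive as an argument and in what they return.

apply() receives each group as a DataFrame and can return a scalar, a Series, or a DataFrame. It therefore lets us operate on the columns, the rows, or the complete DataFrame of each group.

transform() works column by column: the function receives each column of a group as a Series and must return a result with the same length as its input (or a scalar, which pandas broadcasts across the group). Therefore, we can only operate on one column of a group at a time.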
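The difference in return shape is easiest to see side by side. The sketch below runs the same per-group calculation (highest marks minus lowest marks) through both methods on a small, made-up DataFrame; the data and variable names are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame(
    {
        "Department": ["CS", "CS", "EE", "EE", "EE"],
        "Marks": [85, 90, 70, 80, 75],
    }
)

grouped_marks = df.groupby("Department")["Marks"]

# apply() with a function that returns a scalar: one value per group.
per_group = grouped_marks.apply(lambda marks: marks.max() - marks.min())

# transform() with the same function: the scalar is broadcast to every row
# of the group, so the result lines up with the original DataFrame.
per_row = grouped_marks.transform(lambda marks: marks.max() - marks.min())

print(per_group)
print(per_row)
```

The apply() result is indexed by department, while the transform() result shares the index of df and can be assigned directly as a new column.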

Use the `apply()` Method in Python Pandas

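In the following code, we load a CSV file that contains student records and use apply() to show the highest marks in each department.

First, we group the rows by department with groupby(). Then apply() calls our function on each group, and the function uses max() to find the highest marks in that department.

The output is returned as a Series. We can also operate on multiple columns or on the entire DataFrame.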

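# Python 3.x
import pandas as pd

# Load the student records (expects Department and Marks columns).
df = pd.read_csv("Student.csv")
print(df)


def f(my_df):
    # my_df is the DataFrame of rows for one department.
    return my_df["Marks"].max()


# One value per department, returned as a Series.
print(df.groupby("Department").apply(f))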

Output:

[Figure: using groupby() with apply() in Python Pandas]
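Because apply() hands the function a whole DataFrame per group, the function can combine several columns at once. As a rough sketch of that idea, the example below looks up the name of the top-scoring student in each department. The in-memory data is a hypothetical stand-in for Student.csv, and the Name column and the top_student helper are assumptions that do not appear in the original example.

```python
import pandas as pd

# Hypothetical stand-in for Student.csv; names and marks are made up.
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "John", "Mina", "Omar"],
        "Department": ["CS", "CS", "EE", "EE", "EE"],
        "Marks": [85, 90, 70, 80, 75],
    }
)


def top_student(group):
    # group is a DataFrame holding the selected columns for one department.
    return group.loc[group["Marks"].idxmax(), "Name"]


# Selecting the columns first keeps the grouping column out of each group.
top_by_department = df.groupby("Department")[["Name", "Marks"]].apply(top_student)
print(top_by_department)
```

A calculation like this needs two columns at the same time (Marks to find the row, Name to report it), which is why apply() suits it better than a column-by-column method.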

Use the `transform()` Method in Python Pandas

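In the next example, we add another column, Mean_Marks, to the DataFrame. We group the rows by department with groupby(), select the Marks column, and pass the string "mean" to transform() to compute each department's average marks.

The output shows the department average next to every student's row.

Here, transform() runs on a single column, which is Marks in our case.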

# Python 3.x
import pandas as pd

df = pd.read_csv("Student.csv")
print(df)

# transform("mean") returns one value per row, aligned with df's index.
df["Mean_Marks"] = df.groupby("Department")["Marks"].transform("mean")
print(df)

Output:

[Figure: using groupby() with transform() in Python Pandas]
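For readers without Student.csv, the same step can be tried on a small in-memory DataFrame. This is only a sketch: the data is made up, and the extra Diff_From_Mean column is an assumption added to show why a row-aligned result is convenient.

```python
import pandas as pd

# Hypothetical stand-in for Student.csv; names and marks are made up.
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "John", "Mina", "Omar"],
        "Department": ["CS", "CS", "EE", "EE", "EE"],
        "Marks": [85, 90, 70, 80, 75],
    }
)

# One mean per department, repeated on every row of that department.
df["Mean_Marks"] = df.groupby("Department")["Marks"].transform("mean")

# Because the result is row-aligned, it can be combined with other columns.
df["Diff_From_Mean"] = df["Marks"] - df["Mean_Marks"]
print(df)
```

Since transform() keeps the original index, the new columns can be assigned without a separate merge step.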
