Pandas loc vs iloc

Suraj Joshi, January 30, 2023

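Use the .loc() Method to Select a Specific Value by Index and Column Label
Use the .loc() Method to Select Specific Columns From a DataFrame
Use the .loc() Method to Filter Rows by Applying a Condition to a Column
Use iloc to Filter Rows by Index
Filter Specific Rows and Columns From a DataFrame
Use the iloc Method to Filter a Range of Rows and Columns From a DataFrame
Pandas loc vs iloc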

This tutorial explains how to filter data from a Pandas DataFrame using loc and iloc in Python. With iloc, we select elements by the integer positions of the rows and columns; with loc, we select them by the row and column labels.

To demonstrate data filtering, we will use the DataFrame built in the example below.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print(student_df)

Output:

        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

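Use the `.loc()` Method to Select a Specific Value by Index and Column Label

We can pass an index label and a column label to .loc() to extract the value stored at that row and column. Note that .loc is indexed with square brackets rather than called with parentheses.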

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The Grade of student with Roll No. 504 is:")
value = student_df.loc[504, "Grade"]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

The Grade of student with Roll No. 504 is:
A-

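This selects the value in the DataFrame whose index label is 504 and whose column label is Grade. The first argument to .loc() is the index label, and the second is the column label.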
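Because .loc() works on labels, an integer that is a valid row position but not an index label is rejected. The following is a minimal sketch on a shortened, hypothetical version of the student DataFrame; the variable names `grade` and `missing_label_raised` are illustrative only.

```python
import pandas as pd

# Shortened version of the tutorial's DataFrame (assumption: three rows are enough here).
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham"],
        "Grade": ["A", "B-", "B+"],
    },
    index=[501, 502, 503],
)

# Label-based lookup: 502 is an index label, "Grade" is a column label.
grade = student_df.loc[502, "Grade"]
print(grade)

# 1 is a valid integer position but not an index label, so .loc raises KeyError.
try:
    student_df.loc[1, "Grade"]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True
print(missing_label_raised)
```

Here the first lookup returns the grade for roll number 502, while the second fails because no row is labeled 1.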

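Use the `.loc()` Method to Select Specific Columns From a DataFrame

We can also use .loc() to select only the columns we need from a DataFrame. To do so, we pass a list of the required column names as the second argument to .loc().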

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The name and age of students in the DataFrame are:")
value = student_df.loc[:, ["Name", "Age"]]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

The name and age of students in the DataFrame are:
        Name  Age
501    Alice   17
502   Steven   20
503  Neesham   18
504    Chris   21
505    Alice   15

The first argument to .loc() is :, which stands for all rows in the DataFrame. The second argument, ["Name", "Age"], tells .loc() to select only the Name and Age columns.
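Besides a list of names, .loc() also accepts a label slice for the columns. As a small sketch on the same student data, the slice "Name":"City" selects every column from Name through City; unlike ordinary Python slices, both endpoints of a label slice are included. The variable name `name_to_city` is illustrative only.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# All rows, and the columns from "Name" through "City" (both endpoints included).
name_to_city = student_df.loc[:, "Name":"City"]
print(list(name_to_city.columns))
```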

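Use the `.loc()` Method to Filter Rows by Applying a Condition to a Column

We can also use .loc() to filter the rows whose column values satisfy a given condition.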

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Students with Grade A are:")
value = student_df.loc[student_df.Grade == "A"]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Students with Grade A are:
      Name  Age      City Grade
501  Alice   17  New York     A
505  Alice   15    Austin     A

It selects all students in the DataFrame whose grade is A.
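Conditions can be combined, and a column list can be supplied in the same .loc() call. The sketch below, which is an illustration rather than part of the original example, keeps the students with grade A who are also older than 16 and returns only their Name and City. Each condition is wrapped in parentheses and joined with &; the names `mask` and `result` are illustrative.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# Boolean mask: True only where both conditions hold for a row.
mask = (student_df["Grade"] == "A") & (student_df["Age"] > 16)

# Rows come from the mask, columns from the label list.
result = student_df.loc[mask, ["Name", "City"]]
print(result)
```

Only roll number 501 satisfies both conditions, since the other grade-A student is 15.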

Use `iloc` to Filter Rows by Index

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("2nd and 3rd rows in the DataFrame:")
filtered_rows = student_df.iloc[[1, 2]]
print(filtered_rows)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

2nd and 3rd rows in the DataFrame:
        Name  Age      City Grade
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+

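It selects the 2nd and 3rd rows of the DataFrame.

We pass the integer positions of the rows to iloc to select them from the DataFrame. Here, the second and third rows have the integer positions 1 and 2, because positions start at 0.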
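The shape of the result depends on how the position is passed. As a brief sketch on the same data, a single integer returns one row as a Series, a one-element list returns a DataFrame, and a negative integer counts from the end. The variable names are illustrative only.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# A single integer position returns the row as a Series.
row_series = student_df.iloc[1]

# A list with one position returns a one-row DataFrame.
row_frame = student_df.iloc[[1]]

# Negative positions count from the end, as with Python lists.
last_row = student_df.iloc[-1]

print(type(row_series).__name__, type(row_frame).__name__)
print(last_row["City"])
```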

Filter Specific Rows and Columns From a DataFrame

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame:
        Name Grade
502   Steven    B-
503  Neesham    B+
504    Chris    A-

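It selects the first and last columns, Name and Grade, for the 2nd, 3rd, and 4th rows of the DataFrame. We pass a list of integer row positions as the first argument to iloc and a list of integer column positions as the second.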
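When only the column names are known, their integer positions can be looked up before calling iloc. The sketch below uses `Index.get_indexer` for that purpose; this is one possible approach and not something the original example does, and `col_positions` is an illustrative name.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# Translate column names into integer positions.
col_positions = student_df.columns.get_indexer(["Name", "Grade"])
print(col_positions.tolist())

# Use the positions with iloc, together with row positions 1, 2, and 3.
filtered_values = student_df.iloc[[1, 2, 3], col_positions]
print(filtered_values)
```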

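Use the `iloc` Method to Filter a Range of Rows and Columns From a DataFrame

To filter a range of rows and columns, we can use slices, passing one slice for the rows and one for the columns to iloc.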

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[1:4, 0:2]
print(filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame:
        Name  Age
502   Steven   20
503  Neesham   18
504    Chris   21

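It selects the 2nd, 3rd, and 4th rows and the 1st and 2nd columns of the DataFrame. 1:4 covers the rows at positions 1 through 3; the end value 4 is exclusive. Likewise, 0:2 covers the columns at positions 0 and 1.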
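Slices behave differently in the two indexers: iloc excludes the end position, whereas a label slice in loc includes the end label. As a short sketch on the same data, the two selections below return the same rows and columns even though the slice bounds look different. The names `by_position` and `by_label` are illustrative.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# Positional slices: end positions 4 and 2 are excluded.
by_position = student_df.iloc[1:4, 0:2]

# Label slices: end labels 504 and "Age" are included.
by_label = student_df.loc[502:504, "Name":"Age"]

print(by_position.equals(by_label))
```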

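Pandas `loc` vs `iloc`

To select rows and columns from a DataFrame with loc, we pass the labels of the rows and columns we want. Likewise, to select values with iloc, we pass the integer positions of those rows and columns.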

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame using loc:")
loc_filtered_values = student_df.loc[[502, 503, 504], ["Name", "Age"]]
print(loc_filtered_values)
print("")
print("Filtered values from the DataFrame using iloc:")
iloc_filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(iloc_filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame using loc:
        Name  Age
502   Steven   20
503  Neesham   18
504    Chris   21

Filtered values from the DataFrame using iloc:
        Name Grade
502   Steven    B-
503  Neesham    B+
504    Chris    A-

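It shows how loc and iloc select values from the same DataFrame. Both calls pick the same three rows, but the columns differ: loc selects Name and Age by label, while iloc selects Name and Grade at positions 0 and 3. The positional equivalent of the loc call would be student_df.iloc[[1, 2, 3], [0, 1]].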
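The difference between labels and positions becomes visible once the row order changes. In the sketch below, which extends the example with an assumed sort by Age, iloc[0] follows the new order and returns the youngest student, while loc[501] still returns the row labeled 501. The variable names are illustrative only.

```python
import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

# Sorting reorders the rows but keeps each row's index label.
sorted_df = student_df.sort_values("Age")

# Position 0 is now the youngest student (roll number 505).
first_by_position = sorted_df.iloc[0]

# Label 501 still refers to the same student as before the sort.
row_by_label = sorted_df.loc[501]

print(first_by_position.name, first_by_position["Age"])
print(row_by_label.name, row_by_label["Age"])
```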

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
