Pandas loc vs iloc
- Select Particular Value From DataFrame Specifying Index and Column Label Using .loc() Method
- Select Particular Columns From the DataFrame Using the .loc() Method
- Filter Rows by Applying Condition to Columns Using .loc() Method
- Filter Rows With Indices Using iloc
- Filter Particular Rows and Columns From the DataFrame
- Filter Range of Rows and Columns From DataFrame Using iloc
- Pandas loc vs iloc
This tutorial explains how we can filter data from a Pandas DataFrame using loc and iloc in Python. To filter entries with iloc, we use the integer positions of rows and columns; to filter entries with loc, we use row and column labels. Both are indexers, so they take square brackets rather than parentheses.
To demonstrate data filtering using loc and iloc, we will use the DataFrame described in the following example.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print(student_df)
Output:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
Select Particular Value From DataFrame Specifying Index and Column Label Using .loc() Method
We can pass an index label and a column label to .loc, inside square brackets, to extract the value stored at that row and column.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The Grade of student with Roll No. 504 is:")
value = student_df.loc[504, "Grade"]
print(value)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
The Grade of student with Roll No. 504 is:
A-
It selects the value in the DataFrame with index label 504 and column label Grade. The first argument to .loc is the index label, and the second argument is the column label.
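Because loc works with labels, the row position cannot stand in for the index label. The following is a minimal sketch of that point, assuming a trimmed copy of the student DataFrame; the names grade_504 and lookup_failed are introduced here only for illustration.

```python
import pandas as pd

# Trimmed copy of the tutorial's DataFrame (only two columns kept).
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# 504 is an index label, so this lookup succeeds.
grade_504 = student_df.loc[504, "Grade"]
print(grade_504)

# 3 is the position of that row, not one of its labels,
# so loc cannot find it and raises KeyError.
try:
    student_df.loc[3, "Grade"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```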
Select Particular Columns From the DataFrame Using the .loc() Method
We can also select specific columns from the DataFrame using .loc. We pass the list of required column labels as the second argument to .loc to select those columns.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The name and age of students in the DataFrame are:")
value = student_df.loc[:, ["Name", "Age"]]
print(value)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
The name and age of students in the DataFrame are:
Name Age
501 Alice 17
502 Steven 20
503 Neesham 18
504 Chris 21
505 Alice 15
The first argument to .loc is :, which denotes all the rows in the DataFrame. Similarly, we pass ["Name", "Age"] as the second argument, which selects only the Name and Age columns from the DataFrame.
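The form of the second argument also decides the type of the result. As a small sketch (the names name_series and name_frame are illustrative, and the DataFrame is a trimmed copy of the one above), a single column label returns a Series, while a list of labels returns a DataFrame even when the list has one entry.

```python
import pandas as pd

# Trimmed copy of the tutorial's DataFrame.
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
    },
    index=[501, 502, 503, 504, 505],
)

# A single label selects one column as a Series.
name_series = student_df.loc[:, "Name"]

# A list of labels selects a DataFrame, even with one label in the list.
name_frame = student_df.loc[:, ["Name"]]

print(type(name_series).__name__)
print(type(name_frame).__name__)
```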
Filter Rows by Applying Condition to Columns Using .loc() Method
We can also filter rows that satisfy a condition on column values using .loc.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Students with Grade A are:")
value = student_df.loc[student_df.Grade == "A"]
print(value)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
Students with Grade A are:
Name Age City Grade
501 Alice 17 New York A
505 Alice 15 Austin A
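It selects all the students in the DataFrame with grade A. The comparison student_df.Grade == "A" produces a boolean Series, and .loc keeps the rows where that Series is True.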
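A condition can also be combined with a column selection in the same loc call. The sketch below is one possible variation, not part of the original example: it joins two conditions with & (each wrapped in parentheses) and keeps only the Name and City columns. The name older_b_students is illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# Rows where Age is at least 18 AND Grade starts with "B";
# the second argument restricts the result to two columns.
is_adult = student_df["Age"] >= 18
has_b_grade = student_df["Grade"].str.startswith("B")
older_b_students = student_df.loc[is_adult & has_b_grade, ["Name", "City"]]
print(older_b_students)
```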
Filter Rows With Indices Using iloc
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("2nd and 3rd rows in the DataFrame:")
filtered_rows = student_df.iloc[[1, 2]]
print(filtered_rows)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
2nd and 3rd rows in the DataFrame:
Name Age City Grade
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
It selects the second and third rows from the DataFrame.
We pass the rows' integer positions to iloc to select rows from the DataFrame. Here, the integer positions of the second and third rows are 1 and 2 respectively, as positions start from 0.
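iloc also accepts a single integer instead of a list, in which case it returns that row as a Series, and negative positions count from the end. A short sketch, assuming a trimmed copy of the same DataFrame (second_row and last_row are illustrative names):

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
    },
    index=[501, 502, 503, 504, 505],
)

# A single integer returns one row as a Series named after its index label.
second_row = student_df.iloc[1]
print(second_row.name, second_row["Name"])

# Negative positions count from the end, so -1 is the last row.
last_row = student_df.iloc[-1]
print(last_row.name, last_row["City"])
```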
Filter Particular Rows and Columns From the DataFrame
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(filtered_values)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
Filtered values from the DataFrame:
Name Grade
502 Steven B-
503 Neesham B+
504 Chris A-
It selects the first and last columns, i.e. Name and Grade, of the second, third, and fourth rows from the DataFrame. We pass the list of integer row positions as the first argument and the list of integer column positions as the second argument to iloc.
Filter Range of Rows and Columns From DataFrame Using iloc
To select a range of rows and columns, we can use slices and pass one slice for the rows and one for the columns to iloc.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[1:4, 0:2]
print(filtered_values)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
Filtered values from the DataFrame:
Name Age
502 Steven 20
503 Neesham 18
504 Chris 21
It selects the second, third, and fourth rows and the first and second columns from the DataFrame. 1:4 represents the rows at positions 1 to 3; the end position 4 is excluded from the range. Similarly, 0:2 represents the columns at positions 0 and 1.
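Slices behave differently with loc: a label slice includes both of its endpoints, while an iloc slice excludes the end position. The sketch below compares the two on the same DataFrame; by_position and by_label are illustrative names.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# iloc: positions 1, 2, 3 and columns 0, 1 (end positions are excluded).
by_position = student_df.iloc[1:4, 0:2]

# loc: labels 502 through 504 and Name through Age (both ends are included).
by_label = student_df.loc[502:504, "Name":"Age"]

print(by_position.equals(by_label))
```

Both selections cover the same three rows and two columns, even though the iloc slice ends at 4 and the loc slice ends at 504.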
Pandas loc vs iloc
To select rows and columns from the DataFrame using loc, we pass the labels of the rows and columns we want. Similarly, to select values using iloc, we pass the integer positions of the rows and columns we want.
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame using loc:")
loc_filtered_values = student_df.loc[[502, 503, 504], ["Name", "Age"]]
print(loc_filtered_values)
print("")
print("Filtered values from the DataFrame using iloc:")
iloc_filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(iloc_filtered_values)
Output:
The DataFrame of students with marks is:
Name Age City Grade
501 Alice 17 New York A
502 Steven 20 Portland B-
503 Neesham 18 Boston B+
504 Chris 21 Seattle A-
505 Alice 15 Austin A
Filtered values from the DataFrame using loc:
Name Age
502 Steven 20
503 Neesham 18
504 Chris 21
Filtered values from the DataFrame using iloc:
Name Grade
502 Steven B-
503 Neesham B+
504 Chris A-
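It shows how we can select the same rows from the DataFrame using loc and iloc. Note that the two calls above pick different columns: the loc call selects Name and Age, while the iloc call selects positions 0 and 3, i.e. Name and Grade. Passing [0, 1] as the column positions to iloc would select Name and Age, matching the loc call.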
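To check that a loc selection and an iloc selection really match, the labels can be translated into positions with Index.get_indexer and the two results compared with DataFrame.equals. This is a sketch of one way to do that, not part of the original example; row_positions, col_positions, via_loc, and via_iloc are illustrative names.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

row_labels = [502, 503, 504]
col_labels = ["Name", "Age"]

# Translate labels into integer positions.
row_positions = student_df.index.get_indexer(row_labels)
col_positions = student_df.columns.get_indexer(col_labels)
print(list(row_positions), list(col_positions))

# Label-based and position-based selections of the same cells.
via_loc = student_df.loc[row_labels, col_labels]
via_iloc = student_df.iloc[row_positions, col_positions]
print(via_loc.equals(via_iloc))
```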
Suraj Joshi is a backend software engineer at Matrice.ai.