How to Use the isin() Function in Pandas DataFrame
-
Use the
isin()
Function to Create DataFrame From Dictionary Objects in Pandas -
Use the
isin()
Function to Filter Pandas DataFrame -
Use the
isin()
Method to Filter Multiple Columns in Pandas Dataframe -
Use the
isin()
Method WithNot (~)
Matching Condition to Filter Pandas Dataframe
We will discuss in this tutorial how to use the like SQL IN
and Not IN
operators to filter pandas DataFrame
. Moreover, we will also show you how to filter a single row/column, filter multiple columns, filter pandas DataFrame
based on conditions using the isin()
function and unary operator (~)
with the help of various examples.
Use the isin()
Function to Create DataFrame From Dictionary Objects in Pandas
The below example DataFrame contains the columns Student Name
, Subject
, Semester
, Marks
. Import pandas library and create a DataFrame.
import pandas as pd
student_record = {
"Student Name": ["Samreena", "Affan", "Mirha", "Asif"],
"Subject": ["SDA", "Ethics", "Web Design", "Web Development"],
"Semester": ["6th", "7th", "5th", "8th"],
"Marks": [100, 90, 80, 70],
}
index_labels = [0, 1, 2, 3]
df = pd.DataFrame(student_record, index=index_labels)
print(df)
Output:
Student Name Subject Semester Marks
0 Samreena SDA 6th 100
1 Affan Ethics 7th 90
2 Mirha Web Design 5th 80
3 Asif Web Development 8th 70
Use the isin()
Function to Filter Pandas DataFrame
We can filter pandas DataFrame
rows using the isin()
method similar to the IN
operator in SQL.
To filter rows, will check the desired elements in a single column. Using the pd.series.isin()
function, we can check whether the search elements are present in the series.
If the element will match in the series, it returns true
otherwise false
.
For example, we want to return rows that contain Web Design
and Web Development
subjects in the Subject
column.
import pandas as pd
student_record = {
"Name": ["Samreena", "Affan", "Mirha", "Asif"],
"Subject": ["SDA", "Ethics", "Web Design", "Web Development"],
"Semester": ["6th", "7th", "5th", "8th"],
"Marks": [100, 90, 80, 70],
}
index_labels = [0, 1, 2, 3]
dataframe = pd.DataFrame(student_record, index=index_labels)
# Find elements in a Column to return rows
subjects_list = ["Web Design", "Web Development"]
dataframe1 = dataframe[dataframe.Subject.isin(subjects_list)]
print(dataframe1)
Output:
Name Subject Semester Marks
2 Mirha Web Design 5th 80
3 Asif Web Development 8th 70
Notice that only those student names are displayed Web Development
and Web Design
subjects are returned.
We can return a Boolean array by displaying true
and false
with Pandas DataFrame row indices.
import pandas as pd
student_record = {
"Name": ["Samreena", "Affan", "Mirha", "Asif"],
"Subject": ["SDA", "Ethics", "Web Design", "Web Development"],
"Semester": ["6th", "7th", "5th", "8th"],
"Marks": [100, 90, 80, 70],
}
index_labels = [0, 1, 2, 3]
dataframe = pd.DataFrame(student_record, index=index_labels)
subjects_list = ["Web Design", "Web Development"]
dataframe1 = dataframe.Subject.isin(subjects_list)
print(dataframe1)
Output:
0 False
1 False
2 True
3 True
Name: Subject, dtype: bool
Use the isin()
Method to Filter Multiple Columns in Pandas Dataframe
We can also apply a filter on multiple columns using the isin()
method. For example, we want to retrieve all those rows having the SDA
subject or the fifth semester.
import pandas as pd
student_record = {
"Name": ["Samreena", "Affan", "Mirha", "Asif"],
"Subject": ["SDA", "Ethics", "Web Design", "Web Development"],
"Semester": ["6th", "7th", "5th", "8th"],
"Marks": [100, 90, 80, 70],
}
index_labels = [0, 1, 2, 3]
dataframe = pd.DataFrame(student_record, index=index_labels)
dataframe1 = dataframe[
dataframe[["Subject", "Semester"]].isin(["SDA", "7th"]).any(axis=1)
]
print(dataframe1)
Output:
Name Subject Semester Marks
0 Samreena SDA 6th 100
1 Affan Ethics 7th 90
Use the isin()
Method With Not (~)
Matching Condition to Filter Pandas Dataframe
The isin()
method behaves like the IN
operator in SQL. We will use the unary operator (~)
to implement the Not IN
operator.
For example, we want to display only those rows that do not contain the Web Design
and Ethics
subjects.
import pandas as pd
student_record = {
"Name": ["Samreena", "Affan", "Mirha", "Asif"],
"Subject": ["SDA", "Ethics", "Web Design", "Web Development"],
"Semester": ["6th", "7th", "5th", "8th"],
"Marks": [100, 90, 80, 70],
}
index_labels = [0, 1, 2, 3]
dataframe = pd.DataFrame(student_record, index=index_labels)
subjects_list = ["Web Design", "Ethics"]
# Applying Not operator
dataframe1 = dataframe[~dataframe.Subject.isin(subjects_list)]
print(dataframe1)
Output:
Name Subject Semester Marks
0 Samreena SDA 6th 100
3 Asif Web Development 8th 70