Pandas Insert Method
This tutorial explains how to use the insert() method of a Pandas DataFrame to insert a column into the DataFrame.
import pandas as pd
countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")
Output:
Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English
We will use the countries_df DataFrame from the example above throughout the rest of this tutorial to demonstrate the insert() method.
pandas.DataFrame.insert() Method in Python
Syntax
DataFrame.insert(loc, column, value, allow_duplicates=False)
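It inserts a column named column into the DataFrame at position loc, with values given by value. The loc argument is a zero-based integer position between 0 and the current number of columns. The method modifies the DataFrame in place and returns None.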
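As a minimal sketch of the syntax above, the following uses a small frame named df with columns A, B and C. These names are illustrative assumptions and are not part of the tutorial data. It shows that insert() changes the DataFrame in place and that the call itself returns None, so its result should not be assigned back to the DataFrame.

```python
import pandas as pd

# Hypothetical two-column frame used only for illustration.
df = pd.DataFrame({"A": [1, 2], "C": [5, 6]})

# insert() works in place; the call itself returns None.
result = df.insert(1, "B", [3, 4])

print(result)            # None
print(list(df.columns))  # ['A', 'B', 'C']
```

Writing df = df.insert(...) would therefore replace the DataFrame with None.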
Insert a Column Having the Same Value for All Rows Using the insert() Method
import pandas as pd
countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")
countries_df.insert(3, "Capital", "Unknown")
print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English

Countries DataFrame after inserting Capital column:
       Country      Continent Primary Language  Capital
0        Nepal           Asia           Nepali  Unknown
1  Switzerland         Europe           French  Unknown
2      Germany         Europe           German  Unknown
3       Canada  North America          English  Unknown
It inserts the Capital column into the countries_df DataFrame at position 3, with every row set to Unknown.
Positions start from 0, so position 3 refers to the 4th column in the DataFrame.
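Because positions are zero-based, it can be easier to compute loc than to count columns by hand. The sketch below is one possible approach and not part of the original example: it looks up the position of an existing column with columns.get_loc() and appends at the end by passing the current number of columns as loc. The Population column and its value 0 are hypothetical placeholders.

```python
import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)

# Insert right after the "Country" column by looking up its position.
position = countries_df.columns.get_loc("Country") + 1
countries_df.insert(position, "Capital", "Unknown")

# A loc equal to the current number of columns appends at the end.
# "Population" is a hypothetical column added only for illustration.
countries_df.insert(len(countries_df.columns), "Population", 0)

print(list(countries_df.columns))
# ['Country', 'Capital', 'Continent', 'Primary Language', 'Population']
```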
Insert a Column in a DataFrame Specifying Value for Each Row
If we want a different value in each row of the inserted column, we can pass a list of values as the value argument of the insert() method. The list must contain one value per row.
import pandas as pd
countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")
capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]
countries_df.insert(2, "Capital", capitals)
print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English

Countries DataFrame after inserting Capital column:
       Country      Continent    Capital Primary Language
0        Nepal           Asia  Kathmandu           Nepali
1  Switzerland         Europe     Zurich           French
2      Germany         Europe     Berlin           German
3       Canada  North America     Ottawa          English
It inserts the Capital column into the countries_df DataFrame at position 2, using the values from the capitals list for each row.
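The value argument is not limited to lists. As a hedged sketch that goes slightly beyond the example above, the code below passes a pandas Series, which is matched to rows by index label, not by position, so rows without a matching label end up as missing values. It also shows that a list of the wrong length raises a ValueError. The shortened frame and the Code column are illustrative assumptions.

```python
import pandas as pd

countries_df = pd.DataFrame(
    {"Country": ["Nepal", "Switzerland", "Germany", "Canada"]}
)

# A Series is aligned on index labels: label 3 goes to Canada, label 0 to Nepal.
capitals = pd.Series(["Ottawa", "Kathmandu"], index=[3, 0])
countries_df.insert(1, "Capital", capitals)
print(countries_df)

# A plain list must have exactly one value per row.
length_error = None
try:
    countries_df.insert(2, "Code", ["NP", "CH"])  # hypothetical column, too short
except ValueError as err:
    length_error = err
    print("ValueError:", err)
```

Rows 1 and 2 have no matching label in the Series, so their Capital values are missing.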
Set allow_duplicates = True in the insert() Method to Add Already Existing Column
import pandas as pd
countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
        "Capital": ["Kathmandu", "Zurich", "Berlin", "Ottawa"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")
capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]
countries_df.insert(4, "Capital", capitals, allow_duplicates=True)
print("Countries DataFrame after inserting Capital column:")
print(countries_df)
Output:
Countries DataFrame:
       Country      Continent Primary Language    Capital
0        Nepal           Asia           Nepali  Kathmandu
1  Switzerland         Europe           French     Zurich
2      Germany         Europe           German     Berlin
3       Canada  North America          English     Ottawa

Countries DataFrame after inserting Capital column:
       Country      Continent Primary Language    Capital    Capital
0        Nepal           Asia           Nepali  Kathmandu  Kathmandu
1  Switzerland         Europe           French     Zurich     Zurich
2      Germany         Europe           German     Berlin     Berlin
3       Canada  North America          English     Ottawa     Ottawa
It adds a second Capital column to the countries_df DataFrame even though a column named Capital already exists in it.
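Duplicate column names change how label-based selection behaves. The following sketch uses a shortened two-row version of the data, an assumption made for brevity. It shows that selecting a duplicated label returns a DataFrame containing every matching column, while position-based selection with iloc still returns a single column.

```python
import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Canada"],
        "Capital": ["Kathmandu", "Ottawa"],
    }
)
countries_df.insert(2, "Capital", ["Kathmandu", "Ottawa"], allow_duplicates=True)

# With a duplicated label, selecting by name returns every matching column.
selected = countries_df["Capital"]
print(type(selected).__name__)  # DataFrame
print(selected.shape)           # (2, 2)

# Position-based selection still picks exactly one column.
last_capital = countries_df.iloc[:, 2]
print(type(last_capital).__name__)  # Series
```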
If we try to insert a column that already exists in the DataFrame without setting allow_duplicates = True in the insert() method, pandas raises an error that names the duplicated column, for example: ValueError: cannot insert Capital, already exists.
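The error case can be reproduced and handled with a try/except block. This is a small sketch on a shortened two-row version of the data (an assumption made for brevity). After the failed call, the DataFrame is left unchanged.

```python
import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Canada"],
        "Capital": ["Kathmandu", "Ottawa"],
    }
)

error_message = None
try:
    # allow_duplicates defaults to False, so an existing name is rejected.
    countries_df.insert(2, "Capital", ["Kathmandu", "Ottawa"])
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)

# The failed insert leaves the columns as they were.
print(list(countries_df.columns))  # ['Country', 'Capital']
```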
Suraj Joshi is a backend software engineer at Matrice.ai.