How to Replace Column Values in Pandas DataFrame
- Use the map() Method to Replace Column Values in Pandas
- Use the loc Method to Replace Column’s Value in Pandas
- Replace Column Values With Conditions in Pandas DataFrame
- Use the replace() Method to Modify Values
In this tutorial, we will show how to replace column values in a Pandas DataFrame. We will cover three approaches: the Series.map() method, the DataFrame.loc indexer, and the replace() method.
Use the map() Method to Replace Column Values in Pandas
A DataFrame’s columns are Pandas Series. We can use the Series.map() method to replace each value in a column with another value.
Series.map() Syntax
Series.map(arg, na_action=None)
- Parameters:
  - arg: the mapping to apply to the Series. It can be a collection (such as a dictionary or another Series) or a function.
  - na_action: controls how NaN (Not a Number) values are handled. It accepts two values, None or "ignore". None is the default, and map() applies the mapping to all values, including NaN values; "ignore" leaves NaN values as they are in the column without passing them to the mapping.
It returns a Series with the same index as the original.
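As a small illustration of the return value described above, the sketch below maps a Series that has a custom index. The Series `s` and its index labels are made up for this example and are not part of the tutorial's data.

```python
import pandas as pd

# Hypothetical Series with a non-default index, used only for illustration
s = pd.Series(["berlin", "paris"], index=[10, 20])

# map() returns a new Series that keeps the index of the original
upper = s.map(str.upper)

print(upper)  # mapped values, still labelled 10 and 20
print(s)      # the original Series is left unchanged
```

Because map() does not modify the Series in place, the result has to be assigned back (for example `df["city"] = df["city"].map(...)`) for the change to show up in the DataFrame.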
Now let’s look at an example that uses the map() method. We will use the same DataFrame in the examples below.
import pandas as pd
import numpy as np
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])
print(df)
Output:
name city
0 michael berlin
1 louis paris
2 jack roma
3 jasmine NaN
Replace Column Values With Collection in Pandas DataFrame
import pandas as pd
import numpy as np
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])
# replace column values with collection
df["city"] = df["city"].map(
    {"berlin": "dubai", "paris": "moscow", "roma": "milan", np.nan: "NY"},
    na_action=None,
)
print(df)
Output:
name city
0 michael dubai
1 louis moscow
2 jack milan
3 jasmine NY
The values in the original city column are replaced with the corresponding values from the dictionary passed as the first argument to the map() method.
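One detail worth keeping in mind with a dictionary: values that have no matching key are mapped to NaN rather than kept. The sketch below shows this with a deliberately incomplete dictionary; the names `cities`, `partial`, `mapped`, and `kept` are chosen for this illustration only, and filling the gaps with fillna() is just one possible way to keep the unmatched values.

```python
import pandas as pd

cities = pd.Series(["berlin", "paris", "roma"])

# Hypothetical, deliberately incomplete mapping: only "berlin" has a key
partial = {"berlin": "dubai"}

# Values without a matching key become NaN
mapped = cities.map(partial)

# One way to keep the original value where no key matched
kept = cities.map(partial).fillna(cities)

print(mapped)
print(kept)
```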
Replace Column Values With Function in Pandas DataFrame
import pandas as pd
import numpy as np
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])
# replace column values with function
df["city"] = df["city"].map("I am from {}".format)
print(df)
Output:
name city
0 michael I am from berlin
1 louis I am from paris
2 jack I am from roma
3 jasmine I am from nan
na_action is None by default, which is why the NaN in the original column is also passed to the function and replaced with the string I am from nan.
If you prefer to leave NaN values untouched, set na_action to "ignore".
import pandas as pd
import numpy as np
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])
# replace column values excluding NaN
df["city"] = df["city"].map("I am from {}".format, na_action="ignore")
print(df)
Output:
name city
0 michael I am from berlin
1 louis I am from paris
2 jack I am from roma
3 jasmine NaN
Use the loc Method to Replace Column’s Value in Pandas
Another way to replace a Pandas DataFrame column’s values is the loc indexer of the DataFrame. loc accesses values through their labels or through a boolean condition, and it is used with square brackets rather than called like a method.
DataFrame.loc[] Syntax
pandas.DataFrame.loc[condition, column_label] = new_value
- Parameters:
  - condition: a boolean condition that selects the rows for which it is true
  - column_label: the label of the column to update
The values selected by these two parameters are then set to new_value.
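Since loc works with labels, the row selector does not have to be a condition; a plain row label works too. The following is a minimal sketch with a made-up two-row DataFrame indexed by name (not the tutorial's data), assuming we only want to change one cell.

```python
import pandas as pd

# Hypothetical DataFrame indexed by name, used only for illustration
df = pd.DataFrame({"grades": [30, 70]}, index=["michael", "louis"])

# Select a single cell by row label and column label, then assign to it
df.loc["michael", "grades"] = 35

print(df)
```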
Now let’s look at an example that uses loc. We will use the DataFrame below as the example.
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "grades": [30, 70, 40, 80],
    "result": ["N/A", "N/A", "N/A", "N/A"],
}
df = pd.DataFrame(data, columns=["name", "grades", "result"])
print(df)
Output:
name grades result
0 michael 30 N/A
1 louis 70 N/A
2 jack 40 N/A
3 jasmine 80 N/A
Replace Column Values With Conditions in Pandas DataFrame
We can use boolean conditions to specify the targeted elements.
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "grades": [30, 70, 40, 80],
    "result": ["N/A", "N/A", "N/A", "N/A"],
}
df = pd.DataFrame(data, columns=["name", "grades", "result"])
df.loc[df.grades > 50, "result"] = "success"
df.loc[df.grades < 50, "result"] = "fail"
print(df)
Output:
name grades result
0 michael 30 fail
1 louis 70 success
2 jack 40 fail
3 jasmine 80 success
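df.loc[df.grades > 50, "result"] = "success" sets the value in the result column to success for the rows where the value in the grades column is greater than 50.
df.loc[df.grades < 50, "result"] = "fail" sets the value in the result column to fail for the rows where the value in the grades column is smaller than 50.
Note that a grade of exactly 50 matches neither condition, so such a row would keep N/A.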
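The two conditions in the example use > 50 and < 50, so a grade of exactly 50 would be matched by neither. A possible variant, sketched below, builds one boolean mask and uses its negation so that every row falls on one side. The extra row for "anna" and the choice of treating 50 as a pass are assumptions made for this illustration.

```python
import pandas as pd

# Tutorial data plus a hypothetical fifth row with a grade of exactly 50
df = pd.DataFrame(
    {
        "name": ["michael", "louis", "jack", "jasmine", "anna"],
        "grades": [30, 70, 40, 80, 50],
    }
)
df["result"] = "N/A"

# Assumption for this sketch: 50 and above counts as a pass
passed = df["grades"] >= 50

df.loc[passed, "result"] = "success"
df.loc[~passed, "result"] = "fail"  # ~ negates the mask, covering all other rows

print(df)
```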
Use the replace() Method to Modify Values
Another way to replace column values in a Pandas DataFrame is the Series.replace() method.
Series.replace() Syntax
- Replace one single value
df[column_name].replace([old_value], new_value)
- Replace multiple values with the same value
df[column_name].replace([old_value1, old_value2, old_value3], new_value)
- Replace multiple values with multiple values
df[column_name].replace(
[old_value1, old_value2, old_value3], [new_value1, new_value2, new_value3]
)
- Replace a value with a new value for the entire DataFrame
df.replace([old_value], new_value)
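Besides the list forms above, replace() also accepts a dictionary that pairs each old value with its new value. The sketch below uses a small Series named `names`, made up for this illustration, and shows that values without an entry in the dictionary are left as they are.

```python
import pandas as pd

names = pd.Series(["michael", "louis", "jack"])

# Dictionary form: each key is an old value, each value is its replacement
renamed = names.replace({"michael": "karl", "louis": "lionel"})

print(renamed)
```

This differs from map() with a dictionary, where values that have no matching key become NaN.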
We will use the DataFrame below for the rest of the examples.
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}
df = pd.DataFrame(data, columns=["name", "salary"])
print(df)
Output:
name salary
0 michael 700
1 louis 800
2 jack 1000
3 jasmine 1200
Replace Column Values With Multiple Values in Pandas DataFrame
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}
df = pd.DataFrame(data, columns=["name", "salary"])
df["name"] = df["name"].replace(["michael", "louis"], ["karl", "lionel"])
print(df)
Output:
name salary
0 karl 700
1 lionel 800
2 jack 1000
3 jasmine 1200
Replace Multiple Column Values With the Same Value in Pandas DataFrame
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}
df = pd.DataFrame(data, columns=["name", "salary"])
df["salary"] = df["salary"].replace([1000, 1200], 1500)
print(df)
Output:
name salary
0 michael 700
1 louis 800
2 jack 1500
3 jasmine 1500
Replace Column Value With One Value in Pandas DataFrame
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}
df = pd.DataFrame(data, columns=["name", "salary"])
df["salary"] = df["salary"].replace([700], 750)
print(df)
Output:
name salary
0 michael 750
1 louis 800
2 jack 1000
3 jasmine 1200
Replace Values in the Entire Pandas DataFrame
import pandas as pd
data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1000],
}
df = pd.DataFrame(data, columns=["name", "salary"])
df = df.replace([1000], 1400)
print(df)
Output:
name salary
0 michael 700
1 louis 800
2 jack 1400
3 jasmine 1400
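Calling replace() on the whole DataFrame changes matching values in every column. If only some columns should be touched, a nested dictionary of the form {column: {old_value: new_value}} can be used instead. The sketch below assumes a hypothetical bonus column, added here only to make the difference visible.

```python
import pandas as pd

# Hypothetical DataFrame in which 1000 appears in two different columns
df = pd.DataFrame({"salary": [700, 1000], "bonus": [1000, 50]})

# Replaces 1000 wherever it appears, in any column
everywhere = df.replace(1000, 1400)

# Replaces 1000 only in the salary column
only_salary = df.replace({"salary": {1000: 1400}})

print(everywhere)
print(only_salary)
```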