Pandas DataFrame.replace() Function
pandas.DataFrame.replace() replaces values in a DataFrame with other values, which may be a string, regex, list, dictionary, Series, or a number.
Syntax of pandas.DataFrame.replace():
DataFrame.replace(to_replace=None,
                  value=None,
                  inplace=False,
                  limit=None,
                  regex=False,
                  method='pad')
Parameters
to_replace | string, regex, list, dictionary, Series, numeric, or None. Values in the DataFrame that need to be replaced
value | scalar, dict, list, string, regex, or None. Value that replaces any values matching to_replace
inplace | Boolean. If True, modify the caller DataFrame
limit | Integer. Maximum size gap to forward or backward fill
regex | Boolean. Set regex to True if to_replace and/or value is a regex
method | Method used for replacement when to_replace is a scalar, list, or tuple and value is None, such as 'pad', 'ffill', or 'bfill'
Return
It returns a DataFrame in which all the specified values are replaced by the given value. With the default inplace=False, the caller DataFrame is left unchanged.
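The difference between the default call and inplace=True can be checked with a short sketch. The frame and the variable name caller_after_default_call are illustrative only and do not come from the original examples.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Default inplace=False: the result is a new DataFrame.
replaced_df = df.replace(1, 5)
caller_after_default_call = df["X"].tolist()
print(replaced_df["X"].tolist())   # values in the returned DataFrame
print(caller_after_default_call)   # the caller still holds the old values

# inplace=True: the caller itself is modified.
df.replace(1, 5, inplace=True)
print(df["X"].tolist())
```

Keeping the default is usually the safer choice while exploring data, since the original frame stays available for comparison.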
Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()
import pandas as pd

# Build a small DataFrame with two integer columns
df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Replace every 1 in the DataFrame with 5
replaced_df = df.replace(1, 5)
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  5  4
1  2  5
2  3  8
Here, 1 is the to_replace argument and 5 is the value argument of the replace() method. Hence, all entries with the value 1 are replaced by 5 in df.
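As a minimal sketch of two variations on this call: the keyword form gives the same result as the positional form, and calling replace() on a single column (a Series) restricts the replacement to that column. The names by_position, by_keyword, and only_x are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Positional and keyword forms are equivalent.
by_position = df.replace(1, 5)
by_keyword = df.replace(to_replace=1, value=5)
print(by_position.equals(by_keyword))

# Replacing on one column only: work on a copy and assign the
# result of Series.replace() back to that column.
only_x = df.copy()
only_x["X"] = only_x["X"].replace(1, 5)
print(only_x)
```

In the second case the 1 in column Y is left as it is, because only column X was passed through replace().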
Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()
Replace Using Lists
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Values are paired by position: 1 -> 1, 2 -> 4, 3 -> 9
replaced_df = df.replace([1, 2, 3], [1, 4, 9])
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  1  4
1  4  1
2  9  8
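Here, [1,2,3] is the to_replace argument and [1,4,9] is the value argument of the replace() method. The two lists are paired by position, so every 1 in df is replaced by 1, every 2 by 4, and every 3 by 9. The replacement applies to matching values in all columns, not to a particular column: the 1 in column Y is also matched, but since it maps to 1, it appears unchanged.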
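The position-wise pairing can be illustrated by comparing the list form with an explicit mapping. This is a small sketch; the variable names with_lists, with_dict, and to_zero are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Lists are paired by position: 1 -> 1, 2 -> 4, 3 -> 9.
with_lists = df.replace([1, 2, 3], [1, 4, 9])
# The same mapping written out as a dictionary.
with_dict = df.replace({1: 1, 2: 4, 3: 9})
print(with_lists.equals(with_dict))

# A list of targets with a single scalar maps every target to that scalar.
to_zero = df.replace([1, 2, 3], 0)
print(to_zero)
```

When both arguments are lists, they should have the same length so that each value to replace has a partner.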
Replace Using Dictionaries
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [3, 1, 8]})
print("Before Replacement")
print(df)
# Keys are the values to find, dictionary values are their replacements
replaced_df = df.replace({1: 10, 3: 30})
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
   X  Y
0  1  3
1  2  1
2  3  8
After Replacement
    X   Y
0  10  30
1   2  10
2  30   8
It replaces all elements with the value 1 by 10 and all elements with the value 3 by 30.
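A dictionary can also be nested to limit each mapping to one column. The sketch below assumes the same frame as above; the outer keys are column names and the inner dictionaries map old values to new ones.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [3, 1, 8]})

# In column X replace 1 with 10; in column Y replace 3 with 30.
# The 3 in column X and the 1 in column Y are not touched.
replaced_df = df.replace({"X": {1: 10}, "Y": {3: 30}})
print(replaced_df)
```

Compared with the flat dictionary, this form is useful when the same value means different things in different columns.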
Replace Using Regex
import pandas as pd

df = pd.DataFrame({'X': ["zeppy", "amid", "amily"],
                   'Y': ["xar", "abc", "among"]})
print("Before Replacement")
print(df)
# Replace strings made of "ami" plus exactly one more character, modifying df itself
df.replace(to_replace=r'^ami.$', value='song', regex=True, inplace=True)
print("After Replacement")
print(df)
Output:
Before Replacement
       X      Y
0  zeppy    xar
1   amid    abc
2  amily  among
After Replacement
       X      Y
0  zeppy    xar
1   song    abc
2  amily  among
It replaces every element that consists of ami followed by exactly one character with song. Here, only amid satisfies the given regex, and hence only amid is replaced by song. Although amily also starts with ami, two characters follow ami, so amily does not satisfy the regex and is not replaced. If you are using a regex, make sure regex is set to True. Setting inplace=True makes the replace() method modify the original DataFrame directly.
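To see why only amid matches, the pattern can first be tested with Python's re module and then passed to replace(). This is a sketch under the same data as above; the names pattern, matches, replaced_df, and literal_df are illustrative.

```python
import re
import pandas as pd

pattern = r"^ami.$"

# Plain-Python check of which strings the pattern matches.
words = ["zeppy", "amid", "amily", "among"]
matches = {word: bool(re.search(pattern, word)) for word in words}
print(matches)

df = pd.DataFrame({"X": ["zeppy", "amid", "amily"],
                   "Y": ["xar", "abc", "among"]})

# With regex=True the pattern is applied to each string element.
replaced_df = df.replace(to_replace=pattern, value="song", regex=True)
print(replaced_df)

# Without regex=True the pattern is compared as a literal string,
# so no element equals "^ami.$" and nothing is replaced.
literal_df = df.replace(to_replace=pattern, value="song")
print(literal_df.equals(df))
```

Since inplace is left at its default here, df keeps its original values and the results are held in replaced_df and literal_df.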
Suraj Joshi is a backend software engineer at Matrice.ai.