Pandas DataFrame.replace() Function
pandas.DataFrame.replace() replaces values in a DataFrame with other values. The values to replace can be given as a string, regex, list, dictionary, Series, or number.
Syntax of pandas.DataFrame.replace():
DataFrame.replace(to_replace=None,
                  value=None,
                  inplace=False,
                  limit=None,
                  regex=False,
                  method='pad')
Parameters
to_replace | string, regex, list, dictionary, Series, numeric, or None. Values in the DataFrame that need to be replaced.
value | scalar, dict, list, string, regex, or None. Value that replaces any value matching to_replace.
inplace | Boolean. If True, modify the caller DataFrame.
limit | Integer. Maximum size gap to forward or backward fill (deprecated since pandas 2.1).
regex | Boolean. Set regex to True if to_replace and/or value is a regex.
method | String. Fill method ('pad', 'ffill', or 'bfill') used when to_replace is a scalar, list, or tuple and value is None (deprecated since pandas 2.1).
Return
It returns a new DataFrame in which every value matching to_replace has been replaced by value. The original DataFrame is modified only if inplace=True.
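The effect of the inplace parameter on the caller can be checked with a minimal sketch like the one below. The variable name original_x is only illustrative and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Default inplace=False: the result is a new DataFrame, df is left as it was.
replaced_df = df.replace(1, 5)
original_x = df["X"].tolist()  # snapshot of the caller after the call
print(original_x)
print(replaced_df["X"].tolist())

# inplace=True: df itself is modified.
df.replace(1, 5, inplace=True)
print(df["X"].tolist())
```

The first call leaves df as it was, while the second call changes df directly.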
Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)

# Replace every 1 in the DataFrame with 5
replaced_df = df.replace(1, 5)
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
X Y
0 1 4
1 2 1
2 3 8
After Replacement
X Y
0 5 4
1 2 5
2 3 8
Here, 1 is passed as the to_replace parameter and 5 as the value parameter of the replace() method. Hence every entry equal to 1 is replaced by 5 in replaced_df, while df itself remains unchanged.
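The same call can be written with explicit keyword arguments, which makes the roles of the two values visible. This is a small sketch; the names positional and keyword are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Positional form: first argument is to_replace, second is value.
positional = df.replace(1, 5)

# Keyword form of the same call.
keyword = df.replace(to_replace=1, value=5)

# Both calls should produce identical DataFrames.
print(positional.equals(keyword))
```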
Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()
Replace Using Lists
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)

# Replace 1 with 1, 2 with 4, and 3 with 9
replaced_df = df.replace([1, 2, 3], [1, 4, 9])
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
X Y
0 1 4
1 2 1
2 3 8
After Replacement
X Y
0 1 4
1 4 1
2 9 8
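Here, [1,2,3] is the to_replace parameter and [1,4,9] is the value parameter of the replace() method. The two lists are matched element by element, so across the whole DataFrame 1 is replaced by 1, 2 by 4, and 3 by 9. The values 4 and 8 in column Y do not appear in to_replace and stay unchanged.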
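To make the pairwise matching concrete, the sketch below compares the list form with the equivalent dictionary and also shows a list of values replaced by a single scalar. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# List to list: values are paired by position (1 -> 1, 2 -> 4, 3 -> 9).
pairwise = df.replace([1, 2, 3], [1, 4, 9])

# The same mapping written as a dictionary.
as_dict = df.replace({1: 1, 2: 4, 3: 9})

# List to scalar: every listed value is replaced by the same value.
all_zero = df.replace([1, 2, 3], 0)

print(pairwise)
print(as_dict)
print(all_zero)
```

When both arguments are lists, they need to have the same length so that each value in to_replace has a partner in value.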
Replace Using Dictionaries
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [3, 1, 8]})
print("Before Replacement")
print(df)

# Keys are the values to find, dictionary values are their replacements
replaced_df = df.replace({1: 10, 3: 30})
print("After Replacement")
print(replaced_df)
Output:
Before Replacement
X Y
0 1 3
1 2 1
2 3 8
After Replacement
X Y
0 10 30
1 2 10
2 30 8
It replaces every element with value 1 by 10 and every element with value 3 by 30, in all columns.
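A dictionary can also restrict the replacement to particular columns. The sketch below uses the same sample data; the names only_x and per_column are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [3, 1, 8]})

# Nested dictionary: the outer key is a column name, the inner
# dictionary maps old values to new values in that column only.
only_x = df.replace({"X": {1: 10, 3: 30}})

# Flat dictionary plus a scalar value: look for 1 in column X and
# for 3 in column Y, and replace both with 0.
per_column = df.replace({"X": 1, "Y": 3}, 0)

print(only_x)
print(per_column)
```

In only_x, column Y keeps its original 3 and 1, because the mapping applies only to column X.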
Replace Using Regex
import pandas as pd

df = pd.DataFrame({'X': ["zeppy", "amid", "amily"],
                   'Y': ["xar", "abc", "among"]})
print("Before Replacement")
print(df)

# Replace strings matching the regex with 'song', modifying df directly
df.replace(to_replace=r'^ami.$', value='song', regex=True, inplace=True)
print("After Replacement")
print(df)
Output:
Before Replacement
X Y
0 zeppy xar
1 amid abc
2 amily among
After Replacement
X Y
0 zeppy xar
1 song abc
2 amily among
The pattern ^ami.$ matches strings that consist of ami followed by exactly one more character, and each match is replaced with song. Here only amid satisfies the regex, so only amid is replaced. amily also begins with ami, but two characters follow it, so it does not match and remains unchanged.
When passing a regex, make sure regex is set to True. Setting inplace=True modifies the original DataFrame directly.
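The role of regex=True can be seen by running the same pattern with and without it, and by loosening the pattern so that it also matches amily. This is a sketch on the same sample data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"X": ["zeppy", "amid", "amily"],
                   "Y": ["xar", "abc", "among"]})

# Without regex=True the pattern is compared as a literal string,
# so nothing in the DataFrame matches it.
literal = df.replace(r"^ami.$", "song")

# With regex=True: "ami" followed by exactly one character.
one_char = df.replace(r"^ami.$", "song", regex=True)

# Looser pattern: "ami" followed by any number of characters.
any_tail = df.replace(r"^ami.*$", "song", regex=True)

print(literal)
print(one_char)
print(any_tail)
```

among is untouched in every case because it starts with amo rather than ami.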
Suraj Joshi is a backend software engineer at Matrice.ai.