How to Convert String to Numeric Type in Pandas
- pandas.to_numeric() Method
- Convert String Values of Pandas DataFrame to Numeric Type Using the pandas.to_numeric() Method
- Convert String Values of Pandas DataFrame to Numeric Type With Other Characters in It
This tutorial explains how to convert string values in a Pandas DataFrame to a numeric type using the pandas.to_numeric() method.
import pandas as pd

items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["300", "400", "350", "100", "1000", "400"],
    }
)
print(items_df)
Output:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400
We will use this DataFrame to demonstrate how to convert its string values to a numeric type.
pandas.to_numeric() Method
Syntax
pandas.to_numeric(arg, errors="raise", downcast=None)
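It converts the argument passed as arg to a numeric type. By default, the result is int64 or float64, depending on the data. The errors parameter controls what happens when a value cannot be parsed: "raise" (the default) raises an exception, while "coerce" replaces the value with NaN. Setting the downcast parameter to "integer", "signed", "unsigned", or "float" downcasts the result to the smallest numeric dtype that can hold the values.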
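As a minimal sketch of how the errors and downcast parameters behave, the following example converts two small Series. The sample values and the variable names raw, coerced, and small are made up for illustration and are not part of the tutorial's DataFrame.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
raw = pd.Series(["1", "2", "three", "4.5"])

# errors="coerce" replaces strings that cannot be parsed with NaN
# instead of raising a ValueError
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced)
print(coerced.dtype)

# downcast="integer" picks the smallest signed integer dtype
# that can hold every value
small = pd.to_numeric(pd.Series(["1", "2", "3"]), downcast="integer")
print(small.dtype)
```

Because a NaN is present, the coerced result is stored as a floating-point Series, while the downcast result uses a smaller integer dtype than the default int64.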
Convert String Values of Pandas DataFrame to Numeric Type Using the pandas.to_numeric() Method
import pandas as pd

items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["300", "400", "350", "100", "1000", "400"],
    }
)

print("The items DataFrame is:")
print(items_df, "\n")

print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")

# Convert the string values of the Cost column to numbers
items_df["Cost"] = pd.to_numeric(items_df["Cost"])

print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes)
Output:
The items DataFrame is:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400

Datatype of Cost column before type conversion:
object

Datatype of Cost column after type conversion:
int64
It converts the data type of the Cost column of items_df from object to int64.
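When more than one column holds numeric strings, the same conversion can be applied to all of them in one step. The sketch below assumes a hypothetical Stock column that is not part of the tutorial's DataFrame and uses DataFrame.apply() to call pd.to_numeric() on each selected column.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Watch", "Camera"],
        "Cost": ["300", "400"],
        "Stock": ["12", "7"],  # hypothetical column, for illustration only
    }
)

# Columns assumed to contain only numeric strings
numeric_cols = ["Cost", "Stock"]

# apply() passes each selected column to pd.to_numeric() in turn
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df.dtypes)

# Arithmetic now works on numbers rather than concatenating strings
total_cost = df["Cost"].sum()
print(total_cost)
```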
Convert String Values of Pandas DataFrame to Numeric Type With Other Characters in It
If a column contains values with non-numeric characters, such as a currency symbol, pandas.to_numeric() raises ValueError: Unable to parse string. In such cases, we can remove the non-numeric characters first and then perform the type conversion.
import pandas as pd

items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["$300", "$400", "$350", "$100", "$1000", "$400"],
    }
)

print("The items DataFrame is:")
print(items_df, "\n")

print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")

# regex=False treats "$" as a literal character, not as the regex end-of-string anchor
items_df["Cost"] = pd.to_numeric(items_df["Cost"].str.replace("$", "", regex=False))

print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes, "\n")

print("DataFrame after Type Conversion:")
print(items_df)
Output:
The items DataFrame is:
    Id    Name   Cost
0  302   Watch   $300
1  504  Camera   $400
2  708   Phone   $350
3  103   Shoes   $100
4  343  Laptop  $1000
5  565     Bed   $400

Datatype of Cost column before type conversion:
object

Datatype of Cost column after type conversion:
int64

DataFrame after Type Conversion:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400
It removes the $ character from the values of the Cost column and then converts these values into the numeric type using the pandas.to_numeric() method.
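Real data often contains more than one kind of stray character, for example thousands separators or placeholder text. One possible approach, sketched below with made-up values, is to strip everything except digits, the decimal point, and the minus sign with a regular expression, and then pass errors="coerce" so that anything still unparseable becomes NaN. The names prices, cleaned, and numbers are illustrative only.

```python
import pandas as pd

# Hypothetical price strings, for illustration only
prices = pd.Series(["$300", "$1,000.50", "N/A"])

# Keep only digits, the decimal point, and the minus sign
cleaned = prices.str.replace(r"[^0-9.\-]", "", regex=True)

# Anything that is still not a valid number becomes NaN
numbers = pd.to_numeric(cleaned, errors="coerce")

print(numbers)
```

This pattern assumes a dot is the decimal separator and a comma is a thousands separator; data that follows a different convention would need a different pattern.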
Suraj Joshi is a backend software engineer at Matrice.ai.