How to Convert String to Numeric Type in Pandas
This tutorial explains how to convert the string values of a Pandas DataFrame column to a numeric type using the pandas.to_numeric() method.
import pandas as pd
items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["300", "400", "350", "100", "1000", "400"],
    }
)
print(items_df)
Output:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400
We will use this DataFrame, whose Cost column stores its numbers as strings, to demonstrate how to convert DataFrame values to a numeric type.
pandas.to_numeric() Method
Syntax
pandas.to_numeric(arg, errors="raise", downcast=None)
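It converts the argument passed as arg (a scalar, list, tuple, 1-D array, or Series) to a numeric type. By default, the result has the dtype int64 or float64, depending on the data. The errors parameter controls what happens when a value cannot be parsed: "raise" (the default) raises an exception, while "coerce" replaces the value with NaN. Setting downcast to "integer", "signed", "unsigned", or "float" asks pandas to use the smallest numeric dtype of that kind that can hold the data.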
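As a minimal sketch of how the errors and downcast parameters behave, the snippet below uses small made-up Series (the names raw, coerced, and small are illustrative and not part of the tutorial's DataFrame):

```python
import pandas as pd

# Hypothetical sample data: the third entry is not a valid number.
raw = pd.Series(["1", "2", "oops", "4"], dtype="object")

# errors="coerce" replaces unparseable entries with NaN instead of raising.
coerced = pd.to_numeric(raw, errors="coerce")
print(coerced.dtype)  # float64, because NaN needs a floating-point dtype

# downcast="integer" picks the smallest signed integer dtype that fits the data.
small = pd.to_numeric(pd.Series(["1", "2", "3"], dtype="object"), downcast="integer")
print(small.dtype)  # int8
```

Coercing is convenient for messy data, but it silently hides bad values, so it is usually worth checking how many NaN entries the result contains.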
Convert String Values of Pandas DataFrame to Numeric Type Using the pandas.to_numeric() Method
    
import pandas as pd
items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["300", "400", "350", "100", "1000", "400"],
    }
)
print("The items DataFrame is:")
print(items_df, "\n")
print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")
items_df["Cost"] = pd.to_numeric(items_df["Cost"])
print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes)
Output:
The items DataFrame is:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400 
Datatype of Cost column before type conversion:
object 
Datatype of Cost column after type conversion:
int64
The pd.to_numeric() call changes the data type of the Cost column of items_df from object to int64.
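When several columns hold numeric strings, one possible approach is to pass pd.to_numeric to DataFrame.apply so that each selected column is converted in turn. The sketch below assumes a hypothetical Weight column that does not appear in the tutorial's data:

```python
import pandas as pd

# Hypothetical variant of items_df with two string-valued columns.
df = pd.DataFrame(
    {
        "Name": ["Watch", "Camera"],
        "Cost": ["300", "400"],
        "Weight": ["0.5", "1.25"],  # assumed column, for illustration only
    }
)

# Convert only the columns that are expected to hold numbers.
numeric_cols = ["Cost", "Weight"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

print(df.dtypes)
```

Each column gets its own result type: whole numbers become int64, while values with a decimal point become float64.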
Convert String Values of Pandas DataFrame to Numeric Type With Other Characters in It
If the values of a column contain non-numeric characters, such as a currency symbol, pd.to_numeric() raises ValueError: Unable to parse string. In such cases, we can remove the non-numeric characters first and then perform the type conversion.
import pandas as pd
items_df = pd.DataFrame(
    {
        "Id": [302, 504, 708, 103, 343, 565],
        "Name": ["Watch", "Camera", "Phone", "Shoes", "Laptop", "Bed"],
        "Cost": ["$300", "$400", "$350", "$100", "$1000", "$400"],
    }
)
print("The items DataFrame is:")
print(items_df, "\n")
print("Datatype of Cost column before type conversion:")
print(items_df["Cost"].dtypes, "\n")
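# regex=False treats "$" as a literal character rather than a regex end-of-string anchor.
items_df["Cost"] = pd.to_numeric(items_df["Cost"].str.replace("$", "", regex=False))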
print("Datatype of Cost column after type conversion:")
print(items_df["Cost"].dtypes, "\n")
print("DataFrame after Type Conversion:")
print(items_df)
Output:
The items DataFrame is:
    Id    Name   Cost
0  302   Watch   $300
1  504  Camera   $400
2  708   Phone   $350
3  103   Shoes   $100
4  343  Laptop  $1000
5  565     Bed   $400 
Datatype of Cost column before type conversion:
object 
Datatype of Cost column after type conversion:
int64 
DataFrame after Type Conversion:
    Id    Name  Cost
0  302   Watch   300
1  504  Camera   400
2  708   Phone   350
3  103   Shoes   100
4  343  Laptop  1000
5  565     Bed   400
It removes the $ character from each value in the Cost column and then converts the cleaned values to a numeric type using the pandas.to_numeric() method.
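Real-world columns often contain more than one kind of stray character. As a rough sketch, assuming the values may also include thousands separators and placeholder text such as "N/A" (sample data invented for illustration), a regular-expression character class can strip several characters at once and errors="coerce" can absorb whatever is left:

```python
import pandas as pd

# Hypothetical cost values with a currency symbol, a comma, and a placeholder.
costs = pd.Series(["$300", "$1,000", "N/A"], dtype="object")

# The character class [$,] matches a literal "$" or "," anywhere in the string.
cleaned = costs.str.replace(r"[$,]", "", regex=True)

# "N/A" still cannot be parsed, so coerce it to NaN instead of raising.
numeric = pd.to_numeric(cleaned, errors="coerce")
print(numeric)
```

Because of the NaN entry, the resulting Series has a floating-point dtype rather than int64.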
Suraj Joshi is a backend software engineer at Matrice.ai.