How to Convert DataFrame Column to Datetime in Pandas
-
Use
pd.to_datetime()
to Convert DataFrame Column toDatetime
-
Use
pd.astype()
to Convert DataFrame Column toDatetime
-
Use
lambda
to Convert DataFrame Column toDatetime
-
Use
apply()
to Convert Multiple DataFrame Columns toDatetime
Sometimes, we have to encounter time series data while dealing with data using the pandas
library in Python. Here, we may need to convert the string or object type column to the datetime
format.
Today’s tutorial demonstrates the use of pd.to_datetime()
, astype()
, lambda
, and apply()
methods to convert a string or object type column to the datetime
format.
Use pd.to_datetime()
to Convert DataFrame Column to Datetime
Example Code:
import pandas as pd
df = pd.DataFrame({"Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"]})
df["converted_to_datetime"] = pd.to_datetime(df["Date"])
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 4 non-null object
1 converted_to_datetime 4 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 192.0+ bytes
None
This output shows that we have two columns, Date
and converted_to_datetime
of type object
and datetime64[ns]
, respectively. How did we convert the object
type to datetime
?
Let’s find out below.
First, we imported a Python library named pandas
and set the alias as pd
to avoid keystrokes. Then, we created a dataframe containing one column named Date
holding dates in the DD/MM/YYYY
format.
Here, we are not required to change the format but convert the data type of the Date
column from object
to datetime
. For that, we use the to_datetime()
method and pass a series to it to convert a dataframe column from object
to datetime
.
The to_datetime()
method can take array-like, series, dataframe, scalar, or dict-like as function arguments and convert the specified argument to the Pandas datetime
object. Here, info()
concisely summarized a dataframe.
Suppose we want dates in a specific format; how can we do it? Let’s practice the same example to change the date format.
Use to_datetime()
With the format
Attribute to Change the Date Format
Example Code:
import pandas as pd
df = pd.DataFrame({"Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"]})
df["converted_to_datetime"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 4 non-null object
1 converted_to_datetime 4 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 192.0+ bytes
None
This code snippet is similar to the last example and produces the same output, but now, we have dates in the MM/DD/YYYY
format using the format
attribute, which is used to specify the date format. We can also specify the time as %H:%M:%S
if we want to.
Use to_datetime()
With the infer_datetime_format
Attribute
Example Code:
import pandas as pd
df = pd.DataFrame({"Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"]})
df["converted_to_datetime"] = pd.to_datetime(df["Date"], infer_datetime_format=True)
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 4 non-null object
1 converted_to_datetime 4 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 192.0+ bytes
None
In the previous example code, we had to use the format
attribute while using the to_datetime()
method to specify the date format but setting infer_datetime_format
to True
helps us to detect the date format automatically and convert the specified dataframe column to datetime
.
Use pd.astype()
to Convert DataFrame Column to Datetime
Example Code:
import pandas as pd
df = pd.DataFrame({"Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"]})
df["converted_to_datetime"] = df["Date"].astype("datetime64[ns]")
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 4 non-null object
1 converted_to_datetime 4 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 192.0+ bytes
None
This code fence uses the series.astype()
method to convert a dataframe column to the datetime
. We can use this function to convert a string column to the datetime
type in the Pandas dataframe.
To convert the data type, we must specify a data type as an argument; in our case, it is 'datetime64[ns]'
.
Use lambda
to Convert DataFrame Column to Datetime
Example Code:
import pandas as pd
from datetime import datetime
df = pd.DataFrame({"Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"]})
df["converted_to_datetime"] = df["Date"].apply(
lambda x: datetime.strptime(x, "%m/%d/%Y")
)
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 4 non-null object
1 converted_to_datetime 4 non-null datetime64[ns]
dtypes: datetime64[ns](1), object(1)
memory usage: 192.0+ bytes
None
In this code snippet, we used the DataFrame.apply()
method with the lambda
function to operate on values. Here, we used the datetime.strptime()
method to convert a dataframe column to datetime
considering the specified date format.
Why use the lambda
expression? It is used in place of a function for simplicity and ease of reading.
Don’t forget to import datetime
before using it.
Then suppose we have inserted
and updated
columns that we want to convert into datetime
. Let’s learn that below.
Use apply()
to Convert Multiple DataFrame Columns to Datetime
Example Code:
import pandas as pd
from datetime import datetime
df = pd.DataFrame(
{
"Insert_Date": ["12/08/2019", "08/01/2020", "11/6/2021", "12/12/2022"],
"Update_Date": ["13/08/2019", "10/01/2020", "15/6/2021", "20/12/2022"],
}
)
df[["Insert_DateTime", "Update_DateTime"]] = df[["Insert_Date", "Update_Date"]].apply(
pd.to_datetime, errors="coerce", infer_datetime_format=True
)
print(df.info())
OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Insert_Date 4 non-null object
1 Update_Date 4 non-null object
2 Insert_DateTime 4 non-null datetime64[ns]
3 Update_DateTime 4 non-null datetime64[ns]
dtypes: datetime64[ns](2), object(2)
memory usage: 256.0+ bytes
None
Here, we are using the apply()
method to convert Insert_Date
and Update_Date
to datetime
in separate columns (Insert_DateTime
and Update_DateTime
). The apply()
method takes two parameters.
The first parameter is the function name we want to apply; in this case, it is pd.to_datetime
. The second parameter is infer_datetime_format
, which we set to True
to get a consistent date format.
Note that we can also use astype()
to cast multiple columns.