Pandas DataFrame DataFrame.interpolate() Function
- Syntax of pandas.DataFrame.interpolate()
- Example Codes: Interpolate All NaN Values in DataFrame With DataFrame.interpolate() Method
- Example Codes: DataFrame.interpolate() Method With the method Parameter
- Example Codes: Pandas DataFrame.interpolate() Method With the axis Parameter to Interpolate Along the Row Axis
- Example Codes: DataFrame.interpolate() Method With limit Parameter
- Example Codes: DataFrame.interpolate() Method With limit_direction Parameter
- Interpolate Time-Series Data With DataFrame.interpolate() Method
The Python Pandas DataFrame.interpolate() function fills NaN values in a DataFrame by interpolating between the surrounding valid values.
Syntax of pandas.DataFrame.interpolate():
DataFrame.interpolate(
method="linear",
axis=0,
limit=None,
inplace=False,
limit_direction="forward",
limit_area=None,
downcast=None,
**kwargs
)
Parameters
method | linear, time, index, values, nearest, zero, slinear, quadratic, cubic, barycentric, krogh, polynomial, spline, piecewise_polynomial, from_derivatives, pchip, or akima. Method used for interpolating NaN values.
axis | Axis to interpolate along. axis=0 interpolates down each column (along the index); axis=1 interpolates across each row (along the columns).
limit | Integer. Maximum number of consecutive NaNs to be interpolated.
inplace | Boolean. If True, modify the caller DataFrame in place.
limit_direction | forward, backward, or both. Direction in which consecutive NaNs are filled.
limit_area | None, inside, or outside. inside fills only NaNs surrounded by valid values, outside fills only NaNs outside the valid values, and None applies no restriction.
downcast | Dictionary. Specifies downcast of datatypes.
**kwargs | Keyword arguments passed on to the interpolating function.
Return
If inplace is False (the default), a DataFrame with the NaN values interpolated using the given method. If inplace is True, the caller is modified in place and None is returned.
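As a minimal sketch of the difference between the default call and inplace=True, the snippet below uses a small made-up single-column DataFrame (the column name X and its values are illustrative assumptions, not data from this article). It only checks which object ends up holding the filled values.

```python
import pandas as pd

# Illustrative data: one gap between 1.0 and 3.0.
df = pd.DataFrame({"X": [1.0, None, 3.0]})

# Default call: a new DataFrame comes back and df keeps its NaN.
filled = df.interpolate()
missing_before = int(df["X"].isna().sum())
print(filled["X"].tolist())
print(missing_before)

# inplace=True: df itself is updated, so no new object is needed.
df.interpolate(inplace=True)
print(df["X"].tolist())
```

The default form is usually easier to reason about, because the original data stays available for comparison.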
Example Codes: Interpolate All NaN Values in DataFrame With DataFrame.interpolate() Method
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)
filled_df = df.interpolate()
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X    Y
0  1.0  4.0
1  2.0  6.0
2  3.0  8.0
3  3.0  5.5
4  3.0  3.0
It interpolates all the NaN values in the DataFrame using the linear interpolation method, which is the default.
Unlike pandas.DataFrame.fillna(), which is typically used to replace every NaN with one fixed value, interpolate() estimates each missing value from its neighboring valid values.
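To make the contrast with fillna() concrete, here is a small sketch on a made-up Series (the values 10 and 40 and the fill value 0 are assumptions chosen for illustration).

```python
import pandas as pd

# Illustrative data: two consecutive gaps between 10 and 40.
s = pd.Series([10.0, None, None, 40.0])

# fillna: every gap receives the same constant.
constant = s.fillna(0)

# interpolate: gaps are placed on the straight line from 10 to 40.
linear = s.interpolate()

print(constant.tolist())  # [10.0, 0.0, 0.0, 40.0]
print(linear.tolist())    # [10.0, 20.0, 30.0, 40.0]
```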
Example Codes: DataFrame.interpolate() Method With the method Parameter
We can also interpolate NaN values in a DataFrame with a different interpolation technique by setting the method parameter of the DataFrame.interpolate() function.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)
filled_df = df.interpolate(method='polynomial', order=2)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
          X      Y
0  1.000000  4.000
1  2.000000  7.125
2  3.000000  8.000
3  3.368421  6.625
4  3.000000  3.000
This call interpolates all the NaN values in the DataFrame using polynomial interpolation of the 2nd order.
Here, order=2 is a keyword argument passed on to the polynomial interpolation routine; the polynomial method requires an order and relies on SciPy being installed.
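The method parameter also matters when the index is unevenly spaced. The sketch below, on a made-up Series with index labels 0, 1, and 10 (an assumption for illustration), compares method="linear", which treats rows as equally spaced, with method="index", which uses the numeric index values as the x-coordinates.

```python
import pandas as pd

# Illustrative data: the gap sits at index 1, much closer to 0 than to 10.
s = pd.Series([1.0, None, 3.0], index=[0, 1, 10])

# linear ignores the index and places the gap halfway between its neighbors.
by_position = s.interpolate(method="linear")

# index weights the estimate by the distance between index labels.
by_index = s.interpolate(method="index")

print(by_position.loc[1])  # halfway between 1.0 and 3.0
print(by_index.loc[1])     # one tenth of the way from 1.0 to 3.0
```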
Example Codes: Pandas DataFrame.interpolate() Method With the axis Parameter to Interpolate Along the Row Axis
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)
filled_df = df.interpolate(axis=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X    Y
0  1.0  4.0
1  2.0  2.0
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Here, we set axis=1 to interpolate the NaN values along the row axis. In the 2nd row, the NaN in column Y has no valid value to its right, so it is filled with the last valid value in that row, 2.0.
However, in the 4th row, the NaN values remain even after interpolation, as both the values in the 4th row are NaN.
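With only two columns, a row never has a gap between two valid values. The following sketch uses a made-up three-column DataFrame (the column names A, B, C and the values are assumptions) to show a gap that can be interpolated across a row but not down a column.

```python
import pandas as pd

# Illustrative data: column B is entirely missing.
nan = float("nan")
df = pd.DataFrame({"A": [1.0, 10.0], "B": [nan, nan], "C": [3.0, 30.0]})

# axis=1: each row is interpolated from left to right, so B lies between A and C.
across_rows = df.interpolate(axis=1)

# axis=0: each column is interpolated from top to bottom; B has no valid values to use.
down_columns = df.interpolate(axis=0)

print(across_rows["B"].tolist())               # [2.0, 20.0]
print(int(down_columns["B"].isna().sum()))     # 2
```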
Example Codes: DataFrame.interpolate() Method With limit Parameter
The limit parameter in the DataFrame.interpolate() method restricts the maximum number of consecutive NaN values to be filled by the method.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, None, None, 3]})
print("DataFrame:")
print(df)
filled_df = df.interpolate(limit=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  NaN
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X     Y
0  1.0  4.00
1  2.0  3.75
2  3.0   NaN
3  3.0   NaN
4  3.0  3.00
Here, limit=1 means that only the first NaN in each run of consecutive NaN values, counting from the top, is filled; the remaining NaN values in that run remain unchanged.
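As a small variation on the example above, the sketch below reuses the same Y values as a standalone Series and raises the limit to 2, so two of the three consecutive NaN values are filled from the top.

```python
import pandas as pd

# Same Y values as in the example above, taken as a Series.
y = pd.Series([4, None, None, None, 3], dtype="float64")

# limit=2: fill at most two consecutive NaNs, starting from the top of each run.
filled = y.interpolate(limit=2)

print(filled.tolist())  # [4.0, 3.75, 3.5, nan, 3.0]
```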
Example Codes: DataFrame.interpolate() Method With limit_direction Parameter
The limit_direction parameter in the DataFrame.interpolate() method controls the direction along the chosen axis in which consecutive NaN values are filled.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, None, None, 3]})
print("DataFrame:")
print(df)
filled_df = df.interpolate(limit_direction='backward', limit=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  NaN
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X     Y
0  1.0  4.00
1  2.0   NaN
2  3.0   NaN
3  3.0  3.25
4  3.0  3.00
Here, with limit_direction='backward' and limit=1, only the first NaN in each run of consecutive NaN values, counting from the bottom, is filled; the remaining NaN values in that run remain unchanged.
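The two remaining options from the parameter list, limit_direction='both' and limit_area, can be sketched in the same way. The first Series reuses the Y values from the example above; the second Series, with NaN at both ends, is a made-up assumption for illustration.

```python
import pandas as pd

# Same Y values as above: a run of three NaNs between 4 and 3.
y = pd.Series([4, None, None, None, 3], dtype="float64")

# both + limit=1: one NaN is filled from each end of the run, the middle one is left.
both = y.interpolate(limit=1, limit_direction="both")
print(both.tolist())  # [4.0, 3.75, nan, 3.25, 3.0]

# Illustrative data with NaNs before the first and after the last valid value.
edges = pd.Series([None, 1.0, None, 3.0, None])

# limit_area="inside": only the NaN surrounded by valid values is filled.
inside = edges.interpolate(limit_area="inside")
print(inside.tolist())  # [nan, 1.0, 2.0, 3.0, nan]
```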
Interpolate Time-Series Data With DataFrame.interpolate() Method
import pandas as pd
dates = ['April-10', 'April-11', 'April-12', 'April-13']
fruits = ['Apple', 'Papaya', 'Banana', 'Mango']
prices = [3, None, 2, 4]
df = pd.DataFrame({'Date': dates,
                   'Fruit': fruits,
                   'Price': prices})
print(df)
df.interpolate(inplace=True)
print("Interpolated DataFrame:")
print(df)
Output:
       Date   Fruit  Price
0  April-10   Apple    3.0
1  April-11  Papaya    NaN
2  April-12  Banana    2.0
3  April-13   Mango    4.0
Interpolated DataFrame:
       Date   Fruit  Price
0  April-10   Apple    3.0
1  April-11  Papaya    2.5
2  April-12  Banana    2.0
3  April-13   Mango    4.0
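Due to inplace=True, the original DataFrame is modified after calling the interpolate() function.
Because Date is a plain string column here, the Price values are interpolated by row position rather than by actual time. Recent pandas versions may also warn about or reject interpolating object-dtype columns such as Date and Fruit; in that case, interpolate only the numeric column.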
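When the dates are real timestamps in the index, method="time" can weight the estimate by the elapsed time between observations. The sketch below assumes a DatetimeIndex with made-up dates (the year 2020 and the three-day gap are illustrative assumptions, not data from this article).

```python
import pandas as pd

# Illustrative dates: one day after the first observation, three days before the last.
idx = pd.to_datetime(["2020-04-10", "2020-04-11", "2020-04-14"])
prices = pd.Series([3.0, None, 7.0], index=idx)

# Default linear method: rows are treated as equally spaced.
by_position = prices.interpolate()

# method="time": the gap is weighted by the actual time distance to its neighbors.
by_time = prices.interpolate(method="time")

print(by_position.iloc[1])  # midpoint of 3.0 and 7.0
print(by_time.iloc[1])      # one quarter of the way from 3.0 to 7.0
```

Which of the two is appropriate depends on whether the observations are meant to be evenly spaced or not.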
Suraj Joshi is a backend software engineer at Matrice.ai.