Pandas DataFrame DataFrame.fillna() Function
The pandas.DataFrame.fillna() function replaces NaN values in a DataFrame with a specified value.
Syntax of pandas.DataFrame.fillna():
DataFrame.fillna(
value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)
Parameters

| Parameter | Description |
|---|---|
| value | scalar, dict, Series, or DataFrame. Value used to replace NaN values. |
| method | backfill, bfill, pad, ffill, or None. Method used for filling NaN values. |
| axis | 0 or 'index', 1 or 'columns'. Axis along which missing values are filled. |
| inplace | Boolean. If True, modify the caller DataFrame in place. |
| limit | Integer. If method is specified, the maximum number of consecutive NaN values to forward/backward fill. If method is not given, the maximum number of NaN values along the axis to be filled. |
| downcast | Dictionary. Specifies the downcast of data types. |
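Since value may be a dict, each column can receive its own replacement. The following is a minimal sketch of that case; the fill values 0 and -1 and the name fill_values are chosen only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# Map each column name to the value that should replace its NaN entries
# (illustrative values, not from the original examples).
fill_values = {"X": 0, "Y": -1}
filled_df = df.fillna(value=fill_values)

print(filled_df)
```

Columns that do not appear in the dict are left untouched.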
Return
If inplace is False (the default), it returns a new DataFrame in which all NaN values are replaced by the given value. If inplace is True, the caller DataFrame is modified and None is returned.
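The difference between the two modes can be checked by counting the remaining NaN values. This is a small sketch on a hypothetical one-column DataFrame, not part of the original examples.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, np.nan, 3]})

# Default (inplace=False): a new DataFrame is returned and df is unchanged.
filled = df.fillna(0)
original_nan_count = int(df["X"].isna().sum())
filled_nan_count = int(filled["X"].isna().sum())

# inplace=True: df itself is modified, so the return value is not used here.
df.fillna(0, inplace=True)
inplace_nan_count = int(df["X"].isna().sum())

print(original_nan_count, filled_nan_count, inplace_nan_count)
```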
Example Codes: Fill All NaN Values in DataFrame With DataFrame.fillna() Method
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(5)
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 5.0
2 3.0 8.0
3 5.0 5.0
4 3.0 3.0
It fills all NaN values in the DataFrame with 5, the value passed as an argument to the pandas.DataFrame.fillna() method.
DataFrame.fillna() With Mean
It is often useful to replace the NaN values in a column with the mean of that column.
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
df.fillna(df.mean(), inplace=True)
print("Filled DataFrame:")
print(df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.00 4.0
1 2.00 5.0
2 3.00 8.0
3 2.25 5.0
4 3.00 3.0
It fills the NaN values in column X with the mean of column X, and the NaN values in column Y with the mean of column Y.
Because inplace=True is set, the original DataFrame is modified by the fillna() call.
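If only one column should be filled with its mean, the same idea can be applied to a single Series. The sketch below assumes that only column X needs filling; the name x_mean is introduced for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# mean() skips NaN by default, so this is the mean of 1, 2, 3 and 3.
x_mean = df["X"].mean()

# Fill only column X and assign the result back; column Y keeps its NaN values.
df["X"] = df["X"].fillna(x_mean)

print(df)
```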
DataFrame.fillna() With 0
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
df.fillna(0, inplace=True)
print("Filled DataFrame:")
print(df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 0.0
2 3.0 8.0
3 0.0 0.0
4 3.0 3.0
It fills all NaN values with 0.
Example Codes: DataFrame.fillna() Method With the method Parameter
We can also fill NaN values in a DataFrame using different choices of the method parameter.
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(method="backfill")
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 8.0
2 3.0 8.0
3 3.0 3.0
4 3.0 3.0
Setting method="backfill" fills each NaN value in the DataFrame with the next valid value that follows it in the same column.
We can also use the bfill, pad, and ffill methods to fill NaN values in the DataFrame.
| method | Description |
|---|---|
| backfill / bfill | Fill each NaN value with the next valid value after it in the same column. |
| ffill / pad | Fill each NaN value with the last valid value before it in the same column. |
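The same forward and backward fills are also available as the dedicated DataFrame.ffill() and DataFrame.bfill() methods. Recent pandas releases (2.1 and later) deprecate the method keyword of fillna() in favor of these, so the sketch below may be the safer form on newer versions; the names forward and backward are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# Forward fill: each NaN takes the last valid value above it in its column.
forward = df.ffill()

# Backward fill: each NaN takes the next valid value below it in its column.
backward = df.bfill()

print(forward)
print(backward)
```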
Example Codes: DataFrame.fillna() Method With limit Parameter
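The limit parameter of the DataFrame.fillna() method caps how many NaN values are filled. When method is specified, it limits the number of consecutive NaN values that are filled; when only a value is given, as in the example below, it limits the number of NaN values filled in each column.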
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, np.nan, 3, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(3, limit=1)
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 NaN 8.0
3 3.0 NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 3.0
2 3.0 8.0
3 3.0 NaN
4 3.0 3.0
Here, with limit=1, only the first NaN in each column is filled. Column X has a single NaN, so it is filled; in column Y, the NaN at index 1 is filled and the NaN at index 3 remains as it is.
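The effect of limit on consecutive NaN values is easier to see with a forward fill. The following sketch uses a hypothetical column with a run of three NaN values and DataFrame.ffill(), which also accepts a limit argument.

```python
import numpy as np
import pandas as pd

# One column with three consecutive NaN values between 1 and 5.
df = pd.DataFrame({"X": [1, np.nan, np.nan, np.nan, 5]})

# limit=1 forward-fills at most one NaN in each consecutive run.
limited = df.ffill(limit=1)

print(limited)
```

Only the first NaN after the value 1 is filled; the remaining two stay as NaN.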
Suraj Joshi is a backend software engineer at Matrice.ai.