How to Count the NaN Occurrences in a Column in Pandas DataFrame
- isna() Method to Count NaN in One or Multiple Columns
- Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences
- df.isnull().sum() Method to Count NaN Occurrences
- Count NaN Occurrences in the Whole Pandas DataFrame
We will introduce methods to count the NaN occurrences in a column of a Pandas DataFrame. The options include the isna() method for one or multiple columns, subtracting the count of non-NaN values from the total length, and the df.isnull().sum() method.
We will also introduce the method to calculate the total number of NaN occurrences in the whole Pandas DataFrame.
isna() Method to Count NaN in One or Multiple Columns
We can use the isna() method (pandas versions >= 0.21.0) and then call sum() to count the NaN occurrences. isna() returns a Boolean mask, and sum() treats each True as 1. For one column, we do as follows:
import numpy as np
import pandas as pd
s = pd.Series([1, 2, 3, np.nan, np.nan])
print(s.isna().sum())
# or s.isnull().sum() for older pandas versions
Output:
2
For several columns, it also works:
import numpy as np
import pandas as pd
df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})
print(df.isna().sum())
Output:
a 1
b 2
dtype: int64
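The same pattern applies to one named column of a DataFrame, or to a chosen subset of columns. The following is a minimal sketch that rebuilds the DataFrame from the example above; the variable names nan_in_a and nan_in_subset are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})

# Selecting one column gives a Series, so the result is a single number
nan_in_a = df["a"].isna().sum()

# Selecting a list of columns gives a DataFrame, so the result is one count per column
nan_in_subset = df[["a", "b"]].isna().sum()

print(nan_in_a)
print(nan_in_subset)
```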
Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences
We can get the number of NaN occurrences in each column by subtracting the count of non-NaN values from the length of the DataFrame. The count() method returns the number of non-NaN values in each column:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print(df)
print(len(df) - df.count())
Output:
     a    b    d
A  1.0  2.0  NaN
B  NaN  4.0  NaN
C  5.0  NaN  7.0
D  5.0  NaN  NaN
a 1
b 2
d 3
dtype: int64
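As a quick sanity check, the subtraction approach can be compared with isna().sum() on the same data, and it can also be applied to a single column. This is only a sketch; the names by_subtraction, by_isna, same_result, and nan_in_b are made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# count() skips missing values, so length minus count is the NaN count
by_subtraction = len(df) - df.count()
by_isna = df.isna().sum()

# True when both approaches agree for every column
same_result = bool((by_subtraction == by_isna).all())
print(same_result)

# The same subtraction for a single column
nan_in_b = len(df["b"]) - df["b"].count()
print(nan_in_b)
```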
df.isnull().sum() Method to Count NaN Occurrences
We can get the number of NaN occurrences in each column by using the df.isnull().sum() method. Passing axis=0 to sum(), which is also the default, gives the number of NaN occurrences in every column. To get the NaN occurrences in every row, set axis=1.
Example code:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print("NaN occurrences in Columns:")
print(df.isnull().sum(axis=0))
print("NaN occurrences in Rows:")
print(df.isnull().sum(axis=1))
Output:
NaN occurrences in Columns:
a 1
b 2
d 3
dtype: int64
NaN occurrences in Rows:
A 1
B 2
C 1
D 2
dtype: int64
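The axis argument also accepts the string aliases "index" (same as 0) and "columns" (same as 1), which some readers find easier to follow. A small sketch on the same data, with per_column and per_row as illustrative names:

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# No axis given: sum() defaults to axis=0, one count per column
per_column = df.isnull().sum()

# axis="columns" is an alias for axis=1, one count per row
per_row = df.isnull().sum(axis="columns")

print(per_column.to_dict())
print(per_row.to_dict())
```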
Count NaN Occurrences in the Whole Pandas DataFrame
To get the total number of NaN occurrences in the DataFrame, we chain two .sum() calls together. The first returns the per-column counts, and the second adds those counts up:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print("NaN occurrences in DataFrame:")
print(df.isnull().sum().sum())
Output:
NaN occurrences in DataFrame:
6
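The grand total can be reached in a couple of other ways that should give the same number. The sketch below is not from the original article: it compares the chained sum with summing the underlying NumPy array and with subtracting the non-NaN count from df.size. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# Chained sums: per-column counts first, then their total
total_chained = df.isnull().sum().sum()

# Sum the Boolean mask as one NumPy array
total_numpy = df.isnull().to_numpy().sum()

# Total number of cells minus the number of non-NaN cells
total_by_count = df.size - df.count().sum()

print(total_chained, total_numpy, total_by_count)
```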