How to Count the NaN Occurrences in a Column in Pandas Dataframe

Asad Riaz Feb 02, 2024
  1. isna() Method to Count NaN in One or Multiple Columns
  2. Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences
  3. df.isnull().sum() Method to Count NaN Occurrences
  4. Count NaN Occurrences in the Whole Pandas DataFrame

This article introduces several ways to count the NaN occurrences in a column of a Pandas DataFrame: the isna() method for one or multiple columns, subtracting the count of non-NaN values from the total length, the value_counts method, and the df.isnull().sum() method.

We will also show how to count the total number of NaN occurrences in the whole Pandas DataFrame.

isna() Method to Count NaN in One or Multiple Columns

We can use the isna() method (available in pandas 0.21.0 and later) and then call sum() to count the NaN occurrences. isna() returns a Boolean mask, and summing it counts the True values. For one column, we do as follows:

import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])
print(s.isna().sum())
# or s.isnull().sum() for older pandas versions

Output:

2
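The value_counts method mentioned in the introduction can also report NaN occurrences for a single Series. The following is a minimal sketch, assuming the same Series as above; the variable names counts and nan_count are only illustrative. By default value_counts drops NaN, so dropna=False is needed to keep it in the frequency table.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])

# dropna=False keeps NaN as its own entry in the frequency table.
counts = s.value_counts(dropna=False)

# Pick the entry (if any) whose index label is NaN.
# If the Series has no NaN, the selection is empty and the sum is 0.
nan_count = int(counts[counts.index.isna()].sum())
print(nan_count)
```

This is more roundabout than isna().sum(), but it can be handy when the full frequency table is needed anyway.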

The isna() approach also works for several columns at once; sum() then returns one count per column:

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})
print(df.isna().sum())

Output:

a    1
b    2
dtype: int64
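When only some columns of a larger DataFrame are of interest, the same call can be applied after selecting them. The sketch below assumes an extra column "c" that is not part of the original example; it is added only to show that unselected columns are left out of the result.

```python
import numpy as np
import pandas as pd

# Column "c" is a hypothetical addition for illustration.
df = pd.DataFrame(
    {"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan], "c": [1, 2, 3]}
)

# One column: select a Series, then count its NaN values.
single = df["a"].isna().sum()

# Several columns: select with a list of names, then count per column.
subset = df[["a", "b"]].isna().sum()

print(single)
print(subset)
```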

Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences

df.count() returns the number of non-NaN values in each column, so we can get the number of NaN occurrences in each column by subtracting that count from the length of the DataFrame:

import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print(df)
print(len(df) - df.count())

Output:

     a    b    d
A  1.0  2.0  NaN
B  NaN  4.0  NaN
C  5.0  NaN  7.0
D  5.0  NaN  NaN
a    1
b    2
d    3
dtype: int64
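As a quick sanity check, the subtraction approach should agree with the isna() approach from the previous section. The sketch below, which reuses the same DataFrame, compares the two results; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# count() gives the number of non-NaN values per column.
non_nan = df.count()

# Total rows minus non-NaN values leaves the NaN count per column.
by_subtraction = len(df) - non_nan

# The Boolean-mask approach for comparison.
by_isna = df.isna().sum()

# True when both approaches report the same count for every column.
print((by_subtraction == by_isna).all())
```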

df.isnull().sum() Method to Count NaN Occurrences

We can get the number of NaN occurrences in each column by using the df.isnull().sum() method. isnull() is an alias of isna(), so both give the same result. Passing axis=0 to sum() (the default) gives the number of NaN occurrences in every column. To get the NaN occurrences in every row instead, set axis=1.

Example code:

import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

print("NaN occurrences in Columns:")
print(df.isnull().sum(axis=0))
print("NaN occurrences in Rows:")
print(df.isnull().sum(axis=1))

Output:

NaN occurrences in Columns:
a    1
b    2
d    3
dtype: int64
NaN occurrences in Rows:
A    1
B    2
C    1
D    2
dtype: int64
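The numeric axis values can be hard to remember. As a small sketch on the same DataFrame, pandas also accepts the string names "index" (same as 0) and "columns" (same as 1) for the axis argument, which some readers may find easier to follow:

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# axis="index" collapses the rows, leaving one count per column (same as axis=0).
per_column = df.isnull().sum(axis="index")

# axis="columns" collapses the columns, leaving one count per row (same as axis=1).
per_row = df.isnull().sum(axis="columns")

print(per_column)
print(per_row)
```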

Count NaN Occurrences in the Whole Pandas DataFrame

To get the total number of NaN occurrences in the DataFrame, we chain two .sum() calls together. The first returns the NaN count per column, and the second adds those counts into a single number:

import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

print("NaN occurrences in DataFrame:")
print(df.isnull().sum().sum())

Output:

NaN occurrences in DataFrame:
6
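Two other ways to reach the same total are sketched below on the same DataFrame. The first converts the Boolean mask to a NumPy array and sums it in one step; the second extends the subtraction idea to the whole DataFrame by using df.size, the total number of cells. Both are offered as alternatives rather than as the preferred method.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# Sum the Boolean mask as a single NumPy array.
total_from_mask = int(df.isna().to_numpy().sum())

# Total cells minus the total number of non-NaN cells.
total_from_count = int(df.size - df.count().sum())

print(total_from_mask)
print(total_from_count)
```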
