How to Count Unique Values Per Group(s) in Pandas

Ahmed Waheed Feb 02, 2024 Pandas Pandas DataFrame

df.groupby().nunique() Method
df.groupby().agg() Method
df.groupby().unique() Method

How to Count Unique Values Per Group(s) in Pandas

When we are working with large data sets, sometimes we have to apply some function to a specific group of data. For example, we have a data set of countries and the private code they use for private matters. We want to count the number of codes a country uses. Listed below are the different methods from groupby() to count unique values.

We will use the same DataFrame in the next sections as follows,

import pandas as pd

data = [
    [999, "Switzerland"],
    [113, "Switzerland"],
    [112, "Japan"],
    [112, "Switzerland"],
    [113, "Canada"],
    [114, "Japan"],
    [100, "Germany"],
    [114, "Japan"],
    [115, "Germany"],
]
df = pd.DataFrame(data, columns=["code", "Countries"])
print(df)

The following will be the output.

   code    Countries
0   999  Switzerland
1   113  Switzerland
2   112        Japan
3   112  Switzerland
4   113       Canada
5   114        Japan
6   100      Germany
7   114        Japan
8   115      Germany

`df.groupby().nunique()` Method

Let’s see how df.groupby().nunique() function will groupby our countries.

import pandas as pd

data = [
    [999, "Switzerland"],
    [113, "Switzerland"],
    [112, "Japan"],
    [112, "Switzerland"],
    [113, "Canada"],
    [114, "Japan"],
    [100, "Germany"],
    [114, "Japan"],
    [115, "Germany"],
]

df = pd.DataFrame(data, columns=["code", "Countries"])
result = df.groupby("Countries")["code"].nunique()
print(result)

The following will be output.

Countries
Canada         1
Germany        2
Japan          2
Switzerland    3
Name: code, dtype: int64

This shows that Canada is using one code, Germany is using two codes, and so on.

`df.groupby().agg()` Method

This method works same as df.groupby().nunique(). We need pass nunique() function to agg() function.

import pandas as pd

data = [
    [999, "Switzerland"],
    [113, "Switzerland"],
    [112, "Japan"],
    [112, "Switzerland"],
    [113, "Canada"],
    [114, "Japan"],
    [100, "Germany"],
    [114, "Japan"],
    [115, "Germany"],
]
df = pd.DataFrame(data, columns=["code", "Countries"])
result = df.groupby(by="Countries", as_index=False).agg({"code": pd.Series.nunique})
print(result)

Output:

     Countries  code
0       Canada     1
1      Germany     2
2        Japan     2
3  Switzerland     3

.agg({'code': pd.Series.nunique})

It gggregates using function pd.Series.nunique over the column code.

`df.groupby().unique()` Method

This method is useful when you want to see which country is using which codes.

import pandas as pd

data = [
    [999, "Switzerland"],
    [113, "Switzerland"],
    [112, "Japan"],
    [112, "Switzerland"],
    [113, "Canada"],
    [114, "Japan"],
    [100, "Germany"],
    [114, "Japan"],
    [115, "Germany"],
]
result = df.groupby("Countries")["code"].unique()
print(result)

The following will be output.

Countries
Canada                   [113]
Germany             [100, 115]
Japan               [112, 114]
Switzerland    [999, 113, 112]
Name: code, dtype: object

Enjoying our tutorials? Subscribe to DelftStack on YouTube to support us in creating more high-quality video guides. Subscribe

df.groupby().nunique() Method

df.groupby().agg() Method

df.groupby().unique() Method

Related Article - Pandas DataFrame

`df.groupby().nunique()` Method

`df.groupby().agg()` Method

`df.groupby().unique()` Method