Pandas Count Unique Values
- Count Unique Values in a DataFrame Using Series.value_counts()
- Count Unique Values in a DataFrame Using DataFrame.nunique()
This tutorial explains how to count the unique values in a DataFrame using the Series.value_counts() and DataFrame.nunique() methods.
import pandas as pd

patients_df = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Date": [
            "2020-12-01",
            "2020-12-01",
            "2020-12-02",
            "2020-12-02",
            "2020-12-02",
            "2020-12-03",
        ],
        "Age": [17, 18, 17, 16, 18, 16],
    }
)

print(patients_df)
Output:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16
We will use the patients_df DataFrame, which contains each patient's name, appointment date, and age, to show how to count the unique values in a DataFrame.
Count Unique Values in a DataFrame Using Series.value_counts()
import pandas as pd

patients_df = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Date": [
            "2020-12-01",
            "2020-12-01",
            "2020-12-02",
            "2020-12-02",
            "2020-12-02",
            "2020-12-03",
        ],
        "Age": [17, 18, 17, 16, 18, 16],
    }
)

print("The DataFrame is:")
print(patients_df, "\n")

# Count how many rows carry each distinct value of the Date column.
print("No of appointments for each date:")
print(patients_df["Date"].value_counts())
Output:
The DataFrame is:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16

No of appointments for each date:
2020-12-02    3
2020-12-01    2
2020-12-03    1
Name: Date, dtype: int64
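This shows how many times each unique value of the Date column occurs in the DataFrame, sorted from the most frequent value to the least frequent. In pandas 2.0 and later, the same call labels the result Name: count and prints the column name Date above the index; the counts themselves are unchanged.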
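Two common variations on the call above are sketched here: sorting the counts by date instead of by frequency, and asking for proportions instead of raw counts. The variable names counts_by_date and shares are illustrative and do not come from the original tutorial.

```python
import pandas as pd

patients_df = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Date": [
            "2020-12-01",
            "2020-12-01",
            "2020-12-02",
            "2020-12-02",
            "2020-12-02",
            "2020-12-03",
        ],
        "Age": [17, 18, 17, 16, 18, 16],
    }
)

# value_counts() sorts by frequency; sort_index() reorders by the date label.
counts_by_date = patients_df["Date"].value_counts().sort_index()
print(counts_by_date)

# normalize=True returns each date's share of all rows instead of a count.
shares = patients_df["Date"].value_counts(normalize=True)
print(shares)
```

Sorting by index is usually the more readable choice when the values have a natural order, as dates do.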
Count Unique Values in a DataFrame Using DataFrame.nunique()
import pandas as pd

patients_df = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Date": [
            "2020-12-01",
            "2020-12-01",
            "2020-12-02",
            "2020-12-02",
            "2020-12-02",
            "2020-12-03",
        ],
        "Age": [17, 18, 17, 16, 18, 16],
    }
)

print(patients_df, "\n")

# Group the rows by Date, then count the distinct names in each group.
print(patients_df.groupby("Date")["Name"].nunique())
Output:
       Name        Date  Age
0  Jennifer  2020-12-01   17
1    Travis  2020-12-01   18
2       Bob  2020-12-02   17
3      Emma  2020-12-02   16
4      Luna  2020-12-02   18
5     Anish  2020-12-03   16

Date
2020-12-01    2
2020-12-02    3
2020-12-03    1
Name: Name, dtype: int64
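This splits the DataFrame on the value of the Date column, so rows with the same Date are placed in the same group, and then counts the distinct Name values within each group. Because no name repeats within a date in this data, the result equals the number of appointments on each date. Note that the example calls nunique() on a grouped Series rather than on the DataFrame itself; calling nunique() directly on the DataFrame returns the number of distinct values in each column.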
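As a minimal sketch, the example below calls nunique() directly on the DataFrame, then shows where a per-group nunique() and a plain row count diverge. The extra row, in which Bob books a second appointment on 2020-12-02, is hypothetical and added only for illustration, as are the variable names.

```python
import pandas as pd

patients_df = pd.DataFrame(
    {
        "Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
        "Date": [
            "2020-12-01",
            "2020-12-01",
            "2020-12-02",
            "2020-12-02",
            "2020-12-02",
            "2020-12-03",
        ],
        "Age": [17, 18, 17, 16, 18, 16],
    }
)

# DataFrame.nunique() counts the distinct values in every column.
unique_per_column = patients_df.nunique()
print(unique_per_column)

# Hypothetical extra row: Bob books a second appointment on the same date.
extra_row = pd.DataFrame({"Name": ["Bob"], "Date": ["2020-12-02"], "Age": [17]})
with_repeat = pd.concat([patients_df, extra_row], ignore_index=True)

# size() counts rows per date; nunique() counts distinct names per date.
rows_per_date = with_repeat.groupby("Date").size()
names_per_date = with_repeat.groupby("Date")["Name"].nunique()
print(rows_per_date)
print(names_per_date)
```

With the repeated row, 2020-12-02 has four appointments but only three distinct patients, so the choice between size() and nunique() depends on which of the two is being asked for.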
Suraj Joshi is a backend software engineer at Matrice.ai.