How to Find Quantiles in Pandas
In this tutorial, we will learn how to obtain quantiles of a dataframe in Pandas using the DataFrame.quantile() function.
The DataFrame.quantile() function in Pandas returns the values at the given quantile over the requested axis, much like numpy.percentile does for NumPy arrays.
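To make the relationship between the two functions concrete, here is a minimal sketch that compares them on a small series. The values are illustrative only. Note that quantile() takes a fraction between 0 and 1, while numpy.percentile takes a value between 0 and 100.

```python
import numpy as np
import pandas as pd

# Illustrative values, chosen only for this comparison
s = pd.Series([1, 5, 3, 4, 2])

# Pandas expects the quantile as a fraction in [0, 1]
q_pandas = s.quantile(0.2)

# NumPy expects the percentile on a 0-100 scale
q_numpy = np.percentile(s.to_numpy(), 20)

print(q_pandas, q_numpy)
```

Both calls use linear interpolation by default, so they should agree for the same input.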
We need to import the Pandas library to get started.
import pandas as pd
Find Quantiles in Pandas
Let us now create a sample dataframe with four numeric columns over which we want to perform the quantile operation, using the code below.
df = pd.DataFrame(
{
"A": [1, 5, 3, 4, 2],
"B": [3, 2, 4, 3, 4],
"C": [2, 2, 7, 3, 4],
"D": [4, 3, 6, 12, 7],
}
)
print(df)
Let us look at our sample dataframe.
A B C D
0 1 3 2 4
1 5 2 2 3
2 3 4 7 6
3 4 3 3 12
4 2 4 4 7
Now we will find the quantiles for our dataframe. First, we will use the DataFrame.quantile() function to find the 0.2 quantile for all columns in the dataframe.
Use the quantile() Function
We do this using the code below, passing 0.2 as the first parameter and setting the axis parameter to 0 so that one quantile is calculated for each column.
df1 = df.quantile(0.2, axis=0)
print(df1)
Now let us look at the 0.2 quantile obtained for each column of our dataframe.
A 1.8
B 2.8
C 2.0
D 3.8
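A value such as 1.8 does not appear anywhere in column A, because quantile() interpolates linearly between neighboring sorted values by default. The sketch below reproduces the result for column A by hand; the variable names are chosen only for illustration.

```python
import math
import pandas as pd

values = sorted([1, 5, 3, 4, 2])  # column A from the sample dataframe
q = 0.2

# Fractional position of the quantile within the sorted values
pos = q * (len(values) - 1)
lo, hi = math.floor(pos), math.ceil(pos)

# Linear interpolation between the two neighboring values
manual = values[lo] + (pos - lo) * (values[hi] - values[lo])

# The built-in result for comparison
builtin = pd.Series(values).quantile(q)
print(manual, builtin)
```

Here the position is 0.8, which falls between the sorted values 1 and 2, giving 1 + 0.8 * (2 - 1) = 1.8.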
We can also find the 0.1, 0.25, 0.5, and 0.75 quantiles along the index axis using the quantile() function.
To do this, we pass a list of the quantile values we want to compute.
df1 = df.quantile([0.1, 0.25, 0.5, 0.75], axis=0)
print(df1)
Now let's look at the resulting quantiles, where each row corresponds to one of the requested quantile values.
A B C D
0.10 1.4 2.4 2.0 3.4
0.25 2.0 3.0 2.0 4.0
0.50 3.0 3.0 3.0 6.0
0.75 4.0 4.0 4.0 7.0
We have successfully obtained the quantiles along the index axis for each of the requested quantile values.
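The examples so far use axis=0, which yields one result per column. As a minimal sketch of the other direction, passing axis=1 computes the quantile across the columns, yielding one result per row. The name row_q is used only for illustration.

```python
import pandas as pd

# Same sample dataframe as in the tutorial
df = pd.DataFrame(
    {
        "A": [1, 5, 3, 4, 2],
        "B": [3, 2, 4, 3, 4],
        "C": [2, 2, 7, 3, 4],
        "D": [4, 3, 6, 12, 7],
    }
)

# axis=1 computes the quantile across the columns of each row
row_q = df.quantile(0.2, axis=1)
print(row_q)
```

The result is a series indexed by the original row labels, with one 0.2 quantile per row.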
Therefore, we can find quantiles along either the index axis or the column axis in Pandas by calling the quantile() function with the appropriate axis argument.