Pandas DataFrame DataFrame.median() Function
Syntax of pandas.DataFrame.median()
Example Codes: DataFrame.median() Method to Find Median Along Column Axis
Example Codes: DataFrame.median() Method to Find Median Along Row Axis
Example Codes: DataFrame.median() Method to Find Median Ignoring NaN Values
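The Python Pandas DataFrame.median() function calculates the median of the elements of a DataFrame object along the specified axis.
The median is not the mean; it is the middle value of a sorted list of numbers. When the list has an even number of values, the median is the average of the two middle values.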
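As a small illustration of how the median differs from the mean, here is a minimal sketch using made-up values that do not appear elsewhere in this article. The variable names are chosen for illustration only.

```python
import pandas as pd

# Illustrative values only; the outlier 100 pulls the mean up but not the median.
values = pd.Series([1, 2, 3, 4, 100])

mean_value = values.mean()      # (1 + 2 + 3 + 4 + 100) / 5 = 22.0
median_value = values.median()  # middle of the sorted values = 3.0

# With an even number of values, the median is the average of the two middle values.
even_median = pd.Series([1, 2, 3, 4]).median()  # (2 + 3) / 2 = 2.5

print(mean_value, median_value, even_median)
```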
Syntax of pandas.DataFrame.median():
DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
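Parameters
axis | Axis along which to compute the median. axis=0 returns the median of each column (the default behavior); axis=1 returns the median of each row.
skipna | Boolean. Exclude NaN values (skipna=True, the default behavior) or include NaN values (skipna=False).
level | If the axis is a MultiIndex, compute the median along a particular level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby(level=...) instead.
numeric_only | Boolean. For numeric_only=True, include only float, int, and boolean columns.
**kwargs | Additional keyword arguments to the function.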
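The effect of numeric_only can be sketched with a small frame that mixes numeric and text columns. The column names and values below are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical frame with one non-numeric column ("name").
df = pd.DataFrame({"X": [1, 2, 7],
                   "Y": [0.5, 1.5, 2.5],
                   "name": ["a", "b", "c"]})

# Restrict the calculation to numeric columns; "name" is left out of the result.
medians = df.median(numeric_only=True)
print(medians)
```

Passing numeric_only=True explicitly keeps the call independent of the default, which has differed between pandas versions.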
Return
If level is not specified, it returns a Series of the medians for the requested axis; otherwise, it returns a DataFrame of median values.
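In recent pandas releases the level parameter is no longer available, so a per-level median is usually obtained with groupby. The following is a minimal sketch of both return types, assuming a hypothetical two-level index whose level names ("group", "item") are made up for this example.

```python
import pandas as pd

# Hypothetical two-level index; "group" and "item" are illustrative names.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)], names=["group", "item"])
df = pd.DataFrame({"X": [1, 3, 5, 9]}, index=index)

# Without grouping, median() returns a Series with one value per column.
overall = df.median()

# Grouping by the outer level returns a DataFrame with one row per group.
per_group = df.groupby(level="group").median()

print(type(overall).__name__, type(per_group).__name__)
print(per_group)
```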
Example Codes: DataFrame.median() Method to Find Median Along Column Axis
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
'Y': [4, 3, 8, 2, 9]})
print("DataFrame:")
print(df)
medians=df.median()
print("medians of Each Column:")
print(medians)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 7 8
3 5 2
4 10 9
medians of Each Column:
X 5.0
Y 4.0
dtype: float64
It calculates the median for both columns X and Y and returns a Series object with the median of each column.
To find the median of a particular column of a DataFrame in Pandas, we call the median() function on that column only.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
'Y': [4, 3, 8, 2, 9]})
print("DataFrame:")
print(df)
median=df["X"].median()
print("median of Column X:")
print(median)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 7 8
3 5 2
4 10 9
median of Column X:
5.0
It only gives the median of the values of column X of the DataFrame.
Example Codes: DataFrame.median() Method to Find Median Along Row Axis
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
'Y': [4, 3, 8, 2, 9],
'Z': [2, 7, 6, 10, 5]})
print("DataFrame:")
print(df)
medians=df.median(axis=1)
print("medians of Each Row:")
print(medians)
Output:
DataFrame:
X Y Z
0 1 4 2
1 2 3 7
2 7 8 6
3 5 2 10
4 10 9 5
medians of Each Row:
0 2.0
1 3.0
2 7.0
3 5.0
4 9.0
dtype: float64
It calculates the median for all the rows and returns a Series object with the median of each row.
To find the median of a particular row of a DataFrame in Pandas, we call the median() function on that row only.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
'Y': [4, 3, 8, 2, 9],
'Z': [2, 7, 6, 10, 5]})
print("DataFrame:")
print(df)
median=df.iloc[[0]].median(axis=1)
print("median of 1st Row:")
print(median)
Output:
DataFrame:
X Y Z
0 1 4 2
1 2 3 7
2 7 8 6
3 5 2 10
4 10 9 5
median of 1st Row:
0 2.0
dtype: float64
It only gives the median of the values of the 1st row of the DataFrame.
We use the iloc indexer to select rows by their integer position.
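As a variation on the example above, selecting the row with a single integer instead of a list gives a Series for that row, and median() then returns a plain number rather than a one-element Series. This is a minimal sketch using the same data; the variable name first_row_median is chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]})

# df.iloc[0] is the first row as a Series, so median() returns a single number.
first_row_median = df.iloc[0].median()
print(first_row_median)
```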
Example Codes: DataFrame.median() Method to Find Median Ignoring NaN Values
We use the default value of the skipna parameter, i.e. skipna=True, to find the median of a DataFrame along the specified axis while ignoring NaN values.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, None, 10, 8],
'Y': [None, 3, 8, 2, 9, 6],
'Z': [2, 7, 6, 10, None, 5]})
print("DataFrame:")
print(df)
median=df.median(skipna=True)
print("medians of Each Column:")
print(median)
Output:
DataFrame:
X Y Z
0 1.0 NaN 2.0
1 2.0 3.0 7.0
2 7.0 8.0 6.0
3 NaN 2.0 10.0
4 10.0 9.0 NaN
5 8.0 6.0 5.0
medians of Each Column:
X 7.0
Y 6.0
Z 6.0
dtype: float64
With skipna=True, the NaN values in the DataFrame are ignored, so the median along the column axis is calculated from the remaining values.
If we set skipna=False instead, NaN values are included in the calculation.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 7, None, 10],
'Y': [5, 3, 8, 2, 9],
'Z': [2, 7, 6, 10, 4]})
print("DataFrame:")
print(df)
median=df.median(skipna=False)
print("medians of Each Column:")
print(median)
Output:
DataFrame:
X Y Z
0 1.0 5 2
1 2.0 3 7
2 7.0 8 6
3 NaN 2 10
4 10.0 9 4
medians of Each Column:
X NaN
Y 5.0
Z 6.0
dtype: float64
Here, we get NaN for the median of column X because column X contains a NaN value.
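The same parameter applies when the median is taken along the row axis. The sketch below reuses the data from the last example and compares both settings; the variable names with_nan and without_nan are chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, None, 10],
                   'Y': [5, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 4]})

# Row 3 contains a NaN in column X.
with_nan = df.median(axis=1, skipna=False)     # row 3 becomes NaN
without_nan = df.median(axis=1, skipna=True)   # row 3 uses only 2 and 10

print(with_nan)
print(without_nan)
```

With skipna=True, the median of row 3 is computed from the two remaining values, so it is their average.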
Founder of DelftStack.com. Jinku has worked in the robotics and automotive industries for over 8 years. He sharpened his coding skills when he needed to do the automatic testing, data collection from remote servers and report creation from the endurance test. He is from an electrical/electronics engineering background but has expanded his interest to embedded electronics, embedded programming and front-/back-end programming.