Pandas DataFrame.mean() Function
- Syntax of pandas.DataFrame.mean()
- Example Codes: DataFrame.mean() Method to Find Mean Along Column Axis
- Example Codes: DataFrame.mean() Method to Find Mean Along Row Axis
- Example Codes: DataFrame.mean() Method to Find the Mean Ignoring NaN Values
The Python Pandas DataFrame.mean() function calculates the mean of the values of a DataFrame object over the specified axis.
Syntax of pandas.DataFrame.mean():
DataFrame.mean(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
Parameters
axis | Axis to compute the mean over: axis=0 gives the mean of each column, axis=1 gives the mean of each row
skipna | Boolean. Exclude NaN values (skipna=True) or include NaN values (skipna=False)
level | Compute the mean along a particular level if the axis is a MultiIndex (this parameter was removed in pandas 2.0)
numeric_only | Boolean. For numeric_only=True, include only float, int, and boolean columns
**kwargs | Additional keyword arguments to the function
Return
If the level is not specified, it returns a Series containing the mean of the values for the requested axis; otherwise, it returns a DataFrame of mean values.
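The effect of numeric_only and the return type can be checked with a small sketch. The DataFrame below, including its text column named label, is a hypothetical example and is not part of the original article. Without numeric_only=True, the result for a text column depends on the pandas version, so the sketch passes the argument explicitly.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({
    "X": [1, 2, 2, 3],
    "Y": [4.0, 3.0, 8.0, 4.0],
    "label": ["a", "b", "c", "d"],
})

# numeric_only=True restricts the calculation to numeric columns,
# so the text column "label" does not appear in the result.
means = df.mean(numeric_only=True)

print(type(means).__name__)  # the result is a pandas Series
print(means)
```

The resulting Series is indexed by the names of the columns that took part in the calculation.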
Example Codes: DataFrame.mean() Method to Find Mean Along Column Axis
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 2, 3],
                   'Y': [4, 3, 8, 4]})
print("DataFrame:")
print(df)

# With the default axis, mean() returns one value per column
means = df.mean()
print("Means of Each Column:")
print(means)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 2 8
3 3 4
Means of Each Column:
X 2.00
Y 4.75
dtype: float64
It calculates the mean of both columns X and Y and returns a Series object containing the mean of each column.
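As a quick sanity check, the column means can be reproduced by hand: each mean is the column sum divided by the number of values. This is only an illustrative sketch of the arithmetic, not a description of how pandas computes the result internally.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 2, 3], "Y": [4, 3, 8, 4]})

# Sum of each column divided by the number of non-missing values in it.
manual = df.sum() / df.count()
builtin = df.mean()

print(manual)   # X: 8 / 4 = 2.0, Y: 19 / 4 = 4.75
print(builtin)
```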
To find the mean of a particular column of a DataFrame in Pandas, we select that column and call the mean() function on it.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 2, 3],
                   'Y': [4, 3, 8, 4]})
print("DataFrame:")
print(df)

# Selecting one column gives a Series, so mean() returns a single number
means = df["X"].mean()
print("Mean of Column X:")
print(means)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 2 8
3 3 4
Mean of Column X:
2.0
It gives only the mean of the values in column X of the DataFrame.
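The same idea extends to several columns at once. The sketch below assumes a hypothetical third column Z, added only for illustration, and contrasts selecting one column with selecting a list of columns.

```python
import pandas as pd

# Z is a hypothetical extra column used only for this illustration.
df = pd.DataFrame({"X": [1, 2, 2, 3], "Y": [4, 3, 8, 4], "Z": [10, 20, 30, 40]})

# A single label selects a Series, so mean() returns one scalar.
x_mean = df["X"].mean()

# A list of labels selects a DataFrame, so mean() returns a Series
# with one entry per selected column.
xz_means = df[["X", "Z"]].mean()

print(x_mean)
print(xz_means)
```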
Example Codes: DataFrame.mean() Method to Find Mean Along Row Axis
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 2, 3],
                   'Y': [4, 3, 8, 4]})
print("DataFrame:")
print(df)

# axis=1 averages across the columns, giving one value per row
means = df.mean(axis=1)
print("Mean of Rows:")
print(means)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 2 8
3 3 4
Mean of Rows:
0 2.5
1 2.5
2 5.0
3 3.5
dtype: float64
It calculates the mean of every row and returns a Series object containing the mean of each row.
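Because the numeric axis values are easy to mix up, the axis can also be given by name. The following sketch uses the same data as the example above and shows both directions side by side.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 2, 3], "Y": [4, 3, 8, 4]})

# axis="index" is equivalent to axis=0: one mean per column.
col_means = df.mean(axis="index")

# axis="columns" is equivalent to axis=1: one mean per row.
row_means = df.mean(axis="columns")

print(col_means)
print(row_means)
```

The result for axis="index" is labelled by column names, while the result for axis="columns" is labelled by the row index.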
To find the mean of a particular row of a DataFrame in Pandas, we select that row and call the mean() function on it.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 2, 3],
                   'Y': [4, 3, 8, 4]})
print("DataFrame:")
print(df)

# iloc[[0]] keeps the first row as a one-row DataFrame
mean = df.iloc[[0]].mean(axis=1)
print("Mean of 1st Row:")
print(mean)
Output:
DataFrame:
X Y
0 1 4
1 2 3
2 2 8
3 3 4
Mean of 1st Row:
0 2.5
dtype: float64
It gives only the mean of the values in the first row of the DataFrame.
We use the iloc indexer to select rows by their integer position.
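The form of the result depends on how the row is selected. As a minimal sketch using the same data, passing a list to iloc keeps a one-row DataFrame, while passing a plain integer returns the row as a Series, whose mean is a single number.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 2, 3], "Y": [4, 3, 8, 4]})

# List selector: one-row DataFrame, so mean(axis=1) is a Series of length 1.
as_series = df.iloc[[0]].mean(axis=1)

# Integer selector: the row becomes a Series, so mean() is a scalar.
as_scalar = df.iloc[0].mean()

print(as_series)
print(as_scalar)
```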
Example Codes: DataFrame.mean() Method to Find the Mean Ignoring NaN Values
We use the default value of the skipna parameter, i.e. skipna=True, to find the mean of the DataFrame along the specified axis while ignoring NaN values.
import pandas as pd

# None in a numeric column is stored as NaN
df = pd.DataFrame({'X': [1, 2, None, 3],
                   'Y': [4, 3, None, 4]})
print("DataFrame:")
print(df)

means = df.mean(skipna=True)
print("Mean of Columns")
print(means)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 3.0
2 NaN NaN
3 3.0 4.0
Mean of Columns
X 2.000000
Y 3.666667
dtype: float64
With skipna=True, the NaN values in the DataFrame are ignored, so the mean of each column is calculated from the remaining values only.
If we set skipna=False instead, NaN values are included in the calculation.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, None, 3],
                   'Y': [4, 3, 3, 4]})
print("DataFrame:")
print(df)

# skipna=False keeps NaN values in the calculation
means = df.mean(skipna=False)
print("Mean of Columns")
print(means)
Output:
DataFrame:
X Y
0 1.0 4
1 2.0 3
2 NaN 3
3 3.0 4
Mean of Columns
X NaN
Y 3.5
dtype: float64
Here, we get NaN as the mean of column X because column X contains a NaN value.
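The skipna parameter works the same way along the row axis. The sketch below reuses the data from the last example and compares both settings with axis=1; only the row that contains a NaN value is affected.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, None, 3], "Y": [4, 3, 3, 4]})

# Default skipna=True: the NaN in row 2 is ignored, leaving only Y = 3.
skip = df.mean(axis=1)

# skipna=False: any row containing NaN has NaN as its mean.
keep = df.mean(axis=1, skipna=False)

print(skip)
print(keep)
```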
Suraj Joshi is a backend software engineer at Matrice.ai.