Pandas DataFrame DataFrame.sum() Function
The Python Pandas DataFrame.sum() function calculates the sum of the values of a DataFrame object over the specified axis.
Syntax of pandas.DataFrame.sum():
DataFrame.sum(
axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs
)
Parameters
axis | Axis to sum along. axis=0 sums down the rows and returns one value per column; axis=1 sums across the columns and returns one value per row |
skipna | Boolean. Exclude NaN values (skipna=True) or include NaN values (skipna=False) |
level | Sum along a particular level if the axis is a MultiIndex |
numeric_only | Boolean. For numeric_only=True, include only float, int, and boolean columns |
min_count | Integer. Minimum number of non-NaN values required to calculate the sum. If this condition is not satisfied, the sum will be NaN |
**kwargs | Additional keyword arguments to the function |
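As a minimal sketch of the numeric_only parameter described above, the snippet below sums a small DataFrame that mixes numeric and text columns. The data and the column name label are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one text column.
df = pd.DataFrame({
    "X": [1, 2, 3],
    "Y": [0.5, 1.5, 2.5],
    "label": ["a", "b", "c"],
})

# numeric_only=True restricts the sum to numeric columns,
# so the text column "label" does not appear in the result.
numeric_sums = df.sum(numeric_only=True)
print(numeric_sums)
```

The result should contain entries only for X and Y.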
Return
If the level is not specified, it returns a Series of the sums of the values for the requested axis; otherwise, it returns a DataFrame of sum values.
Example Codes: DataFrame.sum() Method to Calculate Sum Along Column Axis
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# With no arguments, sum() returns one total per column.
sums = df.sum()
print("Column-wise Sum:")
print(sums)
Output:
DataFrame:
X Y Z
0 1 1 3
1 2 2 4
2 3 3 5
3 4 4 6
4 5 5 3
Column-wise Sum:
X 15
Y 15
Z 21
dtype: int64
It calculates the sum for each of the columns X, Y, and Z and returns a Series object with the sum of each column.
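Because the result is a Series indexed by column name, it can be inspected like any other Series. The following is a small sketch using the same X, Y, Z data; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5],
                   "Y": [1, 2, 3, 4, 5],
                   "Z": [3, 4, 5, 6, 3]})

sums = df.sum()

# The column names become the index of the returned Series.
print(type(sums).__name__)
print(list(sums.index))

# Convert to a plain dict, or sum the Series again for a grand total.
totals_by_column = sums.to_dict()
grand_total = sums.sum()
print(totals_by_column)
print(grand_total)
```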
To find the sum of a particular column of a DataFrame in Pandas, call the sum() function on that column only.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# Select column Z first, then sum its values.
sums = df["Z"].sum()
print("Sum of values of Z-column:")
print(sums)
Output:
DataFrame:
X Y Z
0 1 1 3
1 2 2 4
2 3 3 5
3 4 4 6
4 5 5 3
Sum of values of Z-column:
21
It only gives the sum of the values of column Z of the DataFrame.
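The same idea extends to several columns at once. As a sketch, selecting a list of column names before calling sum() limits the result to those columns; the choice of X and Z here is arbitrary.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5],
                   "Y": [1, 2, 3, 4, 5],
                   "Z": [3, 4, 5, 6, 3]})

# A list of names selects a sub-DataFrame, so sum() returns a Series
# with one entry per selected column.
selected_sums = df[["X", "Z"]].sum()
print(selected_sums)
```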
Example Codes: DataFrame.sum() Method to Find Sum Along Row Axis
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# axis=1 adds the values across the columns, giving one total per row.
sums = df.sum(axis=1)
print("Row-wise sum:")
print(sums)
Output:
DataFrame:
X Y Z
0 1 1 3
1 2 2 4
2 3 3 5
3 4 4 6
4 5 5 3
Row-wise sum:
0 5
1 8
2 11
3 14
4 13
dtype: int64
It calculates the sum for each row and returns a Series object with the sum of each row.
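A common use of the row-wise result is to store it back in the DataFrame. The sketch below assumes a new column named Total, which is a hypothetical name and not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5],
                   "Y": [1, 2, 3, 4, 5],
                   "Z": [3, 4, 5, 6, 3]})

# The row-wise Series shares the DataFrame's index, so it can be
# assigned directly as a new column ("Total" is an illustrative name).
df["Total"] = df.sum(axis=1)
print(df)
```

The sum is evaluated before the assignment takes place, so the new column does not contribute to its own total.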
To find the sum of a particular row of a DataFrame in Pandas, call the sum() function on that specific row only.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# The double brackets keep the row at position 2 as a one-row DataFrame.
sum_3 = df.iloc[[2]].sum(axis=1)
print("Sum of values of 3rd Row:")
print(sum_3)
Output:
DataFrame:
X Y Z
0 1 1 3
1 2 2 4
2 3 3 5
3 4 4 6
4 5 5 3
Sum of values of 3rd Row:
2 11
dtype: int64
It only gives the sum of the values of the 3rd row of the DataFrame.
Use the iloc indexer to select rows by their integer position.
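The shape of the selection determines the shape of the result. As a brief sketch, selecting the row with a single integer yields a Series, so sum() returns a scalar, whereas selecting with a list yields a one-row DataFrame, so sum(axis=1) returns a one-element Series.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, 4, 5],
                   "Y": [1, 2, 3, 4, 5],
                   "Z": [3, 4, 5, 6, 3]})

# Single integer: the row comes back as a Series, and sum() is a scalar.
row_total = df.iloc[2].sum()

# List of integers: the row stays a DataFrame, and sum(axis=1) is a Series.
row_series = df.iloc[[2]].sum(axis=1)

print(row_total)
print(row_series)
```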
Example Codes: DataFrame.sum() Method to Find the Sum Ignoring NaN Values
Use the default value of the skipna parameter, i.e. skipna=True, to find the sum of the DataFrame along the specified axis, ignoring NaN values.
import pandas as pd

df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# NaN values are skipped by default.
sums = df.sum()
print("Column-wise Sum:")
print(sums)
Output:
DataFrame:
X Y Z
0 1.0 1.0 3
1 NaN NaN 4
2 3.0 3.0 5
3 4.0 NaN 6
4 5.0 5.0 3
Column-wise Sum:
X 13.0
Y 9.0
Z 21.0
dtype: float64
If you set skipna=False, the sum will be NaN for every column of the DataFrame that contains NaN values.
import pandas as pd

df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# skipna=False lets NaN values propagate into the result.
sums = df.sum(skipna=False)
print("Column-wise Sum:")
print(sums)
Output:
DataFrame:
X Y Z
0 1.0 1.0 3
1 NaN NaN 4
2 3.0 3.0 5
3 4.0 NaN 6
4 5.0 5.0 3
Column-wise Sum:
X NaN
Y NaN
Z 21.0
dtype: float64
Here, you get NaN for the sums of columns X and Y because both of them contain NaN values.
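One way to reason about the two settings is to count the missing values per column and to compare the default sum with a sum taken after replacing NaN with 0. This is only a sketch on the same data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, None, 3, 4, 5],
                   "Y": [1, None, 3, None, 5],
                   "Z": [3, 4, 5, 6, 3]})

# isna() marks missing cells as True; summing the booleans counts them.
missing_per_column = df.isna().sum()

# Skipping NaN gives the same totals here as filling NaN with 0 first.
skipped = df.sum()
filled = df.fillna(0).sum()

print(missing_per_column)
print(skipped)
print(filled)
```

Any column with a non-zero missing count is a column whose sum becomes NaN under skipna=False.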
Example Codes: Set min_count in DataFrame.sum() Method
import pandas as pd

df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# Require at least 4 non-NaN values per column; otherwise the sum is NaN.
sums = df.sum(min_count=4)
print("Column-wise Sum:")
print(sums)
Output:
DataFrame:
X Y Z
0 1.0 1.0 3
1 NaN NaN 4
2 3.0 3.0 5
3 4.0 NaN 6
4 5.0 5.0 3
Column-wise Sum:
X 13.0
Y NaN
Z 21.0
dtype: float64
Here, you get NaN for the sum of column Y because column Y has only 3 non-NaN values, which is fewer than the value of the min_count parameter.
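The min_count parameter also matters for columns that contain no valid values at all. In the sketch below, which uses hypothetical columns A and B, the default min_count=0 gives a sum of 0 for an all-NaN column, while min_count=1 gives NaN instead.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column A has no valid values, column B has one.
df = pd.DataFrame({"A": [np.nan, np.nan],
                   "B": [1.0, np.nan]})

# Default min_count=0: an all-NaN column sums to 0.
default_sums = df.sum()

# min_count=1: at least one non-NaN value is required, so A becomes NaN.
strict_sums = df.sum(min_count=1)

print(default_sums)
print(strict_sums)
```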
Suraj Joshi is a backend software engineer at Matrice.ai.