Pandas DataFrame.aggregate() Function

Minahil Noor Jan 30, 2023
  1. Syntax of pandas.DataFrame.aggregate()
  2. Example Codes: Pandas DataFrame.aggregate()
  3. Example Codes: DataFrame.aggregate() With the Multiple Functions
  4. Example Codes: DataFrame.aggregate() With a Specified Column

pandas.DataFrame.aggregate() applies one or more aggregation functions along an axis of a DataFrame. Commonly used aggregation functions include min, max, and sum. Each of them reduces a column or a row to a single value, so the result is smaller than the original DataFrame.

Syntax of pandas.DataFrame.aggregate()

DataFrame.aggregate(func=None, axis=0, *args, **kwargs)

Parameters

func The aggregation function to apply. It can be a callable, a string naming a function, a list of callables or strings, or a dictionary that maps column labels to any of these.
axis 0 by default. If it is 0 or 'index', the function is applied to each column. If it is 1 or 'columns', the function is applied to each row.
*args Positional arguments that are passed on to func.
**kwargs Keyword arguments that are passed on to func.
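
The effect of axis and of the extra arguments can be sketched as follows. The small DataFrame and the helper scaled_sum (with its factor parameter) are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical numeric-only data for illustration.
scores = pd.DataFrame(
    {"Attendance": [60, 100, 80], "Obtained Marks": [90, 75, 82]}
)

# axis=0 (the default): one result per column.
per_column = scores.agg("max")

# axis=1: one result per row.
per_row = scores.agg("max", axis=1)


def scaled_sum(values, factor=1):
    """Hypothetical helper: sum the values, then multiply by factor."""
    return values.sum() * factor


# Extra keyword arguments given to agg() are passed on to the function.
scaled = scores.agg(scaled_sum, factor=2)

print(per_column)
print(per_row)
print(scaled)
```

Here per_column has one entry for each column, per_row has one entry for each row, and factor=2 reaches scaled_sum through **kwargs.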

Return

This function returns a scalar, Series, or a DataFrame.

  • It returns a scalar if a single function is called with Series.agg().
  • It returns a Series if a single function is called with DataFrame.agg().
  • It returns a DataFrame if multiple functions are called with DataFrame.agg().
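
The three cases above can be checked with a short sketch. The DataFrame below is a hypothetical numeric-only example, not the one used in the rest of this article.

```python
import pandas as pd

# Hypothetical numeric-only data for illustration.
marks = pd.DataFrame(
    {"Attendance": [60, 100, 80], "Obtained Marks": [90, 75, 82]}
)

# Single function on a Series: a single value.
scalar_result = marks["Attendance"].agg("sum")

# Single function on a DataFrame: a Series indexed by column name.
series_result = marks.agg("sum")

# Several functions on a DataFrame: a DataFrame with one row per function.
frame_result = marks.agg(["sum", "min"])

print(scalar_result)
print(type(series_result).__name__)
print(type(frame_result).__name__)
```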

Example Codes: Pandas DataFrame.aggregate()

DataFrame.agg() is an alias for DataFrame.aggregate(). Because the alias is shorter, the example codes below use DataFrame.agg().

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)
print(dataframe)

The example DataFrame is below.

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45

We will first check the DataFrame.agg() function using only a single aggregation function.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)

dataframe1 = dataframe.agg("sum")
print(dataframe1)

Output:

Attendance                            413
Name              OliviaJohnLauraBenKevin
Obtained Marks                        356
dtype: object

The aggregate function sum is applied to the individual columns.

For the integer-type columns, it has computed the total, and for the string-type column, it has concatenated the strings. The result is a Series indexed by column name. Its dtype is object because it holds both integers and a string.
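
If the concatenated names are not wanted, one option is to select the numeric columns before aggregating. The sketch below uses select_dtypes() for that; this is one possible approach, not the only one.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)

# Keep only the numeric columns, then sum each of them.
numeric_totals = dataframe.select_dtypes(include="number").agg("sum")
print(numeric_totals)
```

The Name column is dropped before the aggregation, so the returned Series contains only the two numeric totals.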

Example Codes: DataFrame.aggregate() With the Multiple Functions

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)

dataframe1 = dataframe.agg(["sum", "min"])
print(dataframe1)

Output:

     Attendance                     Name  Obtained Marks
sum         413  OliviaJohnLauraBenKevin             356
min          60                      Ben              45

The aggregation functions sum and min are applied to the individual columns.

For the integer-type columns, the min function has returned the smallest value. For the string-type column, it has returned the string that comes first in alphabetical order (Ben), not the shortest string.
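
To see how min orders strings, the sketch below uses a hypothetical list of names in which the shortest name is not the one that sorts first alphabetically.

```python
import pandas as pd

# Hypothetical names: "Ben" is the shortest, "Adam" sorts first alphabetically.
names = pd.Series(["Olivia", "Adam", "Ben"])

# agg("min") compares the strings character by character.
alphabetical_min = names.agg("min")

# Plain Python min with key=len compares the lengths instead.
shortest = min(names, key=len)

print(alphabetical_min)  # Adam
print(shortest)  # Ben
```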

Example Codes: DataFrame.aggregate() With a Specified Column

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)

dataframe1 = dataframe.agg({"Obtained Marks": "sum"})
print(dataframe1)

Output:

Obtained Marks    356
dtype: int64

The sum of a single column is returned. The result is a Series with one entry, labeled with the column name, and dtype: int64 is the type of its value.

We can also apply multiple functions to a single column.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: 90, 1: 75, 2: 82, 3: 64, 4: 45},
    }
)
dataframe1 = dataframe.agg({"Obtained Marks": ["sum", "max"]})
print(dataframe1)

Output:

     Obtained Marks
sum             356
max              90
