How to Flatten a Hierarchical Index in Columns in Pandas
-
Use
rest_index()
to Flatten a Hierarchical Index in Columns in Pandas -
Use
as_index
to Flatten a Hierarchical Index in Columns in Pandas
This article will discuss how to flatten a hierarchical index in Pandas Dataframe columns.
Groupby aggregation functions are commonly used to create hierarchical indexes. The aggregated function used will be visible in the hierarchical index of the resulting dataframe.
We will use different functions to explain how to flatten a hierarchical index in columns.
Use rest_index()
to Flatten a Hierarchical Index in Columns in Pandas
The reset_index()
function in Pandas flattens the hierarchical index created by the groupby aggregation function.
Syntax:
pandas.DataFrame.reset_index(level, drop, inplace)
Where:
level
: Only the levels indicated are deleted from the index.drop
: The index is reset to the default integer index.inplace
: Without making a copy, modifies the dataframe object permanently.
We use the Pandas groupby()
function to group bus sales data by quarters and the reset_index()
pandas function to flatten the grouped dataframe’s hierarchical indexed columns.
First, import the Python Pandas library and then create a simple dataframe. The dataframe is stored in a data_bus
variable.
import pandas as pd
data_bus = pd.DataFrame(
{
"bus": ["2x", "3Tr", "4x", "5x"],
"bus_sale_q1": [21, 23, 25, 27],
"bus_sale_q2": [12, 14, 16, 18],
},
columns=["bus", "bus_sale_q1", "bus_sale_q2"],
)
print(data_bus)
Output:
bus bus_sale_q1 bus_sale_q2
0 2x 21 12
1 3Tr 23 14
2 4x 25 16
3 5x 27 18
The above output shows the simple dataframe created. After that, use the groupby()
function to group the bus column on the sum of sales q1 and q2.
grouped_data = data_bus.groupby(by="bus").agg("sum")
grouped_data
Output:
bus bus_sale_q1 bus_sale_q2
2x 21 12
3Tr 23 14
4x 25 16
5x 27 18
We will use the reset_index()
function to flatten the hierarchical index column.
flat_data = grouped_data.reset_index()
flat_data
Output:
bus bus_sale_q1 bus_sale_q2
0 2x 21 12
1 3Tr 23 14
2 4x 25 16
3 5x 27 18
Use as_index
to Flatten a Hierarchical Index in Columns in Pandas
The pandas groupby()
function will be used to group bus sales data by quarters, and as_index
will flatten the hierarchical indexed columns of the grouped dataframe.
Syntax:
pandas.DataFrame.groupby(by, level, axis, as_index)
Where:
level
: Columns on which the groupby operation must be performed.by
: Columns on which the groupby operation must be performed.axis
: Whether to split along rows (0) or columns (1).as_index
: For aggregated output, returns an object with the index group labels.
We’ll use the Pandas groupby()
function to group bus sales data by quarters and set the as_index
parameter to False. This ensures that the grouped dataframe’s hierarchical index is flattened.
The same dataframe will be used as in the previous example.
Example:
import pandas as pd
data_bus = pd.DataFrame(
{
"bus": ["2x", "3Tr", "4x", "5x"],
"bus_sale_q1": [21, 23, 25, 27],
"bus_sale_q2": [12, 14, 16, 18],
},
columns=["bus", "bus_sale_q1", "bus_sale_q2"],
)
data_bus
grouped_data = data_bus.groupby(by="bus", as_index=False).agg("sum")
print(grouped_data)
Output:
bus bus_sale_q1 bus_sale_q2
0 2x 21 12
1 3Tr 23 14
2 4x 25 16
3 5x 27 18