How to Reshape a Data Frame Using stack() and unstack() Functions in Pandas
Pandas is a Python library for data analysis. It is a good fit for tabular data with heterogeneous columns, such as data from a SQL table or a spreadsheet.
This article explores the basic concepts of stacking and unstacking in Pandas. Both operations are widely used to alter the shape of a data frame.
Let us see these methods in action. First, we will create a dummy data frame, dates_data, with 100 rows.
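import pandas as pd
# 100 timestamps spaced 30 minutes apart, starting at 2013-01-01 00:00
index = pd.date_range("2013-1-1", periods=100, freq="30min")
dates_data = pd.DataFrame(data=list(range(100)), columns=["value"], index=index)
# Fill value2 with "Alpha", then mark the first 10 rows as "Beta"
dates_data["value2"] = "Alpha"
dates_data.loc[dates_data.index[:10], "value2"] = "Beta"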
The above code block creates a data frame dates_data with a datetime index and two columns named value and value2. To view the entries in the data, we use the following code:
print(dates_data)
Output:
value value2
2013-01-01 00:00:00 0 Beta
2013-01-01 00:30:00 1 Beta
2013-01-01 01:00:00 2 Beta
2013-01-01 01:30:00 3 Beta
2013-01-01 02:00:00 4 Beta
... ... ...
2013-01-02 23:30:00 95 Alpha
2013-01-03 00:00:00 96 Alpha
2013-01-03 00:30:00 97 Alpha
2013-01-03 01:00:00 98 Alpha
2013-01-03 01:30:00 99 Alpha
As we can see, we have 100 entries whose timestamps are spaced 30 minutes apart.
The data frame also has two columns: value holds numbers, and value2 holds either Alpha or Beta.
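A quick way to confirm this structure is to inspect the shape, the spacing of the index, and the label counts. The sketch below rebuilds the frame so it runs on its own; marking exactly the first 10 rows as Beta is an assumption based on the code above, and the names n_rows, n_cols, step, and label_counts are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the article (first 10 rows as "Beta" is an assumption).
index = pd.date_range("2013-1-1", periods=100, freq="30min")
dates_data = pd.DataFrame({"value": list(range(100))}, index=index)
dates_data["value2"] = "Alpha"
dates_data.loc[dates_data.index[:10], "value2"] = "Beta"

# Number of rows and columns in the frame.
n_rows, n_cols = dates_data.shape
# Gap between the first two timestamps in the index.
step = dates_data.index[1] - dates_data.index[0]
# How many rows carry each label in value2.
label_counts = dates_data["value2"].value_counts().to_dict()

print(n_rows, n_cols, step, label_counts)
```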
The stack() and unstack() Functions in Pandas
We can reshape our data frame dates_data with two Pandas methods, stack() and unstack(). stack() moves the column labels into the innermost level of the row index, while unstack() moves the innermost level of the row index into the columns. Because dates_data has a single-level index, both calls return a Series with a two-level index; only the order of the levels differs.
We will move the column names value and value2 into the row index, so that each value in those columns becomes its own row entry.
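To make the difference easier to see, here is a minimal sketch on a two-row frame. The frame small and its row labels row_a and row_b are hypothetical and exist only for illustration; the point is the order of the index levels in each result.

```python
import pandas as pd

# Hypothetical two-row frame with the same column names as dates_data.
small = pd.DataFrame(
    {"value": [0, 1], "value2": ["Beta", "Alpha"]},
    index=["row_a", "row_b"],
)

# stack(): column names become the inner index level.
stacked = small.stack()
# unstack(): on a single-level index, column names become the outer level.
unstacked = small.unstack()

print(list(stacked.index))
print(list(unstacked.index))
```

Both results hold the same four entries; stack() groups them by row label, and unstack() groups them by column name.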
Using unstack() Function to Alter Our Data Frame
Command:
unstacked = dates_data.unstack()
print(unstacked)
Output:
value 2013-01-01 00:00:00 0
2013-01-01 00:30:00 1
2013-01-01 01:00:00 2
2013-01-01 01:30:00 3
2013-01-01 02:00:00 4
...
value2 2013-01-02 23:30:00 Alpha
2013-01-03 00:00:00 Alpha
2013-01-03 00:30:00 Alpha
2013-01-03 01:00:00 Alpha
2013-01-03 01:30:00 Alpha
Length: 200, dtype: object
The result is a Series with a two-level index: the column name (value or value2) is the outer level and the timestamp is the inner level, so our columns now appear as row entries.
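Once the column names sit in the index, entries can be selected by level. The following is a sketch under the same assumptions as the setup code (first 10 rows marked Beta); the names value_part and first_label are illustrative only.

```python
import pandas as pd

# Rebuild the frame from the article (first 10 rows as "Beta" is an assumption).
index = pd.date_range("2013-1-1", periods=100, freq="30min")
dates_data = pd.DataFrame({"value": list(range(100))}, index=index)
dates_data["value2"] = "Alpha"
dates_data.loc[dates_data.index[:10], "value2"] = "Beta"

unstacked = dates_data.unstack()

# Selecting on the outer level returns that column's entries, indexed by time.
value_part = unstacked["value"]
# A full (column name, timestamp) key returns a single entry.
first_label = unstacked.loc[("value2", pd.Timestamp("2013-01-01 00:00:00"))]

print(unstacked.index.nlevels, len(value_part), first_label)
```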
Using stack() Function to Alter Our Data Frame
Command:
# dates_data must be the original data frame here, not an unstacked Series
stacked = dates_data.stack()
print(stacked)
Output:
2013-01-01 00:00:00 value 0
value2 Beta
2013-01-01 00:30:00 value 1
value2 Beta
2013-01-01 01:00:00 value 2
...
2013-01-03 00:30:00 value2 Alpha
2013-01-03 01:00:00 value 98
value2 Alpha
2013-01-03 01:30:00 value 99
value2 Alpha
Length: 200, dtype: object
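The result is again a Series with a two-level index, but this time the timestamp is the outer level and the column name is the inner level, so the entries of value and value2 for each timestamp appear on consecutive rows.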
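Stacking can also be reversed: calling unstack() on the stacked result moves the inner level (the column names) back into the columns. The sketch below rebuilds the frame under the same assumptions as before; restored is an illustrative name, and the restored columns may come back with the object dtype because the stacked Series mixes numbers and strings.

```python
import pandas as pd

# Rebuild the frame from the article (first 10 rows as "Beta" is an assumption).
index = pd.date_range("2013-1-1", periods=100, freq="30min")
dates_data = pd.DataFrame({"value": list(range(100))}, index=index)
dates_data["value2"] = "Alpha"
dates_data.loc[dates_data.index[:10], "value2"] = "Beta"

# Column names move into the inner index level.
stacked = dates_data.stack()
# The inner level moves back into the columns, giving a data frame again.
restored = stacked.unstack()

print(restored.shape)
print(restored.dtypes)
```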
Therefore, with the help of stacking and unstacking in Pandas, we can reshape a data frame as needed, which often makes the data easier to inspect and visualize.