How to Unpivot a Python Pandas Dataframe
Python is a preferred language for data analysis due to its great data-centric packages ecosystem. One of these packages is Pandas, which significantly simplifies data import and analysis.
This article uses the Pandas dataframe melt()
function to convert a dataframe from wide to long format while optionally leaving identifier variables set.
Use the melt()
Function to Unpivot a Python Pandas Dataframe
This function can convert a dataframe into a format in which one or more columns are identifier variables (id vars). All other columns, considered measured variables (value vars), are unpivoted
to the row axis, leaving only two non-identifier columns, variable
and value
.
Syntax:
DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)
Parameters:
-
frame
: This is the DataFrame we use. -
id_vars
: This is the column we use as an identifier variable. -
value_vars
: This is used to unpivot. var_name
: This specifies a name for the variable column. If none is specified, the frame is used, variable, or columns name.-
value_name
: The value column should be given a unique name. -
col_level
: Use this level to melt the columns if they are MultiIndexes. Returns
: Convert a DataFrame into a format with one or more identifier variables in each column.
Example 1:
First, create a dataframe with columns A
, Bsc
, Cse
, and xyz
.
import pandas as pd
df = pd.DataFrame(
{
"A": [12, 4, 5, 44, 1],
"Bsc": [5, 2, 54, 3, 2],
"Cse": [20, 16, 7, 3, 8],
"xyz": [14, 3, 17, 2, 6],
}
)
df
Output:
| | A | Bsc | Cse | xyz |
| ---- | ---- | ---- | ---- | ---- |
| 0 | 12 | 5 | 20 | 14 |
| 1 | 4 | 2 | 16 | 3 |
| 2 | 5 | 54 | 7 | 17 |
| 3 | 44 | 3 | 3 | 2 |
| 4 | 1 | 2 | 8 | |
The below function unpivots the dataframe. Here A
is an identifier, and the Bsc
column is the value variable.
df.melt(id_vars=["A"], value_vars=["Bsc"])
Output:
| | A | variable | value |
| ---- | ---- | -------- | ----- |
| 0 | 12 | Bsc | 5 |
| 1 | 4 | Bsc | 2 |
| 2 | 5 | Bsc | 54 |
| 3 | 44 | Bsc | 3 |
| 4 | 1 | Bsc | 2 |
Example 2:
Set column A
as the identifier variable and columns Bsc
and Csc
as the value variables using the dataframe.melt()
function.
df.melt(
id_vars=["A"],
value_vars=["Bsc", "Cse"],
var_name="Variable_column",
value_name="Value_column",
)
Output:
| index | A | Variable_column | Value_column |
| ----- | ---- | --------------- | ------------ |
| 0 | 12 | Bsc | 5 |
| 1 | 4 | Bsc | 2 |
| 2 | 5 | Bsc | 54 |
| 3 | 44 | Bsc | 3 |
| 4 | 1 | Bsc | 2 |
| 5 | 12 | Cse | 20 |
| 6 | 4 | Cse | 16 |
| 7 | 5 | Cse | 7 |
| 8 | 44 | Cse | 3 |
| 9 | 1 | Cse | |
Conclusion
In this article, we discussed how to unpivot a dataframe or convert a dataframe into a long format to increase the number of rows.
For this purpose, we used the melt()
function. One column is designated as a variable identifier, while the other is the identifier’s value.