How to Set Columns as Index in Pandas DataFrame
- Use set_index() to Make Column as the Index in Pandas DataFrame
- Use the index_col Parameter in read_excel or read_csv to Set Column as the Index in Pandas DataFrame
By default, a Pandas DataFrame uses consecutive integers, from 0 to one less than the number of rows, as its index. We can also make a specific column of the DataFrame its index. For this, we can use the set_index() method provided in Pandas, or we can specify the index column while importing a DataFrame from an Excel or CSV file.
Use set_index() to Make Column as the Index in Pandas DataFrame
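The set_index() method is a DataFrame method that replaces the current index. As its key, it accepts a single column label, a list of column labels, or an array (a Series, an Index, or a NumPy array) of the same length as the DataFrame. Passing a list of column labels makes multiple columns the index, which produces a MultiIndex.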
Example:
import pandas as pd
colnames = ["Name", "Time", "Course"]
df = pd.DataFrame(
[["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]], columns=colnames
)
print(df)
Output:
   Name  Time  Course
0   Jay    10  B.Tech
1   Raj    12     BBA
2  Jack    11    B.Sc
The syntax for making a column the index:
dataframe.set_index(Column_name, inplace=True)
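The inplace=True argument in the syntax above changes what the call returns, which is easy to miss. The following is a minimal sketch, reusing the sample data from this article, that contrasts the two forms. The variable names indexed and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]],
    columns=["Name", "Time", "Course"],
)

# Without inplace=True, set_index() returns a new DataFrame
# and leaves the original df unchanged.
indexed = df.set_index("Name")
print(list(indexed.index))  # ['Jay', 'Raj', 'Jack']
print(list(df.columns))     # ['Name', 'Time', 'Course']

# With inplace=True, df itself is modified and the call returns None.
result = df.set_index("Name", inplace=True)
print(result)               # None
print(list(df.columns))     # ['Time', 'Course']
```

Assigning the returned DataFrame to a name is often easier to chain with other operations, while inplace=True suits scripts that only ever need the re-indexed version.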
Make a single column as the index using set_index():
import pandas as pd
colnames = ["Name", "Time", "Course"]
df = pd.DataFrame(
[["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]], columns=colnames
)
df.set_index("Name", inplace=True)
print(df)
Output:
      Time  Course
Name
Jay     10  B.Tech
Raj     12     BBA
Jack    11    B.Sc
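Once a column is the index, rows can be looked up by its labels. As a rough illustration with the same sample data, the sketch below reads one value with .loc and then uses reset_index() to turn the index back into an ordinary column. The names raj_course and restored are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]],
    columns=["Name", "Time", "Course"],
).set_index("Name")

# Look up a single value by index label and column name.
raj_course = df.loc["Raj", "Course"]
print(raj_course)  # BBA

# reset_index() moves the index back into a regular column
# and restores the default integer index.
restored = df.reset_index()
print(list(restored.columns))  # ['Name', 'Time', 'Course']
```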
Make multiple columns the index (this creates a MultiIndex):
import pandas as pd
colnames = ["Name", "Time", "Course"]
df = pd.DataFrame(
[["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]], columns=colnames
)
df.set_index(["Name", "Course"], inplace=True)
print(df)
Output:
             Time
Name Course
Jay  B.Tech    10
Raj  BBA       12
Jack B.Sc      11
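With two columns as the index, each row is identified by a pair of labels. The sketch below, again using the article's sample data, inspects the resulting index levels and selects one value with a tuple key. It uses the non-inplace form of set_index(), and the names multi and jay_time are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["Jay", 10, "B.Tech"], ["Raj", 12, "BBA"], ["Jack", 11, "B.Sc"]],
    columns=["Name", "Time", "Course"],
)
multi = df.set_index(["Name", "Course"])

# The index now has one level per column that was passed in.
print(multi.index.nlevels)      # 2
print(list(multi.index.names))  # ['Name', 'Course']

# A full row key is a tuple with one label per level.
jay_time = multi.loc[("Jay", "B.Tech"), "Time"]
print(jay_time)  # 10
```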
Use the index_col Parameter in read_excel or read_csv to Set Column as the Index in Pandas DataFrame
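While reading a DataFrame from an Excel or CSV file, we can specify which column to use as the index with the index_col parameter. It accepts a zero-based column position (read_csv also accepts a column name), so index_col=2 in the example below selects the third column, Course.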
Example:
import pandas as pd
df = pd.read_excel("data.xlsx", index_col=2)
print(df)
Output:
        Name  Time
Course
B.Tech  Mark    12
BBA     Jack    10
B.Sc     Jay    11
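The same parameter works with read_csv. The sketch below is self-contained: instead of a file on disk it reads from an in-memory string, and the CSV contents are made up to mirror the output shown above rather than taken from the author's data.xlsx. It shows index_col given as a position, as a column name, and as a list of names.

```python
import io

import pandas as pd

# Hypothetical CSV contents, assumed to resemble the Excel data above.
csv_text = """Name,Time,Course
Mark,12,B.Tech
Jack,10,BBA
Jay,11,B.Sc
"""

# index_col as a zero-based column position
by_position = pd.read_csv(io.StringIO(csv_text), index_col=2)

# index_col as a column name
by_name = pd.read_csv(io.StringIO(csv_text), index_col="Course")

# index_col as a list of names produces a MultiIndex
multi = pd.read_csv(io.StringIO(csv_text), index_col=["Name", "Course"])

print(by_position)
print(by_position.equals(by_name))  # True
print(multi.index.nlevels)          # 2
```

Referring to the column by name tends to be more robust than a position if the column order in the file might change.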
Manav is an IT professional with extensive experience as a core developer on many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.