How to Select Pandas DataFrame Columns
- Select Columns From a Pandas DataFrame Using Indexing Operation
- Select Columns From a Pandas DataFrame Using the DataFrame.drop() Method
- Select Columns From a Pandas DataFrame Using the DataFrame.filter() Method
This tutorial explains how to select columns from a Pandas DataFrame by indexing or by using the DataFrame.drop() and DataFrame.filter() methods.
We will use the DataFrame df below to demonstrate each way of selecting columns.
import pandas as pd

# Sample DataFrame with five integer columns, A through E
df = pd.DataFrame(
    {
        "A": [302, 504, 708, 103, 343, 565],
        "B": [100, 300, 400, 200, 400, 700],
        "C": [300, 400, 350, 100, 1000, 400],
        "D": [10, 15, 5, 0, 2, 7],
        "E": [4, 5, 6, 7, 8, 9],
    }
)
print(df)
Output:
     A    B     C   D  E
0  302  100   300  10  4
1  504  300   400  15  5
2  708  400   350   5  6
3  103  200   100   0  7
4  343  400  1000   2  8
5  565  700   400   7  9
Select Columns From a Pandas DataFrame Using Indexing Operation
import pandas as pd

df = pd.DataFrame(
    {
        "A": [302, 504, 708, 103, 343, 565],
        "B": [100, 300, 400, 200, 400, 700],
        "C": [300, 400, 350, 100, 1000, 400],
        "D": [10, 15, 5, 0, 2, 7],
        "E": [4, 5, 6, 7, 8, 9],
    }
)

# Pass a list of column names inside the indexing brackets
derived_df = df[["A", "C", "E"]]

print("The initial DataFrame is:")
print(df, "\n")
print("The DataFrame with A,C and E columns is:")
print(derived_df, "\n")
Output:
The initial DataFrame is:
     A    B     C   D  E
0  302  100   300  10  4
1  504  300   400  15  5
2  708  400   350   5  6
3  103  200   100   0  7
4  343  400  1000   2  8
5  565  700   400   7  9

The DataFrame with A,C and E columns is:
     A     C  E
0  302   300  4
1  504   400  5
2  708   350  6
3  103   100  7
4  343  1000  8
5  565   400  9
Passing a list of column names inside the indexing brackets selects the columns A, C, and E from the DataFrame df and assigns them to the derived_df DataFrame.
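As a small illustration of how the bracket form affects the result, the sketch below uses a shortened three-row, three-column version of df. The variable names single, one_col_df, and by_loc are illustrative and do not appear in the original example.

```python
import pandas as pd

# Shortened version of the tutorial's DataFrame (three rows, three columns)
df = pd.DataFrame(
    {
        "A": [302, 504, 708],
        "B": [100, 300, 400],
        "C": [300, 400, 350],
    }
)

# A single label inside one pair of brackets returns a Series
single = df["A"]

# A list of labels (double brackets) returns a DataFrame, even for one column
one_col_df = df[["A"]]

# .loc with a full row slice selects the same columns by label
by_loc = df.loc[:, ["A", "C"]]

print(type(single).__name__)      # Series
print(type(one_col_df).__name__)  # DataFrame
print(list(by_loc.columns))       # ['A', 'C']
```

If the result needs to stay a DataFrame, for example to pass it to code that expects one, the list form is usually the safer choice.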
Select Columns From a Pandas DataFrame Using the DataFrame.drop() Method
import pandas as pd

df = pd.DataFrame(
    {
        "A": [302, 504, 708, 103, 343, 565],
        "B": [100, 300, 400, 200, 400, 700],
        "C": [300, 400, 350, 100, 1000, 400],
        "D": [10, 15, 5, 0, 2, 7],
        "E": [4, 5, 6, 7, 8, 9],
    }
)

# axis=1 tells drop() that the labels refer to columns, not rows
derived_df = df.drop(["B", "D"], axis=1)

print("The initial DataFrame is:")
print(df, "\n")
print("The DataFrame with A,C and E columns is:")
print(derived_df, "\n")
Output:
The initial DataFrame is:
     A    B     C   D  E
0  302  100   300  10  4
1  504  300   400  15  5
2  708  400   350   5  6
3  103  200   100   0  7
4  343  400  1000   2  8
5  565  700   400   7  9

The DataFrame with A,C and E columns is:
     A     C  E
0  302   300  4
1  504   400  5
2  708   350  6
3  103   100  7
4  343  1000  8
5  565   400  9
It drops the columns B and D from the DataFrame df and assigns the remaining columns to the derived_df DataFrame. In other words, it selects every column except B and D. Because drop() returns a new DataFrame by default, df itself still contains all five columns, as the first table in the output shows.
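The same selection can also be written with the columns keyword of DataFrame.drop(). The sketch below is a minimal variation on a shortened two-row DataFrame. The names kept and kept_lenient, and the column label "Z", are hypothetical and used only for illustration.

```python
import pandas as pd

# Shortened version of the tutorial's DataFrame (two rows, four columns)
df = pd.DataFrame(
    {
        "A": [302, 504],
        "B": [100, 300],
        "C": [300, 400],
        "D": [10, 15],
    }
)

# columns= is equivalent to passing the labels together with axis=1
kept = df.drop(columns=["B", "D"])

# errors="ignore" skips labels that are not present instead of raising KeyError;
# "Z" is a hypothetical label that does not exist in df
kept_lenient = df.drop(columns=["B", "D", "Z"], errors="ignore")

print(list(kept.columns))          # ['A', 'C']
print(list(kept_lenient.columns))  # ['A', 'C']
print(list(df.columns))            # ['A', 'B', 'C', 'D'] (df is unchanged)
```

Dropping tends to read more naturally than indexing when the columns to exclude are fewer than the columns to keep.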
Select Columns From a Pandas DataFrame Using the DataFrame.filter() Method
import pandas as pd

df = pd.DataFrame(
    {
        "A": [302, 504, 708, 103, 343, 565],
        "B": [100, 300, 400, 200, 400, 700],
        "C": [300, 400, 350, 100, 1000, 400],
        "D": [10, 15, 5, 0, 2, 7],
        "E": [4, 5, 6, 7, 8, 9],
    }
)

# The list is taken as the column labels to keep
derived_df = df.filter(["A", "C", "E"])

print("The initial DataFrame is:")
print(df, "\n")
print("The DataFrame with A,C and E columns is:")
print(derived_df, "\n")
Output:
The initial DataFrame is:
     A    B     C   D  E
0  302  100   300  10  4
1  504  300   400  15  5
2  708  400   350   5  6
3  103  200   100   0  7
4  343  400  1000   2  8
5  565  700   400   7  9

The DataFrame with A,C and E columns is:
     A     C  E
0  302   300  4
1  504   400  5
2  708   350  6
3  103   100  7
4  343  1000  8
5  565   400  9
It keeps only the columns A, C, and E from the DataFrame df and assigns the result to the DataFrame derived_df.
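DataFrame.filter() also accepts the keywords items and regex, and it treats missing labels differently from plain indexing. The sketch below illustrates both points on a shortened two-row DataFrame. The names by_items, by_regex, with_missing, and raised, and the label "Z", are hypothetical.

```python
import pandas as pd

# Shortened version of the tutorial's DataFrame (two rows, five columns)
df = pd.DataFrame(
    {
        "A": [302, 504],
        "B": [100, 300],
        "C": [300, 400],
        "D": [10, 15],
        "E": [4, 5],
    }
)

# items= is the explicit keyword form of df.filter(["A", "C", "E"])
by_items = df.filter(items=["A", "C", "E"])

# regex= keeps the columns whose names match the pattern
by_regex = df.filter(regex="^[ACE]$")

# Labels in items= that do not exist are silently ignored;
# "Z" is a hypothetical label that is not in df
with_missing = df.filter(items=["A", "Z"])

# Plain indexing with a missing label raises KeyError instead
try:
    df[["A", "Z"]]
    raised = False
except KeyError:
    raised = True

print(list(by_items.columns))      # ['A', 'C', 'E']
print(list(by_regex.columns))      # ['A', 'C', 'E']
print(list(with_missing.columns))  # ['A']
print(raised)                      # True
```

This makes filter() convenient when some of the requested columns may be absent, while indexing is the stricter choice when a missing column should be treated as an error.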
Suraj Joshi is a backend software engineer at Matrice.ai.