Pandas DataFrame DataFrame.query() Function
-
Syntax of
pandas.DataFrame.query()
: -
Example Codes:
DataFrame.query()
Method With Single Condition -
Example Codes:
DataFrame.query()
Method When Column Name Has Whitespace -
Example Codes:
DataFrame.query()
Method With Multiple Conditions
pandas.DataFrame.query()
method filters the rows of the caller DataFrame using the given query expression.
Syntax of pandas.DataFrame.query()
:
DataFrame.query(expr, inplace=False, **kwargs)
Parameters
expr |
Query expression based upon which rows are filtered |
inplace |
Boolean. If True , modify the caller DataFrame in-place |
**kwargs |
Keyword arguments for the method |
Return
If inplace
is True
, it returns the filtered DataFrame
; otherwise None
.
Example Codes: DataFrame.query()
Method With Single Condition
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)
filtered_df=df.query('X>1')
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y
0 1 4
1 2 1
2 3 8
Filtered DataFrame:
X Y
1 2 1
2 3 8
It returns the DataFrame
with only the rows that satisfy the given query expression i.e. only the rows whose value in column X
is greater than 1.
Example Codes: DataFrame.query()
Method When Column Name Has Whitespace
We must make sure column names to be queried do not have any white spaces before applying this method to DataFrame
.
If we have column names with spaces in them, we could use backtick quoting (`).
import pandas as pd
df = pd.DataFrame(
{
"X": [
1,
2,
3,
],
"Y": [4, 1, 8],
"A B": [3, 5, 7],
}
)
print("Original DataFrame:")
print(df)
filtered_df = df.query("`A B`>5")
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y A B
0 1 4 3
1 2 1 5
2 3 8 7
Filtered DataFrame:
X Y A B
2 3 8 7
Here, the column A B
has space in its name. To make query expression for the column, we enclose the column name in backticks; otherwise, it will raise an error.
Example Codes: DataFrame.query()
Method With Multiple Conditions
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)
filtered_df=df.query('X>1' and 'Y==1')
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y
0 1 4
1 2 1
2 3 8
Filtered DataFrame:
X Y
1 2 1
If we wish to filter DataFrame
based on multiple conditions, we combine the multiple query expressions using and
operator to make a single composite query expression.
It gives the DataFrame
with rows whose value of column X
is greater than 1
, and value of column Y
equals 1
.
We can modify the original DataFrame after calling query()
method by setting inplace=True
.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
filtered_df=df.query('X>1' and 'Y==1',inplace=True)
print(df)
Output:
X Y
1 2 1
Suraj Joshi is a backend software engineer at Matrice.ai.
LinkedIn