Pandas DataFrame DataFrame.query() Function
-
Syntax of
pandas.DataFrame.query(): -
Example Codes:
DataFrame.query()Method With Single Condition -
Example Codes:
DataFrame.query()Method When Column Name Has Whitespace -
Example Codes:
DataFrame.query()Method With Multiple Conditions
pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression.
Syntax of pandas.DataFrame.query():
DataFrame.query(expr, inplace=False, **kwargs)
Parameters
expr |
Query expression based upon which rows are filtered |
inplace |
Boolean. If True, modify the caller DataFrame in-place |
**kwargs |
Keyword arguments for the method |
Return
If inplace is True, it returns the filtered DataFrame; otherwise None.
Example Codes: DataFrame.query() Method With Single Condition
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)
filtered_df=df.query('X>1')
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y
0 1 4
1 2 1
2 3 8
Filtered DataFrame:
X Y
1 2 1
2 3 8
It returns the DataFrame with only the rows that satisfy the given query expression i.e. only the rows whose value in column X is greater than 1.
Example Codes: DataFrame.query() Method When Column Name Has Whitespace
We must make sure column names to be queried do not have any white spaces before applying this method to DataFrame.
If we have column names with spaces in them, we could use backtick quoting (`).
import pandas as pd
df = pd.DataFrame(
{
"X": [
1,
2,
3,
],
"Y": [4, 1, 8],
"A B": [3, 5, 7],
}
)
print("Original DataFrame:")
print(df)
filtered_df = df.query("`A B`>5")
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y A B
0 1 4 3
1 2 1 5
2 3 8 7
Filtered DataFrame:
X Y A B
2 3 8 7
Here, the column A B has space in its name. To make query expression for the column, we enclose the column name in backticks; otherwise, it will raise an error.
Example Codes: DataFrame.query() Method With Multiple Conditions
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)
filtered_df=df.query('X>1' and 'Y==1')
print("Filtered DataFrame:")
print(filtered_df)
Output:
Original DataFrame:
X Y
0 1 4
1 2 1
2 3 8
Filtered DataFrame:
X Y
1 2 1
If we wish to filter DataFrame based on multiple conditions, we combine the multiple query expressions using and operator to make a single composite query expression.
It gives the DataFrame with rows whose value of column X is greater than 1, and value of column Y equals 1.
We can modify the original DataFrame after calling query() method by setting inplace=True.
import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
'Y': [4, 1, 8]})
filtered_df=df.query('X>1' and 'Y==1',inplace=True)
print(df)
Output:
X Y
1 2 1
Suraj Joshi is a backend software engineer at Matrice.ai.
LinkedIn