How to Query a Pandas DataFrame?

Pandas DataFrame – Query based on Columns

To filter DataFrame rows by a condition on one or more columns, use the pandas.DataFrame.query() method. The condition is passed as a string expression in which columns are referred to by name.

By default, query() returns a new DataFrame containing only the rows that satisfy the condition, and the original DataFrame is left unchanged. If you pass inplace=True, the original DataFrame is modified instead and query() returns None.
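As a minimal sketch of the default behaviour, the snippet below uses a small three-row frame that is made up purely for illustration (it is not one of the examples that follow). It suggests that a plain query() call hands back a new DataFrame and leaves the original as it was.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not from the examples below)
df = pd.DataFrame({'a': [10, 60, 80], 'b': [1, 2, 3]})

# Default call: the filtered rows come back as a new DataFrame
filtered = df.query('a > 50')

print(len(filtered))  # number of rows that satisfy the condition
print(len(df))        # the original still has all of its rows
```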

Examples

1. Query DataFrame with condition on a column

In this example, we query the DataFrame with a boolean expression on column a and get back a DataFrame containing only the rows where a is greater than 50.

Python Program

import pandas as pd

# Initialize a dataframe
df = pd.DataFrame(
	[[21, 72, 67],
	[23, 78, 62],
	[32, 74, 56],
	[73, 88, 67],
	[32, 74, 56],
	[43, 78, 69],
	[32, 74, 54],
	[52, 54, 76]],
	columns=['a', 'b', 'c'])

# Query single column
df1 = df.query('a>50')

# Print the dataframe
print(df1)

Output

    a   b   c
3  73  88  67
7  52  54  76
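For comparison, the same filter can be written with a boolean mask. The sketch below reuses the data from this example and checks that both forms select the same rows; the names by_query and by_mask are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[21, 72, 67], [23, 78, 62], [32, 74, 56], [73, 88, 67],
     [32, 74, 56], [43, 78, 69], [32, 74, 54], [52, 54, 76]],
    columns=['a', 'b', 'c'])

# Filter with a query string
by_query = df.query('a > 50')

# Filter with a boolean mask built from the column
by_mask = df[df['a'] > 50]

# Compare the two results (values, index labels and dtypes)
print(by_query.equals(by_mask))
```

The query string is usually shorter to read, while the mask form is handy when the condition is built up in Python code.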

2. Query DataFrame with condition on multiple columns using AND operator

In this example, we apply conditions on two columns and combine them with the and operator, so a row is kept only if both conditions are true.

Python Program

import pandas as pd

# Initialize a dataframe
df = pd.DataFrame(
	[[21, 72, 67],
	[23, 78, 62],
	[32, 74, 56],
	[73, 88, 67],
	[32, 74, 56],
	[43, 78, 69],
	[32, 74, 54],
	[52, 54, 76]],
	columns=['a', 'b', 'c'])

# Query multiple columns
df1 = df.query('a>30 and c>60')

# Print the dataframe
print(df1)

Output

    a   b   c
3  73  88  67
5  43  78  69
7  52  54  76
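The query string also accepts the & operator in place of and. The sketch below, on the same data, wraps each comparison in parentheses to keep the grouping explicit and checks that both spellings give the same rows; by_and and by_amp are illustrative names.

```python
import pandas as pd

df = pd.DataFrame(
    [[21, 72, 67], [23, 78, 62], [32, 74, 56], [73, 88, 67],
     [32, 74, 56], [43, 78, 69], [32, 74, 54], [52, 54, 76]],
    columns=['a', 'b', 'c'])

# Conditions joined with the keyword "and"
by_and = df.query('a > 30 and c > 60')

# Same conditions joined with "&", each one parenthesized
by_amp = df.query('(a > 30) & (c > 60)')

print(by_and.equals(by_amp))
```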

3. Query DataFrame with condition on multiple columns using OR operator

In this example, we apply conditions on two columns and combine them with the or operator, so a row is kept if at least one of the conditions is true.

Python Program

import pandas as pd

# Initialize a dataframe
df = pd.DataFrame(
	[[21, 72, 67],
	[23, 78, 62],
	[32, 74, 56],
	[73, 88, 67],
	[32, 74, 56],
	[43, 78, 69],
	[32, 74, 54],
	[52, 54, 76]],
	columns=['a', 'b', 'c'])

# Query multiple columns
df1 = df.query('a>50 or c>60')

# Print the dataframe
print(df1)

Output

    a   b   c
0  21  72  67
1  23  78  62
3  73  88  67
5  43  78  69
7  52  54  76
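Note that the output keeps the index labels of the original rows (0, 1, 3, 5, 7) rather than renumbering them. If a fresh 0-based index is wanted, one option is reset_index(drop=True), sketched below on the same data; renumbered is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame(
    [[21, 72, 67], [23, 78, 62], [32, 74, 56], [73, 88, 67],
     [32, 74, 56], [43, 78, 69], [32, 74, 54], [52, 54, 76]],
    columns=['a', 'b', 'c'])

result = df.query('a > 50 or c > 60')

# The filtered rows keep their labels from the original DataFrame
print(list(result.index))

# Replace the labels with a fresh 0-based index, discarding the old ones
renumbered = result.reset_index(drop=True)
print(list(renumbered.index))
```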

4. Query DataFrame with inplace parameter

We can pass inplace=True to filter the DataFrame we are working on directly, instead of getting a new DataFrame back.

Python Program

import pandas as pd

# Initialize a dataframe
df = pd.DataFrame(
	[[21, 72, 67],
	[23, 78, 62],
	[32, 74, 56],
	[73, 88, 67],
	[32, 74, 56],
	[43, 78, 69],
	[32, 74, 54],
	[52, 54, 76]],
	columns=['a', 'b', 'c'])

# Query dataframe with inplace=True
df.query('a>50 and c>60', inplace=True)

# Print the dataframe
print(df)

Output

    a   b   c
3  73  88  67
7  52  54  76
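One thing to watch with inplace=True is the return value. The sketch below, on the same data, captures what the call returns (the name returned is illustrative) to show why the result should not be assigned back to df.

```python
import pandas as pd

df = pd.DataFrame(
    [[21, 72, 67], [23, 78, 62], [32, 74, 56], [73, 88, 67],
     [32, 74, 56], [43, 78, 69], [32, 74, 54], [52, 54, 76]],
    columns=['a', 'b', 'c'])

# With inplace=True the call filters df itself and returns None
returned = df.query('a > 50 and c > 60', inplace=True)

print(returned)  # the return value of the in-place call
print(len(df))   # df now holds only the matching rows
```

Writing df = df.query(..., inplace=True) would therefore replace df with None, so the in-place form is best used as a standalone statement.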

Summary

In this Pandas Tutorial, we learned how to query a DataFrame with conditions applied on columns.
