How to use variable in a query in pandas

Suppose you want to reference a variable in a query in pandas package in Python. This seems to be a straightforward task but it becomes daunting sometimes. Let's discuss it with examples in the article below.

Let's create a sample dataframe having 3 columns and 4 rows. This dataframe is used for demonstration purpose.

import pandas as pd
df = pd.DataFrame({"col1" : range(1,5), 
                   "col2" : ['A A','B B','A A','B B'],
                   "col3" : ['A A','A A','B B','B B']
                   })
Sample Dataframe
Filter a value A A in column col2
In order to do reference of a variable in query, you need to use @.
Mention Value Explicitly
newdf = df.query("col2 == 'A A'")
Reference Method
myval1 = 'A A'
newdf = df.query("col2 == @myval1")
How to pass column name as a variable in query
Instead of filter value we are referring the column which we want to use for subetting or filtering.
myvar1 = 'col2'
newdf2 = df.query("{0} == 'A A'".format(myvar1))
{0} takes a value of variable myvar1.
"{0} == 'A A'".format(myvar1) returns "col2 == 'A A'"
Incase you want to pass multiple columns as variables in query. Here we are using columns col2 and col3.
myvar1 = 'col2'
myvar2 = 'col3'
newdf2 = df.query("{0} == 'A A' & {1} == 'B B'".format(myvar1, myvar2)) 
"{0} == 'A A' & {1} == 'B B'".format(myvar1, myvar2) is equivalent to "col2 == 'A A' & col3 == 'B B'"
How to handle space in column name
Let's rename column col2 by including a space in between for illustration purpose.
df.rename(columns={'col2':'col 2'}, inplace = True)
By using backticks `` you can pass a column which contains space.
myvar1 = '`col 2`'
newdf = df.query("{0} == 'A A'".format(myvar1))
Related Posts
Spread the Word!
Share
About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.

0 Response to "How to use variable in a query in pandas"

Post a Comment

Next → ← Prev
Ad-Blocker Detected