Python list comprehension : Learn by Examples

This tutorial covers how list comprehension works in Python. It includes many examples to help you get familiar with the concept, and by the end of this lesson you should be able to use it in your own projects.

What is list comprehension?

Python is an object-oriented programming language in which almost everything is treated consistently as an object. Python also supports functional programming, which resembles the mathematical way of approaching a problem: you pass inputs to a function and get the same output for the same input. Given a function f(x) = x², f(x) always returns the same result for the same value of x. Such a function has no "side effect", meaning it does not change any variable or object outside itself. A side effect is any change a piece of code makes to a mutable data structure or variable outside its own scope.

Functional programming also suits parallel computing, because functions without side effects do not share data or write to the same variable.
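To make the idea of a side effect concrete, here is a minimal sketch. The names square, square_and_log and log are made up for illustration and are not part of the tutorial.

```python
# Pure function: the result depends only on the argument.
def square(x):
    return x ** 2

# Hypothetical list that lives outside the function below.
log = []

# Impure function: it appends to the outer list, which is a side effect.
def square_and_log(x):
    log.append(x)  # modifies state outside the function
    return x ** 2

print(square(3), square(3))  # same input, same output: 9 9
square_and_log(3)
square_and_log(3)
print(log)                   # the outer list has changed: [3, 3]
```

Calling square( ) any number of times leaves the rest of the program untouched, while every call to square_and_log( ) changes log.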

List comprehension is a feature borrowed from functional programming that provides a concise way to create lists without writing an explicit for loop.

A list comprehension has the form [expression for item in list if condition]. The for clause iterates through each item of the list. The if clause filters the list and keeps only those items that meet the condition. The if clause is optional, so you can omit it when there is nothing to filter.

[i**3 for i in [1,2,3,4] if i>2] takes the items of the list [1,2,3,4] one by one and checks whether each is greater than 2. If it is, the item is cubed; if it is less than or equal to 2, it is skipped. The result is a list of the cubes of 3 and 4. Output : [27, 64]
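The short sketch below runs the same expression with and without the optional if clause, so you can see what the filter changes. The variable names numbers, cubes and all_cubes are chosen here for illustration.

```python
numbers = [1, 2, 3, 4]

# With the if clause: only items greater than 2 are cubed.
cubes = [i**3 for i in numbers if i > 2]

# Without the if clause: every item is cubed.
all_cubes = [i**3 for i in numbers]

print(cubes)      # [27, 64]
print(all_cubes)  # [1, 8, 27, 64]
```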

List Comprehension vs. For Loop vs. Lambda + map()

These three approaches are different styles of iterating over the elements of a list, but they serve the same purpose and return the same output. The differences between them are shown below.

1. List comprehension is more readable than For Loop and Lambda function.

List Comprehension

[i**2 for i in range(2,10)]

For Loop

sqr = [] 
for i in range(2,10):
    sqr.append(i**2)
sqr

Lambda + Map

list(map(lambda i: i**2, range(2, 10)))

Output
[4, 9, 16, 25, 36, 49, 64, 81]

A list comprehension performs the loop and collects the items into a list in a single line of code. It is often easier to read than the equivalent for loop or lambda.

range(2,10) returns 2 through 9 (excluding 10).

**2 squares a number (raises it to the power of 2). sqr = [] creates an empty list. append( ) adds the result of each iteration of the for loop (i.e. the squared value) to the list.

map( ) applies the lambda function to each item of the iterable (here, the range). Wrap it in list( ) to get a list as output, because map( ) returns an iterator in Python 3.

2. List comprehension is slightly faster than For Loop and Lambda function.

Here we measure the execution time of these 3 methods. Each one squares the values from 1 up to (but not including) 10 million, but only when the value is even (divisible by 2).
The condition if x%2 == 0 checks whether a number is even. For example, 5 % 2 returns 1, which is the remainder. If the remainder is 0, the number is even.

List Comprehension

l1= [x**2 for x in range(1, 10**7) if x % 2 == 0]

# Processing Time : 3.96 seconds

For Loop

sqr = [] 
for x in range(1, 10**7):
    if x%2 == 0:
        sqr.append(x**2)

# Processing Time : 5.46 seconds

Lambda + Map()

l0 = list(map(lambda x: x**2, filter(lambda x: x%2 == 0, range(1, 10**7))))

# Processing Time : 5.32 seconds

filter(lambda x: x%2 == 0, range(1, 10**7)) yields the even numbers from 1 up to (but not including) 10 raised to the power 7, as filter() is used to subset items from an iterable. Roughly, you can think of filter() as the WHERE clause of SQL.
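The processing times quoted above depend on the machine and the Python version, so it is worth measuring them yourself. The following is one possible way to do that with the standard timeit module. It is a sketch, not the method used to produce the numbers above: the function names are made up, and N is set to 10**5 instead of 10**7 so that it finishes quickly.

```python
import timeit

N = 10**5  # assumption: smaller than the 10**7 used above, to keep the run short

def with_comprehension():
    return [x**2 for x in range(1, N) if x % 2 == 0]

def with_loop():
    sqr = []
    for x in range(1, N):
        if x % 2 == 0:
            sqr.append(x**2)
    return sqr

def with_map_filter():
    return list(map(lambda x: x**2, filter(lambda x: x % 2 == 0, range(1, N))))

# Run each version 10 times and report the total elapsed time.
for func in (with_comprehension, with_loop, with_map_filter):
    seconds = timeit.timeit(func, number=10)
    print(f"{func.__name__}: {seconds:.3f} s for 10 runs")
```

All three functions return the same list, so any difference in the printed times comes from the iteration style alone.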

List Comprehension : IF-ELSE

Here we tell Python to convert each string in the list to uppercase if its length is greater than 4, and to lowercase otherwise. upper( ) converts a string to uppercase; similarly, lower( ) converts it to lowercase.

List Comprehension

mylist = ['Dave', 'Micheal', 'Deeps']
[x.upper() if len(x)>4 else x.lower() for x in mylist]

For Loop

k = [] 
for x in mylist:
    if len(x) > 4:
        k.append(x.upper())
    else:
        k.append(x.lower())
k
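The position of the if matters. The sketch below contrasts if-else placed before the for clause (every item is kept and only transformed) with a plain if placed after it (items are filtered out). The names transformed and filtered are chosen here for illustration.

```python
mylist = ['Dave', 'Micheal', 'Deeps']

# if-else before "for": every item is kept, only the expression changes.
transformed = [x.upper() if len(x) > 4 else x.lower() for x in mylist]

# if after "for": items that fail the condition are dropped.
filtered = [x.upper() for x in mylist if len(x) > 4]

print(transformed)  # ['dave', 'MICHEAL', 'DEEPS']
print(filtered)     # ['MICHEAL', 'DEEPS']
```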

Filtering Dictionary using List Comprehension

Suppose you have a dictionary and you want to select particular keys and specific values. In short, you want to apply an if condition to subset or filter the dictionary.

d = {'a': [1,2,1], 'b': [3,4,1], 'c': [5,6,2]}

Filter dictionary by values

Here we select the values stored under key b that are greater than 2.

[x for x in d['b'] if x >2]

Output
[3, 4]

Filter dictionary by multiple conditions

Here we apply two conditions: select only those pairs of values where a is equal to 1 and b is greater than 1.

[(x,y) for x, y in zip(d['a'],d['b']) if x == 1 and y > 1]

Output
[(1, 3)]

In the above program, x takes its values from d['a'] and y from d['b']; zip( ) pairs them up position by position.
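To see why only one pair survives, it helps to look at what zip( ) produces before any condition is applied. The names pairs and matches below are made up for this sketch.

```python
d = {'a': [1, 2, 1], 'b': [3, 4, 1], 'c': [5, 6, 2]}

# zip( ) pairs the values of a and b by position.
pairs = list(zip(d['a'], d['b']))
print(pairs)    # [(1, 3), (2, 4), (1, 1)]

# Keep only the pairs where a == 1 and b > 1.
matches = [(x, y) for x, y in pairs if x == 1 and y > 1]
print(matches)  # [(1, 3)]
```

The pair (1, 1) is dropped because its second value is not greater than 1, and (2, 4) is dropped because its first value is not 1.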

Filter dictionary where all values are greater than 1

all( ) returns True only if the condition holds for every value in the list, and False otherwise.

k refers to keys and v refers to values of dictionary.

[(k,v) for k,v in d.items() if all(x > 1 for x in v) ]

Output
[('c', [5, 6, 2])]

Only key c has values that are all greater than 1. Similarly, the any( ) function returns True if at least one of the values meets the condition.

[(k,v) for k,v in d.items() if any(x > 2 for x in v) ]

Output
[('b', [3, 4, 1]), ('c', [5, 6, 2])]
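The examples above return a list of (key, value) tuples. If you would rather get a dictionary back, Python offers a closely related syntax, the dictionary comprehension, which uses curly braces and a key: value expression. It is not covered elsewhere in this tutorial and is shown here only as a sketch; the name filtered is made up.

```python
d = {'a': [1, 2, 1], 'b': [3, 4, 1], 'c': [5, 6, 2]}

# Same condition as above, but the result stays a dictionary.
filtered = {k: v for k, v in d.items() if any(x > 2 for x in v)}

print(filtered)  # {'b': [3, 4, 1], 'c': [5, 6, 2]}
```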

Using List Comprehension on Pandas DataFrame

In the real world, data is generally stored in CSV files or relational databases. We usually load it into a pandas dataframe and then clean and manipulate it, so it is important to learn how to use list comprehension on a dataframe. Suppose you have a dataframe of employees with their names and their current and previous salaries. Let's create a pandas dataframe for illustration.

import pandas as pd
df = pd.DataFrame({'name': ['Sandy', 'Sam', 'Wright', 'Atul'], 
        'prevsalary': [71, 65, 64, 90],                   
        'nextsalary': [75, 80, 61, 89]})
df

     name  prevsalary  nextsalary
0   Sandy          71          75
1     Sam          65          80
2  Wright          64          61
3    Atul          90          89

We need to create a column called 'Flag' whose value is either 'High Bracket' or 'Low Bracket'. If the value in column prevsalary is greater than 70, the row should be labelled 'High Bracket'. Otherwise it should be labelled 'Low Bracket'.

df['Flag'] = ["High Bracket" if x > 70 else "Low Bracket" for x in df['prevsalary']]

The above line of code produced a new column in the df dataframe.

     name  prevsalary  nextsalary          Flag
0   Sandy          71          75  High Bracket
1     Sam          65          80   Low Bracket
2  Wright          64          61   Low Bracket
3    Atul          90          89  High Bracket

How to perform operation on multiple columns?

Suppose you want to compare two variables (columns), prevsalary and nextsalary. If nextsalary > prevsalary, categorize the row as "Increase". Otherwise populate it with "Decrease".

df['Flag2'] = ["Increase" if x > y  else "Decrease" for (x,y) in zip(df['nextsalary'],df['prevsalary'])]

     name  prevsalary  nextsalary          Flag     Flag2
0   Sandy          71          75  High Bracket  Increase
1     Sam          65          80   Low Bracket  Increase
2  Wright          64          61   Low Bracket  Decrease
3    Atul          90          89  High Bracket  Decrease

Here the list comprehension compares two variables, x and y, where x takes its values from df['nextsalary'] and y from df['prevsalary'].

Convert character variable to integer

Suppose you have a column that contains numeric values but is defined as a character variable (object column type), and you need to convert it to integer. This is a common problem when you read data from an online portal, or when some columns of a table are stored in char or varchar format in a relational database. Let's create a dataframe as an example.

import pandas as pd
df = pd.DataFrame({"col1" : ['1', '2', '3']})

df['col2'] = [int(x) for x in df['col1']]

The column types (as reported by df.dtypes) are shown below. col2 has an integer type.

col1    object
col2     int64

This can also be done without a list comprehension, as pandas has a built-in method for it: df['col2'] = df['col1'].astype(int)
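Real data often contains values that cannot be converted, and int( ) raises a ValueError on them. One case where a list comprehension is handy is when you want to decide what happens to such values. The sketch below uses a plain Python list instead of a dataframe column to stay self-contained. The helper to_int_or_none and the sample values are hypothetical, and the isdigit( ) check is a simplification that treats negative numbers such as '-5' as non-numeric.

```python
# Hypothetical raw values, including one that is not a number.
raw = ['1', '2', 'n/a', ' 3 ']

def to_int_or_none(text):
    """Return the integer value of text, or None if it is not all digits."""
    text = text.strip()
    return int(text) if text.isdigit() else None

converted = [to_int_or_none(x) for x in raw]
print(converted)  # [1, 2, None, 3]
```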

Nested List Comprehension

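A nested list comprehension is equivalent to multiple nested for loops. Its general form is [expression for outer_item in outer_list for inner_item in outer_item if condition], with the for clauses written in the same order as the equivalent nested loops.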

Suppose you have a list of lists and you want to select only the odd numbers. Any number not divisible by 2 is an odd number.

List Comprehension

mat = [[1,2], [3,4], [5,6]]
[x for row in mat for x in row if x%2==1]

Multiple For-Loop

b = []
for row in mat:
    for x in row:
        if x%2 == 1:
            b.append(x)

b

If you compare the two versions above, you will find them very similar: the for and if clauses appear in the same order in both.

Output
[1, 3, 5]
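The example above flattens the result into a single list. If you want to keep the row structure instead, you can place one comprehension inside another. This variation is a sketch added for comparison; the names flat_odds and odds_by_row are made up.

```python
mat = [[1, 2], [3, 4], [5, 6]]

# Two for clauses in one comprehension: the result is a flat list.
flat_odds = [x for row in mat for x in row if x % 2 == 1]

# A comprehension inside a comprehension: one inner list per row.
odds_by_row = [[x for x in row if x % 2 == 1] for row in mat]

print(flat_odds)    # [1, 3, 5]
print(odds_by_row)  # [[1], [3], [5]]
```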

List Comprehension on dictionaries within list

Suppose you have a list that contains multiple dictionaries, and you want to filter on values and select specific keys.

mylist = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, {'a': 5, 'b': 6}]

In the following code, we select only key a and keep its values where they are greater than 1. The first if makes sure the key exists before the second if reads it.

[i['a'] for i in mylist if 'a' in i if i['a'] > 1]

Output
[3, 5]
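Two if clauses in a row behave like a single condition joined with and. The sketch below shows three equivalent ways to write the filter, using a hypothetical list called records in which one dictionary has no key a, so that the existence check actually matters.

```python
# Hypothetical data: the second dictionary has no key 'a'.
records = [{'a': 1, 'b': 2}, {'b': 4}, {'a': 5, 'b': 6}]

# Two chained if clauses, as in the example above.
chained = [i['a'] for i in records if 'a' in i if i['a'] > 1]

# The same test written as one condition with "and".
combined = [i['a'] for i in records if 'a' in i and i['a'] > 1]

# dict.get( ) with a default of 0 avoids the separate existence check.
with_get = [i['a'] for i in records if i.get('a', 0) > 1]

print(chained, combined, with_get)  # [5] [5] [5]
```

The version with get( ) only works here because the default 0 fails the test i.get('a', 0) > 1; with a different condition you would need a different default.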

How to create tuples from lists

The idea is to build all possible combinations by using multiple for clauses in a list comprehension.

l1 = ['a','b']
l2 = ['c','d']
[(x,y) for x in l1 for y in l2]

Output
[('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]
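For reference, the standard library function itertools.product produces the same combinations, in the same order. It is shown here only as an alternative; the names pairs and from_product are made up for the sketch.

```python
from itertools import product

l1 = ['a', 'b']
l2 = ['c', 'd']

# Two for clauses: the second list is looped over for each item of the first.
pairs = [(x, y) for x in l1 for y in l2]

# itertools.product yields the same tuples.
from_product = list(product(l1, l2))

print(pairs == from_product)  # True
```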

Split Sentences into words with list comprehension

In text mining, breaking sentences into words is one of the first data cleaning steps. Before splitting, make sure all words are converted to either lowercase or uppercase, so that the same word does not appear in several forms.

text = ["Life is beautiful", "No need to overthink", "Meditation help in overcoming depression"]

[word for sentence in text for word in sentence.lower().split(' ')]

['life',
 'is',
 'beautiful',
 'no',
 'need',
 'to',
 'overthink',
 'meditation',
 'help',
 'in',
 'overcoming',
 'depression']

How does it work?

for sentence in text loops through each sentence of the list text.
text[0].lower().split(' ') converts the first sentence to lowercase, separates it into words and returns ['life', 'is', 'beautiful'].
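One detail to watch: split(' ') splits on every single space, so leading, trailing or repeated spaces produce empty strings. Calling split( ) with no argument splits on any run of whitespace and avoids this. The sketch below uses deliberately untidy sample sentences (made up for this illustration) to show the difference.

```python
# Hypothetical input with a double space and leading/trailing spaces.
text = ["Life  is beautiful", " No need to overthink "]

# split(' ') keeps empty strings wherever spaces are doubled or at the edges.
with_space = [word for sentence in text for word in sentence.lower().split(' ')]

# split() with no argument collapses runs of whitespace.
with_default = [word for sentence in text for word in sentence.lower().split()]

print(with_space)
# ['life', '', 'is', 'beautiful', '', 'no', 'need', 'to', 'overthink', '']
print(with_default)
# ['life', 'is', 'beautiful', 'no', 'need', 'to', 'overthink']
```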

Exercises for practice

The following exercises will help you gain hands-on experience.

1. Remove the words is, in, to and no from the text list. The desired output is as follows:

['life',
 'beautiful',
 'need',
 'overthink',
 'meditation',
 'help',
 'overcoming',
 'depression']

2. Find matching numbers in the lists.

x = [51, 24, 32, 41]
y = [42, 32, 41, 50]

Output should be [32, 41]

3. If an element of the list is between 30 and 45, make it 1, else 0.

x = [51, 24, 32, 41]

Output : [0, 0, 1, 1]

About Author:
Deepanshu Bhalla

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 10 years of experience in data science. During his tenure, he worked with global clients in various domains like Banking, Insurance, Private Equity, Telecom and HR.
