Chapter 10 Comprehensions
Now that we’ve seen iterations in action, let’s take a look at how list and dict comprehensions make our lives easier.
10.1 List comprehensions
List comprehensions are used to create lists from other lists, DataFrame columns and other data containers. Comprehensions are useful and common as they allow you to rapidly iterate over sets of objects. They allow us to perform complex operations in a single line of code.
For example, this for loop is pretty tedious:
vals = [6, 8, 4, 2, 5, 6, 7, 3, 5]
vals
## [6, 8, 4, 2, 5, 6, 7, 3, 5]
new_vals = []
for num in vals:
    new_vals.append(num + 10)
new_vals
## [16, 18, 14, 12, 15, 16, 17, 13, 15]
A much easier way is to use a list comprehension:
new_vals2 = [num + 10 for num in vals]
new_vals2
## [16, 18, 14, 12, 15, 16, 17, 13, 15]
Can you see the direct relationship between the above list comprehension and the for loop?
Well, to be honest, in this case you would just use a NumPy array.
import numpy as np
vals = np.array([6, 8, 4, 2, 5, 6, 7, 3, 5])
vals
## array([6, 8, 4, 2, 5, 6, 7, 3, 5])
vals + 10
## array([16, 18, 14, 12, 15, 16, 17, 13, 15])
But that will not work for everything you want to do. Comprehensions are not just for lists; they work over any iterable. Remember that a range object is iterable:
# like a range object
[num + 10 for num in range(10)]
## [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
Here there is no need to convert to a NumPy array first. To write a list comprehension, we need three things:
- An iterable
- An iterator variable to represent the members of the iterable
- The output expression
List comprehensions use the following syntax: [[output expression] for iterator variable in iterable].
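To make the three parts concrete, here is a minimal sketch that labels each one. The names `words` and `lengths` are made up for this illustration.

```python
# Hypothetical data for illustration
words = ['list', 'dict', 'generator']

# iterable:           words
# iterator variable:  w
# output expression:  len(w)
lengths = [len(w) for w in words]
print(lengths)
## [4, 4, 9]
```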
This also works for nested for loops. Take a look at this typical example:
pairs_1 = []
for num1 in range(0,2):
    for num2 in range(6,8):
        pairs_1.append((num1, num2))
pairs_1
## [(0, 6), (0, 7), (1, 6), (1, 7)]
It’s better as a single list comprehension:
pairs_2 = [(num1, num2) for num1 in range(0,2) for num2 in range(6,8)]
pairs_2
## [(0, 6), (0, 7), (1, 6), (1, 7)]
It is a little less readable at first, but once you get the hang of it, it’s a nice syntax. For example:
[i**2 for i in range(0,10)]
## [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Let’s look at a more interesting example, with a matrix. Remember that a matrix is just a 2-dimensional NumPy array. In plain Python, a matrix can be represented as a list of lists, all having the same type:
matrix = [[0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4]]
Can we produce this using nested list comprehensions?
You can create one row of the matrix with a single list comprehension. To create the list of lists, you then supply that list comprehension as the output expression of an outer list comprehension.
That is, the output expression in the generic syntax [[output expression] for iterator variable in iterable] is itself a list comprehension.
Here’s the nested for loop solution:
matrix = []
for i in range(5):
    # Append an empty sublist inside the list
    matrix.append([])
    for j in range(5):
        matrix[i].append(j)
matrix
## [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
And as a list comprehension:
# Nested list comprehension
matrix = [[j for j in range(5)] for i in range(5)]
# matrix = [[col for col in range(5)] for row in range(5)]
print(matrix)
## [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
So that is certainly cleaner code, and you can appreciate why list comprehensions are so popular.
10.2 Using conditionals
In the chapter on Logical Expressions, we saw how relational and logical operators produce Boolean values. Here, we can use them to filter the items that a comprehension keeps, using the following syntax:
[output expression for iterator variable in iterable if predicate expression]
For example:
[num ** 2 for num in range(10) if num % 2 == 0]
## [0, 4, 16, 36, 64]
Get only long names: from the cities list below, extract only those cities with long names, over 6 characters long.
cities = ['Munich', 'Paris', 'Amsterdam', 'Madrid', 'Istanbul']
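One possible solution, offered as a sketch rather than the only answer, puts a `len()` check in the predicate expression. The names `city` and `long_names` are chosen here for illustration.

```python
cities = ['Munich', 'Paris', 'Amsterdam', 'Madrid', 'Istanbul']

# Keep only the cities whose names have more than 6 characters
long_names = [city for city in cities if len(city) > 6]
print(long_names)
## ['Amsterdam', 'Istanbul']
```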
Recall the modulo operator, which we used to check whether a value is even. We can also put a conditional in the output expression itself:
# Square the even values and replace the odd ones with 0.
[num ** 2 if num % 2 == 0 else 0 for num in range(10)]
## [0, 0, 4, 0, 16, 0, 36, 0, 64, 0]
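The two placements of `if` behave differently, which the following sketch puts side by side. The variable names `nums`, `filtered` and `replaced` are made up for this comparison.

```python
nums = range(10)

# Filter: an `if` after the iterable drops the odd numbers entirely
filtered = [num ** 2 for num in nums if num % 2 == 0]

# Conditional output: `if ... else` before `for` keeps every element,
# replacing the squares of odd numbers with 0
replaced = [num ** 2 if num % 2 == 0 else 0 for num in nums]

print(len(filtered), filtered)
## 5 [0, 4, 16, 36, 64]
print(len(replaced), replaced)
## 10 [0, 0, 4, 0, 16, 0, 36, 0, 64, 0]
```

A filter can change the length of the result, while a conditional output expression always returns one item per element of the iterable.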
10.3 Dictionary comprehensions
In addition to lists, we can also use dictionary comprehensions. There are two key differences:
- Use {}, not [].
- The key and value are separated by a : in the output expression.
pos_neg = {num: -num for num in range(10)}
pos_neg
## {0: 0, 1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7, 8: -8, 9: -9}
squares = {num: num**2 for num in range(10)}
squares
## {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
cities = ['Munich', 'Paris', 'Amsterdam', 'Madrid', 'Istanbul']
{names: len(names) for names in cities}
## {'Munich': 6, 'Paris': 5, 'Amsterdam': 9, 'Madrid': 6, 'Istanbul': 8}
10.3.1 Generators
Generator expressions are like list comprehensions, except that they don’t store the whole result in memory; values are produced one at a time, as they are requested. This is practical when working with large data sets. The only difference in notation is that you use () instead of [].
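As a minimal sketch of the difference, the example below builds the same squares once as a list and once as a generator expression. The names `squares_list` and `squares_gen` are chosen here for illustration.

```python
# A list comprehension builds the whole list in memory right away
squares_list = [num ** 2 for num in range(10)]

# A generator expression uses () and produces values only on request
squares_gen = (num ** 2 for num in range(10))

print(next(squares_gen))  # first value
## 0
print(next(squares_gen))  # second value
## 1
print(sum(squares_gen))   # consumes the remaining eight values
## 284
```

Note that a generator can only be consumed once: after the call to `sum()`, `squares_gen` is exhausted, whereas `squares_list` can be reused as often as needed.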
10.4 Wrap-up
So far in our journey in iterations we’ve seen:
- Iterators
- Associated iteration functions
- Generators (very briefly)
- List and dict comprehensions