jkokorian - 2 months ago
Python Question

How does the list comprehension to flatten a python list work?

I recently looked for a way to flatten a nested python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].

Stack Overflow was as helpful as ever, and I found a post with this ingenious list comprehension:

l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]


I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that besides the comprehension above, this also runs (although it doesn't give the same result):

exactly_the_same_as_l = [item for item in sublist for sublist in l]


Can someone explain how Python interprets these things? Based on the second comprehension, I would expect that Python interprets it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!

Answer

Let's take a look at your list comprehension then, but first let's start with a list comprehension at its simplest.

l = [1,2,3,4,5]
print([x for x in l]) # prints [1, 2, 3, 4, 5]

The comprehension walks over l the same way this for loop does, except that it collects each x into a new list instead of printing it:

for x in l:
    print(x)

Now let's look at another one:

l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print(a) # prints [2, 4]

That is exactly equivalent to this:

a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print(a) # prints [2, 4]
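The same mapping holds when the expression at the front is more than a bare variable. Here is a minimal sketch in Python 3 syntax; the names numbers, even_squares and even_squares_loop are made up for this illustration. Whatever sits in front of the first for is what gets passed to append:

```python
numbers = [1, 2, 3, 4, 5]

# Comprehension form: expression first, then the for clause, then the filter.
even_squares = [x * x for x in numbers if x % 2 == 0]

# Loop form: the same clauses read top to bottom,
# with the leading expression moved into append().
even_squares_loop = []
for x in numbers:
    if x % 2 == 0:
        even_squares_loop.append(x * x)

print(even_squares)       # [4, 16]
print(even_squares_loop)  # [4, 16]
```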

Now let's take a look at the examples you provided.

l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print(flattened_l) # prints [1, 2, 3, 4, 5, 6]

To read a list comprehension, start at the leftmost for clause and work your way to the right: each later for clause is nested inside the one before it. The expression at the very front, item in this case, is what gets appended. That produces this equivalent:

l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
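The left-to-right rule also covers if clauses mixed in between the for clauses. The following is a sketch in Python 3 syntax, with made-up data and made-up names (nested, flat_evens, expanded), showing that every clause nests inside the one written before it:

```python
nested = [[1, 2, 3], [], [4, 5, 6]]

# Clauses in order: for, if, for, if.
flat_evens = [item
              for sublist in nested
              if sublist              # skip empty sublists
              for item in sublist
              if item % 2 == 0]       # keep only even items

# The same clauses as statements, each nested inside the previous one.
expanded = []
for sublist in nested:
    if sublist:
        for item in sublist:
            if item % 2 == 0:
                expanded.append(item)

print(flat_evens)  # [2, 4, 6]
print(expanded)    # [2, 4, 6]
```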

Now for the last one:

exactly_the_same_as_l = [item for item in sublist for sublist in l]

Using the same knowledge, we can create a for loop and see how it would behave:

exactly_the_same_as_l = []
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)

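The only reason the comprehension above runs at all is scoping. In Python 2, a list comprehension's loop variables leak into the enclosing scope, so creating flattened_l left sublist bound to the last sublist, [4,5,6]. The leftmost for clause evaluates its iterable once, before the inner clause rebinds sublist, so item iterates over [4,5,6] and each item is appended once per element of l, giving [4,4,5,5,6,6]. If you ran it without creating flattened_l first, you would get a NameError. In Python 3, comprehensions have their own scope, so sublist does not leak and you get the NameError unless sublist is defined somewhere else.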
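The scoping point can be checked directly. This is a sketch in Python 3 syntax, where comprehension variables do not leak, so the leftover sublist from Python 2 is emulated by binding it by hand. The function name reversed_clauses and the variables nested, outcome and leaked_result are made up for this illustration:

```python
nested = [[1, 2, 3], [4, 5, 6]]

def reversed_clauses(rows):
    # The leftmost iterable, sublist, is looked up before any loop has bound it.
    return [item for item in sublist for sublist in rows]

# At this point no name sublist exists, so the lookup fails.
try:
    reversed_clauses(nested)
    outcome = "no error"
except NameError:
    outcome = "NameError"

# Emulate what Python 2 left behind after building flattened_l:
# sublist still bound to the last sublist.
sublist = [4, 5, 6]
leaked_result = [item for item in sublist for sublist in nested]

print(outcome)        # NameError
print(leaked_result)  # [4, 4, 5, 5, 6, 6]
```

Each item of [4, 5, 6] shows up twice because the inner for clause runs once per element of nested and appends the same item each time.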