Arushi Arushi - 6 months ago 8
Python Question

Identify runs of repeated list items

I have a sequence of items in a list. I want to identify runs of identical elements and print their starting and ending locations. For example, with:

content=[c,c,c,c,f,f,f,f,c,c,b,b,b,b...]


I want the output to be like:

1-4 c
5-8 f
9-10 c


and so on and so forth. Here's what I have so far:

x=len(content)-1
i=0
y=0
z=0
for i in range(0,x):
    if(content[i]==content[i+1]):
        y=y+1
        z=i-1
    else:
        print y
        print content[z]

Answer

The first problem is not your if and else, but rather how you're looping. You seem to be missing a call to range, and only have the call's arguments. Try:

for i in range(0, x):

The if and else blocks will now be reached, and you just need to adjust them to track the values you care about. If you want the start and end of consecutive runs of items, you don't actually need both branches. Try this:

run_start = 0
for i in range(len(content)-1):
    if content[i] != content[i+1]: # only one branch needed, nothing to do when items are ==
        print "{}-{} {}".format(run_start+1, i+1, content[i])
        run_start = i+1
print "{}-{} {}".format(run_start+1, len(content), content[-1]) # extra code for the last run

This will print out ranges like 3-3 if there's only one item in a run. If you don't want that, you may need to add another if statement to check that i and run_start are not equal (and either print something else, or skip that run if they are).
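
Here is a minimal sketch of that extra check, assuming a single-item run should print as a lone position (for example `11 b`) rather than `11-11 b`. The names `format_run` and `describe_runs` and the sample `content` list are made up for illustration, and the guard for an empty list is an addition of mine. It uses the `print()` call form so the snippet also runs on Python 3:

```python
def format_run(start, end, item):
    """Format a run given 0-based start/end indexes as 1-based text."""
    if start == end:
        # single-item run: show one position instead of e.g. "3-3"
        return "{} {}".format(start + 1, item)
    return "{}-{} {}".format(start + 1, end + 1, item)


def describe_runs(content):
    """Return one formatted line per run of identical adjacent items."""
    lines = []
    if not content:
        return lines  # nothing to report for an empty list
    run_start = 0
    for i in range(len(content) - 1):
        if content[i] != content[i + 1]:
            lines.append(format_run(run_start, i, content[i]))
            run_start = i + 1
    # the last run is never closed inside the loop, so handle it here
    lines.append(format_run(run_start, len(content) - 1, content[-1]))
    return lines


# hypothetical sample data
content = ["c", "c", "c", "c", "f", "f", "f", "f", "c", "c", "b"]
for line in describe_runs(content):
    print(line)
```

With this sample list, the final run is the single `b` at position 11, so it goes through the `start == end` branch.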

I find that it's very helpful to use meaningful variable names, where possible. In this case, I'm using run_start rather than y or x for that reason.
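
If you'd rather not track indexes by hand, the standard library's `itertools.groupby` groups adjacent equal items for you. This is a different technique from the loop above, shown only as a rough sketch. The function name `find_runs` and the sample list are my own assumptions, and positions are 1-based to match the output format in the question:

```python
from itertools import groupby


def find_runs(content):
    """Yield (first_position, last_position, item) for each run, 1-based."""
    position = 1
    for item, group in groupby(content):
        run_length = sum(1 for _ in group)  # consume the group to count it
        yield position, position + run_length - 1, item
        position += run_length


# hypothetical sample data
content = ["c", "c", "c", "c", "f", "f", "f", "f", "c", "c"]
for first, last, item in find_runs(content):
    print("{}-{} {}".format(first, last, item))
```

Because `groupby` only merges neighbouring items, the two separate runs of `c` are reported separately, which is what the question asks for.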