maurobio - 4 months ago
Python Question

"Compressing" a list of integers

I have a list of integers as follows:

my_list = [2,2,2,2,3,4,2,2,4,4,3]

What I want is to have this as a list of strings, indexed and 'compressed', that is, with each element indicated by its position in the list and with each run of successive duplicate elements indicated as a range, like this:

my_new_list = ['0-3,2', '4,3', '5,4', '6-7,2', '8-9,4', '10,3']

EDIT: The expected output should indicate that list elements 0 to 3 have the number 2; element 4, the number 3; element 5, the number 4; elements 6 and 7, the number 2; elements 8 and 9, the number 4; and element 10, the number 3.

EDIT 2: The output list need not (indeed cannot) be a list of integers; it should be a list of strings instead.

I could find many examples of finding (and deleting) duplicated elements from lists, but nothing along the lines of what I need.

Could someone point out a relevant example or suggest an algorithm for solving this?

Thanks in advance!


As with most problems involving runs of consecutive duplicates, you can use itertools.groupby() for this. Just group the indices by the value at each index: each group is then one run, and its first and last indices give the range.

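import itertools

values = [2,2,2,2,3,4,2,2,4,4,3]
result = []

# Group the indices 0..len(values)-1 by the value stored at each index.
for key, group in itertools.groupby(range(len(values)), values.__getitem__):
    indices = list(group)

    if len(indices) > 1:
        # A run of duplicates: 'first-last,value'
        result.append('{}-{},{}'.format(indices[0], indices[-1], key))
    else:
        # A single element: 'index,value'
        result.append('{},{}'.format(indices[0], key))

print(result)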



['0-3,2', '4,3', '5,4', '6-7,2', '8-9,4', '10,3']
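
If you would rather not use itertools, the same idea can be sketched with a plain loop that remembers where the current run started. This is only an illustrative alternative, not part of the answer above; the function name compress_runs is made up for this example.

```python
def compress_runs(values):
    """Return 'start-end,value' (or 'index,value') strings, one per run."""
    result = []
    start = 0  # index where the current run began

    # Iterate one step past the end so the final run is flushed too.
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or when the value changes.
        if i == len(values) or values[i] != values[start]:
            end = i - 1
            if end > start:
                result.append('{}-{},{}'.format(start, end, values[start]))
            else:
                result.append('{},{}'.format(start, values[start]))
            start = i

    return result


print(compress_runs([2, 2, 2, 2, 3, 4, 2, 2, 4, 4, 3]))
```

For the list in the question this should print the same list as above. An empty input gives an empty list, since the loop body never runs.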