hakabuk hakabuk - 2 months ago 8
Python Question

Finding craters [descending/ ascending ints] in a python list of ints

I have a list of ints in python, and I would like the output to be a list of different "craters" in the sequence. I define a crater as a sequence of numbers that begins with descending ints that gradually grow bigger, until they reach a number that is equal to the first int of the crater or greater than the first int in the crater.
If the last number is equal or smaller than the first - it should be in the output at the end of the crater. If the last number is greater than the first number in the crater - it should not be appended to the end of the output list.

For example, for the list:
1,2,3,4,5,6,5,4,3,4,5,6,7,8,9
A "crater" would be:
6,5,4,3,4,5,6

My code:

def find_crater(my_list):
cur_crater = []
for i in range(len(my_list)-1):
#check if the numbers are decreasing
if (my_list[i] - my_list[i+1] > 0):
cur_crater.append(my_list[i])
#if increasing - check if we're in a crater
#and if current number is smaller than begining of crater
elif len(cur_crater) > 1 and my_list[i] <= cur_crater[0]:
cur_crater.append(my_list[i])
#print crater and empty list
elif len(cur_crater) > 1 and my_list[i+1] > cur_crater[0]:
print(cur_crater)
cur_crater = []
#continue if we're out of a crater
else:
continue
#catching the last crater in the list
if len(cur_crater) > 1:
#include last int
if my_list[i] - my_list[-1] < 0 and my_list[-1] <= cur_crater[0]:
cur_crater.append(my_list[-1])
print(cur_crater)
return


I have called the function on 2 lists. The 1st output is as expected:

ok_list = [12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 23,
24, 26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14,
17, 27, 28, 19, 17, 19, 21, 19, 12, 23, 25, 27]


Ok list output [edit: not so ok - it ommited the first '4' so turns out it's not so OK after all]:

[12, 7, 4, 2, 4, 5, 6, 5, 12]
[26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14, 17]
[28, 19, 17, 19, 21, 19, 12, 23, 25, 27]


However, the second list combines 2 lists of craters into 1 list in the output:

problematic = [12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 23,
24, 26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14,
17, 27, 19, 25, 19, 12, 23, 25, 27]


problematic Output:

[12, 7, 4, 2, 4, 5, 6, 5, 12]
[26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14, 17, 27, 19, 25, 19, 12, 23, 25]


I was expecting:

[12, 4, 7, 4, 2, 4, 5, 6, 5, 12]
[26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14, 17]
[27, 19, 25, 19, 12, 23, 25, 27]


I tried to take care of this by writing:
my_list[i+1] > cur_crater[0]
and thus checking the size of next int in the list - but that does not seem to fix it [I left it in the code even though it does not do the trick in hopes of someone explaining why is that wrong/ not working?].

In conclusion, my code can't handle it when there is a crater right after a crater with only one int in between.

If the input is:

[12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 10, 9, 8, 6, 9, 10]

Then the output is one long list, but I'd like to split it after list item 12 and before list item 21, but the output is one long list.

I would love to get info about any library or method that can give me more ideas regarding a better and more efficient solution.

Answer

Presuming your expected output is wrong which it seems to be, all you want to do is go until you find an element >= to the start of the chain and catch chains that only contain a single element:

def craters(lst):
    it = iter(lst)
    # get start of first chain
    first = next(it)
    tmp = [first]
    # iterate over the remaining
    for ele in it:
        # >= ends chain
        if ele >= first:
            # if equal, add it and call
            #  next(it) to set the next first element.
            if ele == first:
                tmp.append(ele)
                yield tmp
                first = next(it)
                tmp = [first]
            # else yield the group if it has > 1 element.
            # if it has 1 element it is not a descending start sequecne
            else:
                if len(tmp) > 1:
                    yield tmp
                tmp = [ele]
                first = ele
        else:
            tmp.append(ele)
    yield tmp

Output:

In [5]: ok_list = [12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 23,
   ...:            24, 26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14,
   ...:            17, 27, 28, 19, 17, 19, 21, 19, 12, 23, 25, 27]

In [6]: problematic = [12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 23,
   ...: 24, 26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14,
   ...: 17, 27, 19, 25, 19, 12, 23, 25, 27]

In [7]: ex_ok = [[12, 4, 7, 4, 2, 4, 5, 6, 5, 12],
   ...: [26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14, 17],
   ...: [28, 19, 17, 19, 21, 19, 12, 23, 25, 27]]

In [8]: ex_p = [[12,4,  7, 4, 2, 4, 5, 6, 5, 12],
   ...: [26, 25, 22, 10, 9, 8, 6, 9, 10, 11, 13, 12, 14, 17],
   ...: [27, 19, 25, 19, 12, 23, 25, 27]]

In [9]: print(list(craters(problematic)) == ex_p)
True

In [10]: print(list(craters(ok_list)) == ex_ok)
True

In [11]: print(list(craters([12, 4, 7, 4, 2, 4, 5, 6, 5, 12, 21, 10, 9, 8, 6, 9, 10])))
[[12, 4, 7, 4, 2, 4, 5, 6, 5, 12], [21, 10, 9, 8, 6, 9, 10]]

That does not involve any slicing/indexing you can lazily return each group.

To simplify your algorithm, the steps are:

  • Start with the first element, then check if the next element is >=, if it is and it is equal add it to the group, if it is > set that to be the new start of the group.

  • If the new first is not greater than the next element the length of that group will be one as it does not satisfy sequence of numbers that begins with descending. So we keep consuming and setting the next element to be the first element of the sequence until we find one the is > the next element, that is what the call to next(it) is doing then we just need to wash and repeat.

Comments