Trees4theForest - 1 year ago 47

Python Question

Consider this list:

`dates = [`

('2015-02-03', 'name1'),

('2015-02-04', 'nameg'),

('2015-02-04', 'name5'),

('2015-02-05', 'nameh'),

('1929-03-12', 'name4'),

('2023-07-01', 'name7'),

('2015-02-07', 'name0'),

('2015-02-08', 'nameh'),

('2015-02-15', 'namex'),

('2015-02-09', 'namew'),

('1980-12-23', 'name2'),

('2015-02-12', 'namen'),

('2015-02-13', 'named'),

]

How can I identify those dates that are out of sequence. I don't care if they repeat, or skip, I just need the ones way out of line. Ie, I should get back:

`('1929-03-12', 'name4'),`

('2023-07-01', 'name7'),

('2015-02-15', 'namex'),

('1980-12-23', 'name2'),

Namex is less obvious, but it's not in the general order of the list.

My simplistic start (which I have deleted to simplify the question) is obviously woefully incomplete.

- http://stackoverflow.com/a/9832414/1061836
- How to determine the longest increasing subsequence using dynamic programming?
- https://rosettacode.org/wiki/Longest_increasing_subsequence#Python
- http://codereview.stackexchange.com/questions/10230/python-implementation-of-the-longest-increasing-subsequence

Seems once I get the LIS, I can compare it to the original list and see where the gaps are... Fascinating. SO is the hive-mind of awesomeness.

Answer Source

Based on the question at Code Review and a question about non-decreasing sequences (since that's what you're after), here's a solution to your problem:

```
from bisect import bisect_right
from operator import itemgetter
def out_of_sequence(seq, key = None):
if key is None: key = lambda x: x
lastoflength = [0] # end position of subsequence with given length
predecessor = [None] # penultimate element of l.i.s. ending at given position
for i in range(1, len(seq)):
# find length j of subsequence that seq[i] can extend
j = bisect_right([key(seq[k]) for k in lastoflength], key(seq[i]))
# update old subsequence or extend the longest
try: lastoflength[j] = i
except: lastoflength.append(i)
# record element preceding seq[i] in the subsequence for backtracking
predecessor.append(lastoflength[j-1] if j > 0 else None)
indices = set()
i = lastoflength[-1]
while i is not None:
indices.add(i)
i = predecessor[i]
return [e for i, e in enumerate(seq) if i not in indices]
print(*out_of_sequence(dates, itemgetter(0)), sep='\n')
```

Outputs:

```
('1929-03-12', 'name4')
('2023-07-01', 'name7')
('2015-02-15', 'namex')
('1980-12-23', 'name2')
```