I'd like to convert a list of strings to lowercase and remove duplicates while preserving the order. A lot of the single-line Python magic I've found on StackOverflow converts a list of strings to lowercase, but it seems the order is lost.
I've written the code below, which works, and I'm happy to stick with it. But I was wondering whether there is a more Pythonic way of doing it with less code (and potentially fewer bugs if I were to write something similar in the future; this one took me quite a while to write).
def lower_unique(words):  # function name assumed; the original def line is missing
    """ takes a word list with a special order (e.g. frequency)
    returns a new word list all in lower case with no duplicates but preserving order"""
    # save orders in a dict
    orders = dict()
    for i in range(len(words)):
        wl = words[i].lower()
        # save index of first occurrence of the word (prioritizing top value)
        if wl not in orders:
            orders[wl] = i
    # contains unique lower case words, but in wrong order
    words_unique = list(set(map(str.lower, words)))
    # reconstruct sparse list in correct order
    words_lower = [''] * len(words)
    for w in words_unique:
        i = orders[w]
        words_lower[i] = w
    # remove blank entries
    words_lower = [s for s in words_lower if s != '']
    return words_lower
Slightly modifying the answer from "How do you remove duplicates from a list in Python whilst preserving order?":
def f7(seq):
    seen = set()
    seen_add = seen.add
    seq = [x.lower() for x in seq]
    return [x for x in seq if not (x in seen or seen_add(x))]
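For comparison, here is a shorter sketch of the same idea. It assumes Python 3.7 or later, where a plain `dict` is guaranteed to keep insertion order, so the keys act as an ordered set. The function name `dedupe_lower` is only illustrative and does not come from the question or the linked answer.

```python
def dedupe_lower(words):
    """Lowercase each word and drop repeats, keeping first-occurrence order."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order,
    # so the first time a lowercased word appears fixes its position.
    return list(dict.fromkeys(w.lower() for w in words))


print(dedupe_lower(["The", "the", "Cat", "a", "CAT", "A"]))
# prints ['the', 'cat', 'a']
```

On older Python versions, `collections.OrderedDict.fromkeys` should give the same result, and `f7` above works on any version.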