Is there a built-in that removes duplicates from a list in Python, whilst preserving order? I know that I can use a set to remove duplicates, but that destroys the original order. I also know that I can roll my own like this:
output = []
for x in input:
    if x not in output:
        output.append(x)
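Wrapped in a function (the name `dedupe_quadratic` is just illustrative), the hand-rolled loop works, but note that the `x not in output` test scans the output list on every iteration, so the whole loop is O(n²):

```python
def dedupe_quadratic(items):
    """Order-preserving de-dup via a list membership test.

    Each `x not in output` check scans the list built so far,
    so the loop is O(n^2) overall -- fine for small inputs only.
    """
    output = []
    for x in items:
        if x not in output:
            output.append(x)
    return output

print(dedupe_quadratic([1, 2, 1, 3, 2, 4]))  # -> [1, 2, 3, 4]
```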
Here you have some alternatives: http://www.peterbe.com/plog/uniqifiers-benchmark
def f7(seq):
    seen = set()
    seen_add = seen.add
    return [x for x in seq if not (x in seen or seen_add(x))]
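A quick check of f7 on a small input (the function is redefined here so the snippet runs standalone):

```python
def f7(seq):
    # Keep the first occurrence of each element; the set gives O(1) lookups.
    seen = set()
    seen_add = seen.add  # bind the bound method once, outside the loop
    return [x for x in seq if not (x in seen or seen_add(x))]

print(f7("abracadabra"))  # -> ['a', 'b', 'r', 'c', 'd']
```

The trick is that `seen_add(x)` returns None, which is falsy, so the `or` expression is False exactly when `x` was not yet in `seen`, and the element is kept while being recorded as a side effect.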
Why assign seen.add to seen_add instead of just calling seen.add? Python is a dynamic language, and resolving seen.add on each iteration is more costly than resolving a local variable. seen.add could have changed between iterations, and the runtime isn't smart enough to rule that out. To play it safe, it has to look the attribute up on the object each time.
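One rough way to see the cost of that repeated attribute lookup is timeit; the absolute numbers vary by machine and interpreter, so this is only a sketch (the variable names are illustrative):

```python
import timeit

setup = "seen = set(); data = list(range(1000))"

# Resolve seen.add on every iteration of the loop.
attr = timeit.timeit("for x in data: seen.add(x)", setup=setup, number=1000)

# Bind the bound method to a local once, then call the local each time.
local = timeit.timeit(
    "for x in data: add(x)", setup=setup + "; add = seen.add", number=1000
)

print(f"attribute lookup: {attr:.3f}s, local alias: {local:.3f}s")
```

On CPython the local-alias version is typically somewhat faster; whether the micro-optimization is worth the less readable code depends on how hot the loop is.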
If you plan on using this function a lot on the same dataset, perhaps you would be better off with an ordered set: http://code.activestate.com/recipes/528878/
It offers O(1) insertion, deletion, and membership checking.
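If the linked recipe is unavailable, a minimal ordered set can be sketched on top of a plain dict, which preserves insertion order in Python 3.7+ and has average O(1) operations. The `OrderedSet` class below is an illustration written for this answer, not the recipe's implementation:

```python
class OrderedSet:
    """Minimal insertion-ordered set backed by a dict.

    dict insert, delete, and lookup are O(1) on average, which
    gives the O(1) per-operation cost mentioned above.
    """

    def __init__(self, iterable=()):
        # Keys hold the elements; values are unused placeholders.
        self._data = dict.fromkeys(iterable)

    def add(self, item):
        self._data[item] = None

    def discard(self, item):
        self._data.pop(item, None)

    def __contains__(self, item):
        return item in self._data

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

s = OrderedSet([3, 1, 3, 2, 1])
print(list(s))  # -> [3, 1, 2]
```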