Nico Schlömer Nico Schlömer - 4 months ago 105
Python Question

Find unique pairs in list of pairs

I have a (large) list of lists of integers, e.g.,



a = [
[1, 2],
[3, 6],
[2, 1],
[3, 5],
[3, 6]
]


Most of the pairs will appear twice, where the order of the integers doesn't matter (i.e.,
[1, 2]
is equivalent to
[2, 1]
). I'd now like to find the pairs that appear only once, and get a Boolean list indicating that. For the above example,



b = [False, False, False, True, False]


Since
a
is typically large, I'd like to avoid explicit loops. Mapping to
frozenset
s may be advised, but I'm not sure if that's overkill.

Answer

Here's a numpythonic solution:

a = numpy.array(a)
a.sort(axis=1)
b = numpy.ascontiguousarray(a).view(
        numpy.dtype((numpy.void, a.dtype.itemsize * a.shape[1]))
        )
_, inv, ct = numpy.unique(b, return_inverse=True, return_counts=True)
print(ct[inv] == 1)
  • Sorting is fast and makes sure that the edges [i, j], [j, i] in the original array identify with each other. Much faster than frozensets or tuples.

  • Row uniquification inspired by http://stackoverflow.com/a/16973510/353337.

For a sample size of 1e6, this is about 10 times faster than the Counter solution.

Comments