Nico Schlömer - 1 year ago 254

Python Question

I have a (large) list of lists of integers, e.g.,

`a = [`

[1, 2],

[3, 6],

[2, 1],

[3, 5],

[3, 6]

]

Most of the pairs will appear twice, where the order of the integers doesn't matter (i.e.,

`[1, 2]`

`[2, 1]`

`b = [False, False, False, True, False]`

Since

`a`

`frozenset`

Answer Source

Here's a numpythonic solution:

```
a = numpy.array(a)
a.sort(axis=1)
b = numpy.ascontiguousarray(a).view(
numpy.dtype((numpy.void, a.dtype.itemsize * a.shape[1]))
)
_, inv, ct = numpy.unique(b, return_inverse=True, return_counts=True)
print(ct[inv] == 1)
```

Sorting is fast and makes sure that the edges

`[i, j]`

,`[j, i]`

in the original array identify with each other. Much faster than`frozenset`

s or`tuple`

s.Row uniquification inspired by http://stackoverflow.com/a/16973510/353337.

For a sample size of 1e6, this is about 10 times faster than the `Counter`

solution.