perimosocordiae - 7 months ago 17
Python Question

Sorting a 2D numpy array by multiple axes

I have a 2D numpy array of shape (N,2) which is holding N points (x and y coordinates). For example:

```
array([[3, 2],
       [6, 2],
       [3, 6],
       [3, 4],
       [5, 3]])
```

I'd like to sort it such that my points are ordered by x-coordinate, and then by y in cases where the x coordinate is the same. So the array above should look like this:

```
array([[3, 2],
       [3, 4],
       [3, 6],
       [5, 3],
       [6, 2]])
```

If this were a normal Python list, I would simply define a comparator to do what I want, but as far as I can tell, numpy's sort function doesn't accept user-defined comparators. Any ideas?

EDIT: Thanks for the ideas! I set up a quick test case with 1,000,000 random integer points, and benchmarked the ones that I could run (sorry, can't upgrade numpy at the moment).

```
Mine:   4.078 secs
mtrw:   7.046 secs
unutbu: 0.453 secs
```
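The timing code behind these numbers isn't shown. As a rough, hypothetical harness for this kind of comparison, the sketch below times two NumPy approaches: `np.lexsort`, and an in-place sort through a structured view. The function names, array size, value range, and seed are all assumptions, and the timings will vary with machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.RandomState(0)            # fixed seed, an arbitrary choice
n = 100_000                               # raise to 1_000_000 to match the test above
pts = rng.randint(0, 1000, size=(n, 2))   # value range is an assumption

def by_lexsort(a):
    # Indices that order rows by column 0, then column 1.
    return a[np.lexsort((a[:, 1], a[:, 0]))]

def by_structured(a):
    # Sort a C-contiguous copy in place through a structured view.
    c = a.copy()
    dt = [('col1', c.dtype), ('col2', c.dtype)]
    c.ravel().view(dt).sort(order=['col1', 'col2'])
    return c

for f in (by_lexsort, by_structured):
    t = timeit.timeit(lambda: f(pts), number=3) / 3
    print('%-14s %.3f secs' % (f.__name__, t))
```

Both functions return a sorted copy and leave `pts` untouched, so each timed run starts from the same unsorted input.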

Answer

Use `np.lexsort`, which returns the indices that sort by several keys at once:

```
import numpy as np
a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])

# The last key is the primary one: sort by x (column 0), then y (column 1).
ind = np.lexsort((a[:,1],a[:,0]))

a[ind]
# array([[3, 2],
#        [3, 4],
#        [3, 6],
#        [5, 3],
#        [6, 2]])
```
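Since `np.lexsort` treats its last key as the primary one, sorting by every column from left to right means passing the columns in reverse order. A minimal sketch of that generalization follows; `sort_rows` is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def sort_rows(a):
    """Return a copy of 2D array `a` with rows sorted by column 0, then 1, ..."""
    # a.T[::-1] has shape (ncols, nrows): the columns of `a`, last column first.
    # lexsort takes the LAST key as primary, so column 0 ends up most significant.
    ind = np.lexsort(a.T[::-1])
    return a[ind]

a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])
print(sort_rows(a))
```

For signed numeric data, negating a key, as in `np.lexsort((-a[:,1], a[:,0]))`, should give descending order on that key while keeping the others ascending.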

`a.ravel()` returns a view if `a` is `C_CONTIGUOUS`. If that is true, @ars's method, slightly modified by using `ravel` instead of `flatten` (which always returns a copy), yields a nice way to sort `a` in-place:

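```
import numpy as np

a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])
# One structured field per column, so each row becomes a single record.
dt = [('col1', a.dtype),('col2', a.dtype)]
assert a.flags['C_CONTIGUOUS']
# ravel() is a view here, so b shares memory with a.
b = a.ravel().view(dt)
b.sort(order=['col1','col2'])
```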

Since `b` is a view of `a`, sorting `b` sorts `a` as well:

```
print(a)
# [[3 2]
#  [3 4]
#  [3 6]
#  [5 3]
#  [6 2]]
```
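The view trick only works when `a` is C-contiguous. As a sketch of how the two approaches could be combined, the hypothetical helper `sort_rows_inplace` below uses the structured view when it can and otherwise falls back to `np.lexsort` and copies the result back. The fallback and the generated field names are assumptions, not part of the answer above.

```python
import numpy as np

def sort_rows_inplace(a):
    """Sort the rows of 2D array `a` in place, by column 0, then column 1, ..."""
    if a.flags['C_CONTIGUOUS']:
        # One structured field per column, all with the array's own dtype.
        dt = [('col%d' % i, a.dtype) for i in range(a.shape[1])]
        # ravel() is a view here, so sorting the structured view reorders `a`.
        b = a.ravel().view(dt)
        b.sort(order=[name for name, _ in dt])
    else:
        # Fallback (an assumption): lexsort makes a sorted copy, then write it back.
        a[:] = a[np.lexsort(a.T[::-1])]

a = np.array([(3, 2), (6, 2), (3, 6), (3, 4), (5, 3)])
sort_rows_inplace(a)
print(a)
```

The fallback branch allocates a temporary sorted copy, so only the contiguous branch avoids copying the data.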
Source (Stackoverflow)