Given a matrix A of dimension MxN (4x4), how would one find the nextbest minimum of each 2x2 submatrix?
A = array([[ 32673. , 15108.2 , 26767.2 , 9420. ],
[ 32944.2 , 14604.01 , 26757.01 , 9127.2 ],
[ 26551.2 , 9257.01 , 26595.01 , 9309.2 ],
[ 26624. , 8935.2 , 26673.2 , 8982. ]])
1. Find the minimum in A: 8935.2 global coords[3,1], local coords [1,1]
2. No other matrix has been evaluated so no conflict yet.
3. Find the next submatrix min: 8982. gc [3,3], lc [1,1]
4. Conflict exists, find next min in same submatrix: 9309.2 gc [2,3], lc [0,1]
5. Find next submatrix min: 9420 gc [0,3] lc[0,1]
6. Conflict exists, find next min: 26757.01 gc [1,2] lc [1,0]
7. Find next submatrix min: 14604  conflict with lc[1,1]
8. Find next submatrix min: 15108.2  conflict with lc [0,1]
9. Find next submatrix min: 32673. gc [0,0], lc [0,0]
[((0, 0), (0, 0), 32673), ((0, 1), (1, 0), 26757.01), ((1, 0), (1, 1), 8935.2), ((1, 1), (0, 1), 9309.2)]
def get_mins(self, result):
# result is the 2d array
dim = 2 # 2x2 submatrix
mins = []
count = 0
while count < dim**2:
a, b = result.shape
M4D = result.reshape(a//dim, dim, b//dim, dim)
lidx = M4D.transpose(0, 2, 1, 3).reshape(1, b//dim, dim**2).argmin(1)
r, c = numpy.unravel_index(lidx, [dim, dim])
yy = M4D.min(axis=(1, 3))
ww = numpy.dstack((r, c))
super_min = numpy.unravel_index(numpy.argmin(yy), (dim, dim))
rows = super_min[0]
cols = super_min[1]
# ww[rows,cols] g_ves us 2x2 position
offset_r, offset_c = ww[rows, cols]
# super_min gives us submatrix position
mins.append((tuple(super_min), (offset_r, offset_c), yy.min()))
if dim > 1:
# update all other positions with inf >> max(result)
result[numpy.ix_([offset_r + (d * dim) for d in range(dim)], [offset_c + (d * dim) for d in range(dim)])] = numpy.inf
# update the submatrix to all == numpy.inf
result[rows*dim:((rows*dim)+dim), cols*dim:((cols*dim)+dim)] = numpy.inf
count += 1
return mins
Given the dependency between iterations in choosing the global minimum, here's an approach with oneloop 
def unq_localmin(A, dim):
m, n = A.shape
M4D = A.reshape(m//dim, dim, n//dim, dim)
M2Dr = M4D.swapaxes(1,2).reshape(1,dim**2)
a = M2Dr.copy()
N = M2Dr.shape[0]
R = np.empty(N,dtype=int)
C = np.empty(N,dtype=int)
shp = M2Dr.shape
for i in range(N):
r,c = np.unravel_index(np.argmin(a),shp)
a[r] = np.inf
a[:,c] = np.inf
R[i], C[i] = r, c
out = M2Dr[R,C]
idr = np.column_stack(np.unravel_index(R,(dim,dim)))
idc = np.column_stack(np.unravel_index(C,(dim,dim)))
return zip(map(tuple,idr),map(tuple,idc),out)
Let's verify results with a random bigger 9x9
array and 3x3
submatrix/subarray to test out variety against OP's implementation get_mins

In [66]: A # Input data array
Out[66]:
array([[ 927., 852., 18., 949., 933., 558., 519., 118., 82.],
[ 939., 782., 178., 987., 534., 981., 879., 895., 407.],
[ 968., 187., 539., 986., 506., 499., 529., 978., 567.],
[ 767., 272., 881., 858., 621., 301., 675., 151., 670.],
[ 874., 221., 72., 210., 273., 823., 784., 289., 425.],
[ 621., 510., 303., 935., 88., 970., 278., 125., 669.],
[ 702., 722., 620., 51., 845., 414., 154., 154., 635.],
[ 600., 928., 540., 462., 772., 487., 196., 499., 208.],
[ 654., 335., 258., 297., 649., 712., 292., 767., 819.]])
In [67]: unq_localmin(A, dim = 3) # Using proposed approach
Out[67]:
[((0, 0), (0, 2), 18.0),
((2, 1), (0, 0), 51.0),
((1, 0), (1, 2), 72.0),
((1, 1), (2, 1), 88.0),
((0, 2), (0, 1), 118.0),
((2, 2), (1, 0), 196.0),
((2, 0), (2, 2), 258.0),
((1, 2), (2, 0), 278.0),
((0, 1), (1, 1), 534.0)]
In [68]: out = np.empty((9,9))
In [69]: get_mins(out,A) # Using OP's soln with dim = 3 edited
Out[69]:
[((0, 0), (0, 2), 18.0),
((2, 1), (0, 0), 51.0),
((1, 0), (1, 2), 72.0),
((1, 1), (2, 1), 88.0),
((0, 2), (0, 1), 118.0),
((2, 2), (1, 0), 196.0),
((2, 0), (2, 2), 258.0),
((1, 2), (2, 0), 278.0),
((0, 1), (1, 1), 534.0)]
Simplification
The above solution gets us the row and col indices that could be used to construct the indices tuples as printed with get_mins
. If you don't need those, we could simplify the proposed approach a bit, like so 
def unq_localmin_v2(A, dim):
m, n = A.shape
M4D = A.reshape(m//dim, dim, n//dim, dim)
M2Dr = M4D.swapaxes(1,2).reshape(1,dim**2)
N = M2Dr.shape[0]
out = np.empty(N)
shp = M2Dr.shape
for i in range(N):
r,c = np.unravel_index(np.argmin(M2Dr),shp)
out[i] = M2Dr[r,c]
M2Dr[r] = np.inf
M2Dr[:,c] = np.inf
return out
Runtime test 
In [52]: A = np.random.randint(11,999,(9,9)).astype(float)
In [53]: %timeit unq_localmin_v2(A, dim=3)
10000 loops, best of 3: 93.1 µs per loop
In [54]: out = np.empty((9,9))
In [55]: %timeit get_mins(out,A)
1000 loops, best of 3: 907 µs per loop