cf2 - 5 months ago 48

Python Question

I'm looking to solve the following problem. I have a numpy array which is labeled to regions from 1 to n. Let's say this is the array:

`x = np.array([[1, 1, 1, 4], [1, 1, 2, 4], [1, 2, 2, 4], [5, 5, 3, 4]], np.int32)`

array([[1, 1, 1, 4],

[1, 1, 2, 4],

[1, 2, 2, 4],

[5, 5, 3, 4]])

A region are the combined cells in the numpy array with a unique value. So in this example x has 5 regions; region 1 which consists of 5 cells, region 2 which consists of 3 cells etc.

Now, I determine the adjacent regions of each region with the following lines of code:

`n = x.max()`

tmp = np.zeros((n+1, n+1), bool)

# check the vertical adjacency

a, b = x[:-1, :], x[1:, :]

tmp[a[a!=b], b[a!=b]] = True

# check the horizontal adjacency

a, b = x[:, :-1], x[:, 1:]

tmp[a[a!=b], b[a!=b]] = True

# register adjacency in both directions (up, down) and (left,right)

result = (tmp | tmp.T)

result = result.astype(int)

np.column_stack(np.nonzero(result))

resultlist = [np.flatnonzero(row) for row in result[1:]]

Which gives me a list of each region with its adjacent regions:

`[array([2, 4, 5], dtype=int64),`

array([1, 3, 4, 5], dtype=int64),

array([2, 4, 5], dtype=int64),

array([1, 2, 3], dtype=int64),

array([1, 2, 3], dtype=int64)]

Which works really well. However, I would like to count the amount of cells of each adjacent region and return this output. So, for region 2, in this example would mean a total of 7 adjacent regions (three 1s, two 4s, one 3 and one 5). Therefore:

- 2 is surrounded for 43% by 1
- 2 is surround for 14 % by 5
- 2 is surrounded for 14% by 3
- 2 is surrounded for 29% by 4

How could I best adjust the above code to include the amount of cells for each adjacent region?

Many thanks guys!

Answer

Here is a vectorized solution using the numpy_indexed package (note; it isn't vectorized over the regions, but it is vectorized over the pixels, which is the useful thing to do assuming n_pixels >> n_regions):

```
neighbors = np.array([x[:, :-1].flatten(), x[:, +1:].flatten(), x[+1:, :].flatten(), x[:-1, :].flatten()])
centers = np.array([x[:, +1:].flatten(), x[:, :-1].flatten(), x[:-1, :].flatten(), x[+1:, :].flatten()])
import numpy_indexed as npi
regions, neighbors_per_regions = npi.group_by(centers.flatten(), neighbors.flatten())
for region, neighbors_per_region in zip(regions, neighbors_per_regions):
print(region)
neighbors_per_region = neighbors_per_region[neighbors_per_region != region]
unique_neighbors, neighbor_counts = npi.count(neighbors_per_region)
print(unique_neighbors, neighbor_counts / neighbor_counts.sum() * 100)
```