Wilmar van Ommeren - 7 months ago 48

Python Question

I have a large numpy 2d array (10000,10000) in which regions (clusters of cells with the same number) are randomly labeled. As a result, some separate regions were assigned the same label. What I would like is to relabel the numpy 2d array so that each separate region gets a unique label (see example).

I know how to solve this problem with a loop. But as I am working with a large array with a lot of small regions, this process takes ages. Therefore, a vectorized approach would be more suitable.

Example:

- Two separate regions are labeled with 1
- Two separate regions are labeled with 3

    ## Input
    random_arr=np.array([[1,1,3,3],[1,2,2,3],[2,2,1,1],[3,3,3,1]])

    ## Apply function
    unique_arr=relabel_regions(random_arr)

    ## Output
    >>> unique_arr
    array([[1, 1, 3, 3],
           [1, 2, 2, 3],
           [2, 2, 4, 4],
           [5, 5, 5, 4]])

Slow solution with loop:

    import numpy as np
    from scipy import ndimage

    def relabel_regions(random_regions):
        # Collect the distinct labels present in the array
        random_labs=np.unique(random_regions)
        unique_segments=np.zeros(np.shape(random_regions),dtype='uint64')
        count=0
        # 4-connectivity: only cells sharing an edge count as neighbours
        kernel=np.array([[0,1,0],[1,1,1],[0,1,0]],dtype='uint8')
        # Assign unique number to each random labeled region
        for i in range(len(random_labs)):
            # Binary mask of all cells carrying the current label
            mask=np.zeros(np.shape(random_regions))
            mask[np.where(random_regions==random_labs[i])]=1
            labeled_mask, freq = ndimage.label(mask, structure=kernel)
            # Shift the new labels past the ones already handed out
            labeled_mask=labeled_mask+count
            unique_segments[np.where(labeled_mask>0+count)]=labeled_mask[np.where(labeled_mask>0+count)]
            count+=freq
        return unique_segments

Answer

Let's cheat and just use a high-quality library (scikit-image) that offers exactly this.

You can learn from its implementation or just use it!

```
import numpy as np
from skimage.measure import label
random_arr = np.array([[1,1,3,3],[1,2,2,3],[2,2,1,1],[3,3,3,1]])
labels = label(random_arr, connectivity=1) # neighborhood definition: 1 = edge-sharing neighbors only
print(labels)
```

```
[[1 1 2 2]
 [1 3 3 2]
 [3 3 4 4]
 [5 5 5 4]]
```
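To make the neighborhood definition more concrete, here is a rough pure-Python sketch of what labeling with `connectivity=1` amounts to: a flood fill that only joins edge-sharing cells holding the same value. This is not scikit-image's actual implementation (which is compiled and far faster); the function name `label_regions` is made up for illustration. Unlike scikit-image, which by default treats 0 as background, this sketch labels every cell.

```python
from collections import deque

def label_regions(grid):
    """Label 4-connected regions of equal values, scanning in row-major order."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not labeled yet"
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue  # already part of a labeled region
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # connectivity=1: only the four edge-sharing neighbours
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and grid[ny][nx] == grid[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels

random_arr = [[1, 1, 3, 3], [1, 2, 2, 3], [2, 2, 1, 1], [3, 3, 3, 1]]
for row in label_regions(random_arr):
    print(row)
```

On the example array this gives the same labeling as the scikit-image output above. For a (10000,10000) array a pure-Python loop like this would be far too slow, so treat it as an explanation of the idea rather than a replacement for the library call.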

**EDIT:** As Jeon mentioned in the comments, SciPy's `scipy.ndimage.measurements.label` (also available as `scipy.ndimage.label`) might also be a candidate if you do not want to depend on one more library. Thanks for the comment, Jeon!
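One caveat with the SciPy route: `scipy.ndimage.label` treats every nonzero cell as foreground and does not distinguish between different values, so it has to be applied once per value, much like the loop in the question. A minimal sketch under that assumption (the helper name `relabel_with_scipy` is hypothetical, and the default structuring element is used, which is 4-connectivity for 2d input):

```python
import numpy as np
from scipy import ndimage

def relabel_with_scipy(arr):
    """Give every 4-connected region of equal values its own label (sketch)."""
    out = np.zeros(arr.shape, dtype=np.int64)
    offset = 0
    for value in np.unique(arr):
        # Label the connected regions of this one value only
        labeled, n_regions = ndimage.label(arr == value)
        mask = labeled > 0
        # Shift past the labels already handed out for earlier values
        out[mask] = labeled[mask] + offset
        offset += n_regions
    return out

random_arr = np.array([[1, 1, 3, 3], [1, 2, 2, 3], [2, 2, 1, 1], [3, 3, 3, 1]])
result = relabel_with_scipy(random_arr)
print(result)
```

The label numbers come out in a different order than with scikit-image, because regions are numbered value by value rather than in a single scan. The loop still runs once per distinct value, so with many distinct values the scikit-image call is likely the better fit.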