I have a large 2D NumPy array (10000 × 10000) in which regions (clusters of cells with the same number) are randomly labeled. As a result, some separate regions were assigned the same label. What I would like is to relabel the array so that every separate region gets a unique label (see example).
I know how to solve this problem with a loop. But as I am working with a large array with a lot of small regions, this process takes ages. Therefore, a vectorized approach would be more suitable.
-Two separate regions are labeled with 1
-Two separate regions are labeled with 3
## Apply function
array([[1, 1, 3, 3],
       [1, 2, 2, 3],
       [2, 2, 4, 4],
       [5, 5, 5, 4]])
# Locate random regions index
# Assign unique number to each random labeled region
for i in range(len(random_labs)):
    labeled_mask, freq = ndimage.label(mask, structure=kernel)
Let's cheat and just use some high-quality library (scikit-image) which offers exactly this.
You may learn from its implementation or just use it!
import numpy as np
from skimage.measure import label

random_arr = np.array([[1,1,3,3],[1,2,2,3],[2,2,1,1],[3,3,3,1]])
labels = label(random_arr, connectivity=1)  # neighborhood-definition here!
print(labels)
[[1 1 2 2]
 [1 3 3 2]
 [3 3 4 4]
 [5 5 5 4]]
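To make the "neighborhood definition" concrete, here is a minimal pure-Python sketch of what connected-region labeling does. It is not scikit-image's implementation, and the function name `relabel_regions` and its `diagonal` flag are made up for illustration. With `diagonal=False` only the four edge neighbors count (like `connectivity=1` in 2D); with `diagonal=True` the four diagonal neighbors count too (like `connectivity=2`). Unlike `skimage.measure.label`, which by default treats 0 as background, this sketch labels every value, including 0. It is far too slow for a 10000 × 10000 array and is only meant to show the idea.

```python
from collections import deque


def relabel_regions(grid, diagonal=False):
    """Give every connected region of equal values its own label (1, 2, ...).

    Hypothetical helper for illustration; `grid` is a list of equal-length lists.
    """
    rows, cols = len(grid), len(grid[0])
    # Edge neighbors only (4-neighborhood) ...
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        # ... or edge plus corner neighbors (8-neighborhood).
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    out = [[0] * cols for _ in range(rows)]  # 0 means "not visited yet"
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if out[r][c]:
                continue  # cell already belongs to a labeled region
            # Start a new region and flood-fill it breadth-first.
            next_label += 1
            out[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not out[ny][nx]
                            and grid[ny][nx] == grid[r][c]):
                        out[ny][nx] = next_label
                        queue.append((ny, nx))
    return out


random_arr = [[1, 1, 3, 3], [1, 2, 2, 3], [2, 2, 1, 1], [3, 3, 3, 1]]
print(relabel_regions(random_arr))
# [[1, 1, 2, 2], [1, 3, 3, 2], [3, 3, 4, 4], [5, 5, 5, 4]]
```

On the example array the result matches the scikit-image output above. The neighborhood choice matters for inputs such as `[[1, 0], [0, 1]]`: with edge neighbors only, the two 1s are separate regions, while with diagonals included they merge into one.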
EDIT: As Jeon mentioned in the comments, scipy's scipy.ndimage.label (historically also exposed as scipy.ndimage.measurements.label) might also be a candidate if you do not want to add one more library. Thanks for the comment, Jeon!
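One caveat with the scipy route: `scipy.ndimage.label` labels connected groups of non-zero cells and does not distinguish between different non-zero values, so it is not a drop-in replacement for the scikit-image call. A rough sketch of how it could still be used, assuming one mask per original label value, is shown below. The function name `relabel_with_scipy` is made up for illustration, and the default structuring element (edge neighbors only) is assumed to be the desired neighborhood.

```python
import numpy as np
from scipy import ndimage


def relabel_with_scipy(arr):
    """Relabel equal-valued connected regions using scipy.ndimage.label.

    Hypothetical helper: runs ndimage.label once per distinct value in `arr`.
    """
    out = np.zeros(arr.shape, dtype=np.int64)
    offset = 0  # number of regions labeled so far
    for value in np.unique(arr):
        # Connected components of the cells equal to this value
        # (default structure: edge neighbors only).
        labeled_mask, n_regions = ndimage.label(arr == value)
        mask = labeled_mask > 0
        # Shift the per-value labels so they stay unique across values.
        out[mask] = labeled_mask[mask] + offset
        offset += n_regions
    return out


random_arr = np.array([[1, 1, 3, 3], [1, 2, 2, 3], [2, 2, 1, 1], [3, 3, 3, 1]])
relabeled = relabel_with_scipy(random_arr)
print(relabeled)
```

The label numbers will generally differ from the scikit-image output because regions are numbered value by value, but each separate region still gets its own label. Note that this still loops over the distinct label values, so with many distinct labels it is likely to be slower than the single scikit-image call.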