JoVe - 3 months ago
Python Question

python fancy indexing with a boolean masked array

I have a numpy masked array of data:

data = masked_array(data = [7 -- 7 1 8 -- 1 1 -- -- 3 -- -- 3 --],
mask = [False True False False False True False False True True False True True False True])


I have a flag of a specific type of data, which is a boolean masked array:

flag = masked_array(data = [True False False True -- -- -- False -- True -- -- -- -- True],
mask = [False False False False True True True False True False True True True True False])


I want to do something like
data[flag]
and get the following output:

output_wanted = [7 1 -- --]


which corresponds to the data elements where the flag is True. Instead I get this:

output_real = [7 -- 7 1 8 -- 1 1 -- -- 3 -- -- 3 --]


I did not copy the masks of the outputs, for clarity.

I don't mind getting an output the same size as the flag, as long as it selects the data I want (the elements where the flag is True). But I cannot figure out why the real output has these values!

Answer

If I reconstruct your arrays with:

In [28]: d=np.ma.masked_equal([7,0,7,1,8,0,1,1,0,0,3,0,0,3,0],0)

In [29]: f=np.ma.MaskedArray([True,False,False,True, False,False,False,False,True,True,True,True,True,True,True],[False, False, False, False, True, True, True, False, True, False, True, True, True, True, False])

In [30]: d
Out[30]: 
masked_array(data = [7 -- 7 1 8 -- 1 1 -- -- 3 -- -- 3 --],
             mask = [False  True False False False  True False False  True  True False  True
  True False  True],
       fill_value = 0)

In [31]: f
Out[31]: 
masked_array(data = [True False False True -- -- -- False -- True -- -- -- -- True],
             mask = [False False False False  True  True  True False  True False  True  True
  True  True False],
       fill_value = True)

The displays match yours, but I'm guessing at the values hidden under the mask.

In [32]: d[f]
Out[32]: 
masked_array(data = [7 1 -- -- 3 -- -- 3 --],
             mask = [False False  True  True False  True  True False  True],
       fill_value = 0)

In [33]: d[f.data]
Out[33]: 
masked_array(data = [7 1 -- -- 3 -- -- 3 --],
             mask = [False False  True  True False  True  True False  True],
       fill_value = 0)

Indexing with f is the same as indexing with its data attribute; its mask does nothing. Evidently the values under my mask are different from yours.

But if I index with a filled array, I get the desired array:

In [34]: d[f.filled(False)]
Out[34]: 
masked_array(data = [7 1 -- --],
             mask = [False False  True  True],
       fill_value = 0)

filled is used a lot in np.ma code, with the fill value depending on the np operation (e.g. 0 for sum vs. 1 for product). Masked arrays don't usually iterate over their values, skipping the masked ones; instead they convert the masked ones to innocuous values and use regular numpy operations. The other strategy is to remove the masked values with compressed.
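To make the two strategies concrete, here is a minimal sketch using the same reconstructed d as above. It only illustrates the idea of a neutral fill value; it is not a copy of what np.ma does internally, and the variable names are mine.

```python
import numpy as np

# Same reconstruction as above: 0 stands in for the masked entries.
d = np.ma.masked_equal([7, 0, 7, 1, 8, 0, 1, 1, 0, 0, 3, 0, 0, 3, 0], 0)

# Strategy 1: replace masked entries with a value that is neutral
# for the operation, then use an ordinary ndarray reduction.
sum_via_filled = d.filled(0).sum()     # 0 does not change a sum
prod_via_filled = d.filled(1).prod()   # 1 does not change a product

# Strategy 2: drop the masked entries entirely.
compressed = d.compressed()            # 1-d ndarray of the unmasked values

print(sum_via_filled, prod_via_filled)
print(compressed)
```

Both reductions should agree with d.sum() and d.prod(), which ignore the masked entries.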

indices = np.where(flag.filled(False)) is mentioned in another answer, but the plain boolean form, data[flag.filled(False)], works just as well.
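A short sketch comparing the two forms, using my reconstructed arrays (the values hidden under the flag's mask are still a guess, but filled(False) makes them irrelevant):

```python
import numpy as np

d = np.ma.masked_equal([7, 0, 7, 1, 8, 0, 1, 1, 0, 0, 3, 0, 0, 3, 0], 0)
# Flag as reconstructed in the answer; the data under the mask is a guess.
f = np.ma.MaskedArray(
    [True, False, False, True, False, False, False, False,
     True, True, True, True, True, True, True],
    mask=[False, False, False, False, True, True, True, False,
          True, False, True, True, True, True, False],
)

bool_index = f.filled(False)       # plain bool ndarray, masked slots become False
int_index = np.where(bool_index)   # tuple holding one array of positions

by_bool = d[bool_index]            # boolean indexing
by_where = d[int_index]            # integer indexing with the same positions

print(int_index[0])
print(by_bool)
print(by_where)
```

Both selections pick positions 0, 3, 9 and 14, so they give the same masked result.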

A masked array has a data attribute and a mask attribute. Masking does not change the data values directly; that task is left to methods like filled.
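A small illustration of that separation, with a made-up three-element array (the values and the -1 fill are arbitrary choices for the example):

```python
import numpy as np

# Hypothetical example array: the middle element is masked.
a = np.ma.MaskedArray([7, 5, 1], mask=[False, True, False])

raw = a.data       # underlying values; the masked slot still holds 5
hidden = a.mask    # boolean array marking which slots are masked

# filled() returns a new plain ndarray with the masked slots replaced;
# the masked array itself is left untouched.
replaced = a.filled(-1)

print(raw, hidden, replaced)
```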
