Thom Chubb - 1 month ago
Python Question

Python numpy masked array initialization

I use masked arrays all the time in my work, but initializing them is a bit clunky. Specifically, ma.zeros() and ma.empty() return masked arrays whose mask doesn't match the array's shape. I want a full-shape mask, set to True, so that any element I don't explicitly assign is masked by default.

In [4]: A=ma.zeros((3,))
...
masked_array(data = [ 0. 0. 0.],
mask = False,
fill_value = 1e+20)


I can subsequently assign the mask:

In [6]: A.mask=ones((3,))
...
masked_array(data = [-- -- --],
mask = [ True True True],
fill_value = 1e+20)


But why should I need two lines to initialize an array? Alternatively, I can skip ma.zeros() and specify the data and the mask in one line:

In [8]: A=ma.masked_array(zeros((3,)),mask=ones((3,)))
...
masked_array(data = [-- -- --],
mask = [ True True True],
fill_value = 1e+20)


But I think this is also clunky. I have trawled through the numpy.ma documentation but I can't find a neat way of dealing with this. Have I missed something obvious?

Answer

Well, the mask in ma.zeros is actually a special constant, ma.nomask, that corresponds to np.bool_(False). It's just a placeholder telling NumPy that the mask hasn't been set. Using nomask actually speeds up np.ma significantly: no need to keep track of where the masked values are if we know beforehand that there are none.
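A minimal sketch of the placeholder behaviour described above. It uses np.ma.getmask and np.ma.getmaskarray, which are not mentioned in the original answer; the variable names are only illustrative.

```python
import numpy as np

a = np.ma.zeros((3,))

# getmask returns the stored mask as-is; for a freshly created array
# this is the nomask placeholder rather than a full boolean array.
mask_is_placeholder = np.ma.getmask(a) is np.ma.nomask

# getmaskarray always returns a boolean array with the shape of the data,
# expanding nomask into an array of False.
full_mask = np.ma.getmaskarray(a)

print(mask_is_placeholder)
print(full_mask)
```

If you need a mask you can index element by element, getmaskarray is probably the safer thing to read from, since it has the data's shape whether or not a mask was ever set.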

The best approach is not to set the mask explicitly if you don't need it, and to let np.ma set it when needed (e.g., when you end up trying to take the log of a negative number).
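As a rough illustration of np.ma creating the mask on its own, here is the log example from the sentence above. The input values are made up for the demonstration.

```python
import numpy as np

# No mask is given, so x starts out with the nomask placeholder.
x = np.ma.array([1.0, -1.0, 4.0])

# np.ma.log masks entries outside the domain of log instead of returning nan.
y = np.ma.log(x)

print(y)
print(y.mask)
```

The result carries a full boolean mask even though the input never had one, with only the negative entry masked.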


Side note #1: to set the mask to an array of False with the same shape as your input, use

np.ma.array(..., mask=False)

That's easier to type. Note that it's really the Python False, not np.ma.nomask. Similarly, use mask=True to mask all of your inputs (i.e., the mask will be a bool ndarray full of True, with the same shape as the data).
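Applied to the question, the mask=True form gives a one-line, fully masked initialization. The sketch below assumes the default soft mask, under which assigning a value to an element unmasks it. np.ma.masked_all is not mentioned in the original answer; it is included as a possible alternative.

```python
import numpy as np

# Scalar False expands to a full-shape mask of False.
unmasked = np.ma.array(np.zeros(3), mask=False)

# Scalar True expands to a full-shape mask of True: everything starts masked.
all_masked = np.ma.array(np.zeros(3), mask=True)

# With the default soft mask, assigning a value unmasks that element,
# so only the elements that were never assigned stay masked.
all_masked[0] = 5.0

# Alternative: an uninitialized array with every element masked.
via_masked_all = np.ma.masked_all((3,))

print(unmasked.mask)
print(all_masked)
print(via_masked_all.mask)
```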


Side note #2: If you need to set the mask after initialization, don't assign to .mask directly; instead, assign the special value np.ma.masked to the elements you want masked, which is safer:

a[:] = np.ma.masked
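A short sketch of that assignment in context, first for a single element and then for the whole array. The variable names are only illustrative.

```python
import numpy as np

a = np.ma.zeros(3)

# Masking one element creates a full-shape mask behind the scenes.
a[1] = np.ma.masked
one_masked = a.mask.tolist()

# Masking everything through a slice assignment.
a[:] = np.ma.masked

print(one_masked)
print(a)
print(a.count())  # number of unmasked elements
```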