Nick Nick - 3 months ago 9
Python Question

Slicing is adding a 3rd Dimension to my Array - Not sure why

I am trying to index the testdata with just the labels that equal 2 and 3. However, when I run this code, it turns my array from 2D (100 x 100) into 3D (100 x 1 x 100).

Can anyone explain why it is doing this? The last line in the code is the culprit, but I am not sure why it is happening.

labels = testdata[:,0]
num2 = numpy.nonzero(labels == 2)
num2 = numpy.transpose(num2)
num3 = numpy.nonzero(labels == 3)
num3 = numpy.transpose(num3)
num = numpy.vstack([num2,num3])
testdata = testdata[num,:]

Answer

When something is puzzling, print the intermediate values. Better yet, run a test case in an interactive shell so you can check each value and understand what is going on. Keep track of the shapes.

Looks like labels is a 1d array of numbers like:

In [212]: labels=np.array([0,1,2,2,3,2,0,3,2])

indexes where labels is 2 or 3:

In [213]: num2=np.nonzero(labels==2)
In [214]: num2
Out[214]: (array([2, 3, 5, 8], dtype=int32),)
In [215]: num3=np.nonzero(labels==3)

Here's the key step: what is the purpose of the transpose? Note that num2 is a tuple containing one 1d array.

In [216]: num2=np.transpose(num2)
In [217]: num3=np.transpose(num3)
In [218]: num2
Out[218]: 
array([[2],
       [3],
       [5],
       [8]], dtype=int32)

After the transpose, num2 is a column array with shape (4,1).

Joining them vertically produces a (6,1) array:

In [220]: num=np.vstack([num2,num3])
In [221]: num
Out[221]: 
array([[2],
       [3],
       [5],
       [8],
       [4],
       [7]], dtype=int32)
In [222]: num.shape
Out[222]: (6, 1)
In [223]: labels[num]
Out[223]: 
array([[2],
       [2],
       [2],
       [2],
       [3],
       [3]])
In [224]: labels[num].shape
Out[224]: (6, 1)

Indexing the 1d array with that array produces another array with the same shape as the index. Indexing testdata[num,:] does the same thing, but with the remaining last dimension appended: an (n,1) index into a (100,100) array gives an (n,1,100) result.
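As a minimal sketch of the effect and one possible fix, the snippet below rebuilds the question's steps on a small made-up array. The 9 x 4 `testdata` here is a hypothetical stand-in for the real data, and flattening the index with `ravel()` is just one way to get a 2D result, not necessarily what the original code should do.

```python
import numpy as np

# Hypothetical stand-in for testdata: 9 rows, 4 columns, labels in column 0.
testdata = np.zeros((9, 4))
testdata[:, 0] = [0, 1, 2, 2, 3, 2, 0, 3, 2]
labels = testdata[:, 0]

# Same steps as in the question.
num2 = np.transpose(np.nonzero(labels == 2))  # shape (4, 1)
num3 = np.transpose(np.nonzero(labels == 3))  # shape (2, 1)
num = np.vstack([num2, num3])                 # shape (6, 1)

# A (6, 1) index array keeps its own shape in the result.
bad = testdata[num, :]

# Flattening the index to shape (6,) removes the extra axis.
good = testdata[num.ravel(), :]

print(bad.shape)   # (6, 1, 4)
print(good.shape)  # (6, 4)
```

The only change is the shape of the index: the selected rows are the same in both cases.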


If I index a (3,4) array with a (2,5) array in the 1st dimension, the result is a (2,5,4) array:

In [227]: np.ones((3,4))[np.ones((2,5),int),:].shape
Out[227]: (2, 5, 4)
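Since the result takes the shape of the index array, keeping the index 1d avoids the extra dimension in the first place. The sketch below shows two ways to do that, again on a hypothetical 9 x 4 `testdata`. Which one fits depends on whether the rows need to be grouped (all 2s, then all 3s) or can stay in their original order; that requirement is an assumption, as the question does not say.

```python
import numpy as np

# Hypothetical stand-in for testdata: 9 rows, 4 columns, labels in column 0.
testdata = np.zeros((9, 4))
testdata[:, 0] = [0, 1, 2, 2, 3, 2, 0, 3, 2]
labels = testdata[:, 0]

# Option 1: boolean mask. Rows stay in their original order.
mask = (labels == 2) | (labels == 3)
selected = testdata[mask, :]

# Option 2: take the 1d array out of each nonzero tuple and join them.
# Rows come out grouped: label 2 first, then label 3.
idx = np.concatenate([np.nonzero(labels == 2)[0],
                      np.nonzero(labels == 3)[0]])
grouped = testdata[idx, :]

print(selected.shape)  # (6, 4)
print(grouped.shape)   # (6, 4)
```

Neither version needs the transpose or vstack steps, because the index never becomes 2d.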