broncoAbierto - 5 months ago 42
Python Question

Matrix indexing in Numpy

I was growing confused during the development of a small Python script involving matrix operations, so I fired up a shell to play around with a toy example and develop a better understanding of matrix indexing in Numpy.

This is what I did:

```
>>> import numpy as np
>>> A = np.matrix([1,2,3])
>>> A
matrix([[1, 2, 3]])
>>> A[0]
matrix([[1, 2, 3]])
>>> A[0][0]
matrix([[1, 2, 3]])
>>> A[0][0][0]
matrix([[1, 2, 3]])
>>> A[0][0][0][0]
matrix([[1, 2, 3]])
```

As you can imagine, this has not helped me develop a better understanding of matrix indexing in Numpy. This behavior would make sense for something that I would describe as "An array of itself", but I doubt anyone in their right mind would choose that as a model for matrices in a scientific library.

What is, then, the logic to the output I obtained? Why would the first element of a matrix object be itself?

PS: I know how to obtain the first entry of the matrix. What I am interested in is the logic behind this design decision.

EDIT: I'm not asking how to access a matrix element, or why a matrix row behaves like a matrix. I'm asking for a definition of the behavior of a matrix when indexed with a single number. It's an action typical of arrays, but the resulting behavior is nothing like the one you would expect from an array. I would like to know how this is implemented and what's the logic behind the design decision.

Look at the shape after indexing:

```
In [295]: A=np.matrix([1,2,3])
In [296]: A.shape
Out[296]: (1, 3)
In [297]: A[0]
Out[297]: matrix([[1, 2, 3]])
In [298]: A[0].shape
Out[298]: (1, 3)
```

The key to this behavior is that `np.matrix` is always 2d. So even if you select one row (`A[0,:]`), the result is still 2d, shape `(1,3)`. So you can string along as many `[0]` as you like, and nothing new happens.

What are you trying to accomplish with `A[0][0]`? The same as `A[0,0]`? For the base `np.ndarray` class these are equivalent.

Note that the Python interpreter translates indexing into `__getitem__` calls:

```
A.__getitem__(0).__getitem__(0)   # A[0][0]
A.__getitem__((0,0))              # A[0,0]
```

`[0][0]` is 2 indexing operations, not one. So the effect of the second `[0]` depends on what the first produces.
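The difference between the two forms can be seen without NumPy at all. The following is a minimal sketch; `Recorder` is a hypothetical helper class invented for this illustration, and it simply logs every key that Python passes to `__getitem__`.

```python
class Recorder:
    """Hypothetical helper that logs each key passed to __getitem__."""

    def __init__(self, log):
        self.log = log

    def __getitem__(self, key):
        self.log.append(key)
        return self  # return self so that indexing can be chained


chained_log = []
Recorder(chained_log)[0][0]   # two separate calls, each receives the int 0

tuple_log = []
Recorder(tuple_log)[0, 0]     # one call, receives the tuple (0, 0)

print(chained_log)  # [0, 0]
print(tuple_log)    # [(0, 0)]
```

So whether `[0][0]` and `[0,0]` agree depends entirely on what the class returns from the first call.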

For an array `A[0,0]` is equivalent to `A[0,:][0]`. But for a matrix, you need to do:

```
In [299]: A[0,:][:,0]
Out[299]: matrix([[1]])  # still 2d
```
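As a side-by-side check, here is a small sketch comparing the same indexing expressions on a plain `ndarray` and on a `matrix`. The variable names `arr` and `mat` are only for this illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3]])   # plain 2d ndarray, shape (1, 3)
mat = np.matrix([1, 2, 3])    # matrix, always 2d, shape (1, 3)

# Selecting a row drops a dimension for ndarray, but not for matrix.
print(arr[0].shape)           # (3,)
print(mat[0].shape)           # (1, 3)

# Chained indexing: reaches the element for ndarray, goes nowhere for matrix.
print(arr[0][0])              # 1
print(mat[0][0].shape)        # (1, 3)

# A single tuple index reaches the element in both cases.
print(arr[0, 0])              # 1
print(mat[0, 0])              # 1

# Row then column on a matrix still leaves a 2d result.
print(mat[0, :][:, 0].shape)  # (1, 1)
```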

=============================

> "An array of itself", but I doubt anyone in their right mind would choose that as a model for matrices in a scientific library.

> What is, then, the logic to the output I obtained? Why would the first element of a matrix object be itself?

> In addition, A[0,:] is not the same as A[0]

In light of these comments let me suggest some clarifications.

`A[0]` does not mean 'return the 1st element'. It means select along the 1st axis. For a 1d array that means the 1st item. For a 2d array it means the 1st row. For `ndarray` that would be a 1d array, but for a `matrix` it is another `matrix`. So for a 2d array or matrix, `A[i,:]` is the same thing as `A[i]`.
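A short sketch of "select along the first axis" for the three cases mentioned above. The array contents and variable names are arbitrary values chosen for illustration.

```python
import numpy as np

a1 = np.array([10, 20, 30])             # 1d array
a2 = np.array([[1, 2, 3], [4, 5, 6]])   # 2d array
m2 = np.matrix(a2)                      # 2d matrix with the same values

first_item = a1[0]    # 1d array: a single item
first_row = a2[0]     # 2d array: the first row, as a 1d array
first_mrow = m2[0]    # matrix: the first row, as another matrix

print(first_item)         # 10
print(first_row.shape)    # (3,)
print(first_mrow.shape)   # (1, 3)

# For both 2d types, A[i] and A[i, :] select the same thing.
same_for_array = np.array_equal(a2[1], a2[1, :])
same_for_matrix = np.array_equal(m2[1], m2[1, :])
print(same_for_array, same_for_matrix)  # True True
```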

`A[0]` does not just return itself. It returns a new matrix. Different `id`:

```
In [303]: id(A)
Out[303]: 2994367932
In [304]: id(A[0])
Out[304]: 2994532108
```

It may have the same data, shape, and strides, but it is a new object. It is just as distinct as the `i`th row of a multi-row matrix.
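The "new object, same data" point can be checked directly. This is a small sketch using `np.shares_memory`; the value `99` is an arbitrary marker written through the row to show that both objects look at the same buffer.

```python
import numpy as np

A = np.matrix([1, 2, 3])
row = A[0]

is_same_object = row is A                # a distinct Python object
shares_data = np.shares_memory(A, row)   # backed by the same buffer

print(is_same_object)   # False
print(shares_data)      # True

# Writing through the row is visible in the original matrix.
row[0, 0] = 99
print(A[0, 0])          # 99
```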

Most of the `matrix`-specific behavior is defined in `numpy/matrixlib/defmatrix.py`. I was going to suggest looking at the `matrix.__getitem__` method, but most of the work is done in `np.ndarray.__getitem__`.
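To give a feel for how a subclass can produce this behavior, here is a toy sketch. `Always2D` is a hypothetical class written for this answer; it is not NumPy's actual `matrix` code and only mimics the idea of letting `ndarray.__getitem__` do the indexing and then forcing 1d results back to 2d.

```python
import numpy as np


class Always2D(np.ndarray):
    """Hypothetical toy subclass; NOT NumPy's actual matrix implementation."""

    def __getitem__(self, index):
        # Let the base class do the real indexing work.
        out = super().__getitem__(index)
        # Scalars (e.g. from A[0, 0]) pass through unchanged.
        if not isinstance(out, np.ndarray):
            return out
        if out.ndim == 1:
            # A scalar in the second index position means a column was picked.
            is_column = (
                isinstance(index, tuple)
                and len(index) > 1
                and np.isscalar(index[1])
            )
            return out.reshape((-1, 1) if is_column else (1, -1))
        return out


A = np.array([[1, 2, 3], [4, 5, 6]]).view(Always2D)

print(A[0].shape)         # (1, 3)
print(A[0][0][0].shape)   # (1, 3)
print(A[:, 0].shape)      # (2, 1)
print(A[0, 0])            # 1
```

With a rule like this, any number of chained `[0]` operations on a one-row object keeps returning a one-row 2d object, which matches the output in the question.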

The `np.matrix` class was added to `numpy` as a convenience for old-school MATLAB programmers. `numpy` arrays can have almost any number of dimensions (0, 1, ...). MATLAB originally allowed only 2, though a release around 2000 generalized that to 2 or more.
