bobo - 5 months ago 23

Python Question

I've tried searching StackOverflow, googling, and even using symbolhound to do character searches, but was unable to find an answer. Specifically, I'm confused about Ch. 1 of Nielsen's *Neural Networks and Deep Learning*, where he says "It is assumed that the input `a` is an `(n, 1) Numpy ndarray`, not a `(n,) vector`."

At first I thought

`(n,)`

`(n,)`

`(n, 1)`

For reference

`a`

EDIT: This question equivocates between a "one-column vector" (there's no such thing) and a "one-column matrix" (does actually exist). Same for "one-row vector" and "one-row matrix".

A vector is only a list of numbers, or (equivalently) a list of scalar transformations on the basis vectors of a vector space. A vector might

Be aware that in neither case are we discussing a one-dimensional vector, which would be a vector defined by only one number (unless, trivially, n==1, in which case the concept of a "column" or "row" distinction would be meaningless).

Answer

In `numpy`, an array can have any number of dimensions: 0, 1, 2, etc.

The typical 2d array has shape `(n,m)` (this is a Python tuple). We tend to describe this as having n rows and m columns. So a `(n,1)` array has just 1 column, and a `(1,m)` array has just 1 row.

But because an array may have just 1 dimension, it is possible to have a shape `(n,)` (Python notation for a 1-element tuple).
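As a small illustration of the shapes described so far, the sketch below builds a 0d, a 1d, and two 2d arrays and collects their `shape` tuples and `ndim` values. The variable names are only for illustration.

```python
import numpy as np

n = 4
scalar = np.array(3.0)      # 0-d array, shape ()
v = np.arange(n)            # 1-d array, shape (n,)
col = np.zeros((n, 1))      # 2-d array with one column, shape (n, 1)
row = np.zeros((1, n))      # 2-d array with one row, shape (1, n)

# shape is always a plain Python tuple; its length equals ndim
shapes = [x.shape for x in (scalar, v, col, row)]
ndims = [x.ndim for x in (scalar, v, col, row)]

for s, d in zip(shapes, ndims):
    print(s, d)
```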

For many purposes `(n,)`, `(1,n)`, and `(n,1)` arrays are equivalent (as is a 4d `(1,n,1,1)` array). They all have `n` elements, and can be reshaped into each other.
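A minimal sketch of that equivalence: the same `n` values are reshaped into each of the shapes mentioned above and then flattened back. The names here are illustrative only.

```python
import numpy as np

n = 5
flat = np.arange(n)                  # shape (n,)
row = flat.reshape(1, n)             # shape (1, n)
col = flat.reshape(n, 1)             # shape (n, 1)
four_d = flat.reshape(1, n, 1, 1)    # shape (1, n, 1, 1)

# Every version holds the same n elements
same_size = all(x.size == n for x in (flat, row, col, four_d))

# Going back from (n, 1) to (n,); col.ravel() would also work
back = col.reshape(n)
```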

But sometimes that extra `1` dimension matters. A (1,m) array can multiply a (n,1) array to produce a (n,m) array through broadcasting. A (n,1) array can be indexed like a (n,m) array, with 2 indices, as in `x[:,0]`, whereas a (n,) array only accepts a single index, as in `x[0]`.
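The two cases where the extra dimension matters can be sketched as follows. This is only an illustration with made-up sizes: elementwise multiplication of a (n,1) by a (1,m) array, and indexing with one versus two indices.

```python
import numpy as np

n, m = 3, 4
col = np.arange(n).reshape(n, 1)   # shape (n, 1)
row = np.arange(m).reshape(1, m)   # shape (1, m)
flat = np.arange(n)                # shape (n,)

# Broadcasting stretches both operands to (n, m)
outer = col * row

# A (n, 1) array takes two indices; the result is 1-d
first_column = col[:, 0]

# A (n,) array takes one index
first_item = flat[0]

# Two indices on a 1-d array raise IndexError
try:
    flat[:, 0]
    two_index_ok = True
except IndexError:
    two_index_ok = False
```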

MATLAB matrices are always 2d (or higher), so people transferring ideas from MATLAB tend to expect 2 dimensions. There is a `np.matrix` subclass that is supposed to imitate that.
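A brief sketch of how `np.matrix` differs from a plain array: it stays 2d even when built from a flat list or indexed by row. Note that current NumPy documentation discourages `np.matrix` for new code, so this is shown for comparison only.

```python
import numpy as np

arr = np.array([1, 2, 3])     # plain ndarray, shape (3,)
mat = np.matrix([1, 2, 3])    # matrix subclass, forced to 2-d: shape (1, 3)

# Transposing a 1-d array changes nothing; a matrix becomes a column
arr_t_shape = arr.T.shape     # (3,)
mat_t_shape = mat.T.shape     # (3, 1)

# Row indexing a matrix still gives a 2-d result
mat_row_shape = mat[0].shape  # (1, 3)
```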

For numpy programmers the distinctions between vector, row vector, column vector, and matrix are loose and relatively unimportant, or the usage is derived from the application rather than from `numpy` itself. I think that's what's happening with this network book: the notation and expectations come from outside of `numpy`.
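One plausible reason an application would insist on `(n, 1)` inputs can be sketched like this. It assumes a layer is computed as `np.dot(w, a) + b` with the bias `b` stored as a `(m, 1)` array. The sizes and names below are made up for illustration and are not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 3, 2
w = rng.standard_normal((n_out, n_in))   # weights, shape (n_out, n_in)
b = rng.standard_normal((n_out, 1))      # biases stored as a column, shape (n_out, 1)

a_col = rng.standard_normal((n_in, 1))   # input as (n, 1)
a_flat = a_col.ravel()                   # same values as (n,)

# (n_out, 1) + (n_out, 1) stays (n_out, 1), as intended
z_col = np.dot(w, a_col) + b

# np.dot gives (n_out,), and adding the (n_out, 1) bias
# silently broadcasts to (n_out, n_out)
z_flat = np.dot(w, a_flat) + b

print(z_col.shape, z_flat.shape)
```

With the flat input no error is raised; the result simply has the wrong shape, which is easy to miss.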