Ben Ben - 2 months ago 10
Python Question

How do I create an empty array/matrix in NumPy?

I can't figure out how to use an array or matrix in the way that I would normally use a list. I want to create an empty array (or matrix) and then add one column (or row) to it at a time.

At the moment the only way I can find to do this is like:

mat = None
for col in columns:
if mat is None:
mat = col
else:
mat = hstack((mat, col))


Whereas if it were a list, I'd do something like this:

list = []
for item in data:
list.append(item)


Is there a way to use that kind of notation for NumPy arrays or matrices?

Answer

You have the wrong mental model for using NumPy efficiently. NumPy arrays are stored in contiguous blocks of memory. If you want to add rows or columns to an existing array, the entire array needs to be copied to a new block of memory, creating gaps for the new elements to be stored. This is very inefficient if done repeatedly to build an array.

In the case of adding rows, your best bet is to create an array that is as big as your data set will eventually be, and then add data to it row-by-row:

>>> import numpy
>>> a = numpy.zeros(shape=(5,2))
>>> a
array([[ 0.,  0.],
   [ 0.,  0.],
   [ 0.,  0.],
   [ 0.,  0.],
   [ 0.,  0.]])
>>> a[0] = [1,2]
>>> a[1] = [2,3]
>>> a
array([[ 1.,  2.],
   [ 2.,  3.],
   [ 0.,  0.],
   [ 0.,  0.],
   [ 0.,  0.]])