Fourier Fourier - 1 month ago 21
Python Question

Shape of a structured array in numpy

I am trying to preallocate an empty 19x5 array and define its data type at the same time, using the following code:

import numpy as np
arr=np.empty((19,5),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

The result is somewhat unexpected, yielding a 19*5*5 array.
However, trying:

arr=np.empty((19,1),dtype=[('a','|S1'),('b', 'f4'),('c', 'i'),('d', 'f4'),('e', 'f4')])

gives the proper length per row (5 fields), which apparently looks like a 1D array.

When I try to write this to a file, only this format is accepted:

np.savetxt(file, arr, delimiter=',', fmt='%s')

This tells me I am dealing with a single string.
Is there no way to get a 19x5 shaped structured array that is not flattened?

The main problem arises when writing this with savetxt. I want to have a csv file that has all the 5 column values. As this is handled as a string it gives the wrong output.


Typically the fields of a structured array replace the columns of a 2d array. Often people load a csv with genfromtxt and wonder why the result is 1d. As you found you can make a 2d array with a compound dtype, but each element will have multiple values - as specified by the dtype.
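
A minimal sketch of the difference, using the dtype from the question. The names grid and flat are only illustrative, and np.zeros is used instead of np.empty so the contents are predictable:

```python
import numpy as np

# Same compound dtype as in the question: five named fields per record.
dt = np.dtype([('a', '|S1'), ('b', 'f4'), ('c', 'i'), ('d', 'f4'), ('e', 'f4')])

# 2d shape: 19 x 5 = 95 records, and every record carries all five fields.
grid = np.zeros((19, 5), dtype=dt)

# 1d shape: 19 records; the five fields play the role of the columns.
flat = np.zeros(19, dtype=dt)

print(grid.shape)        # (19, 5)
print(flat.shape)        # (19,)
print(flat['b'].shape)   # (19,)  one "column" selected by field name
print(len(dt.names))     # 5
```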

Normally you'd initialize that array with a 1d shape, e.g. (19,).

Note that you have to fill values by field or with a list of tuples.
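
As a rough illustration of both ways to fill such an array. The values are made up, and the first field is declared as 'U1' rather than '|S1' here (my substitution, so the text prints as plain strings in Python 3):

```python
import numpy as np

# Assumed variant of the question's dtype: 'U1' (unicode) instead of '|S1' (bytes).
dt = np.dtype([('a', 'U1'), ('b', 'f4'), ('c', 'i4'), ('d', 'f4'), ('e', 'f4')])

# Preallocate 19 records with a 1d shape.
arr = np.zeros(19, dtype=dt)

# Fill by field: each field behaves like an ordinary 1d array.
arr['a'] = 'x'
arr['b'] = np.arange(19, dtype='f4')

# Fill a single record with a tuple (one value per field).
arr[0] = ('y', 1.5, 2, 3.5, 4.5)

# Or build a whole array from a list of tuples.
rows = [('p', 0.5, 1, 2.0, 3.0), ('q', 1.5, 2, 4.0, 6.0)]
small = np.array(rows, dtype=dt)

print(arr.shape)            # (19,)
print(small['c'].tolist())  # [1, 2]
```

Note that the records must be tuples; a list of lists would not be interpreted as one value per field.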

I don't have experience using savetxt with a structured array, and can't run tests on this tablet. But there probably are SO questions that help.

savetxt iterates on an array, and writes fmt%tuple(row), where fmt is built from your input.

I'd suggest trying fmt='%s %s %s %s %s' - one % format for each field in the dtype. See the savetxt docs. Also I don't know if a (19,) array will behave better than a (19,1).

Experiment with formatting elements of your array. They should look like tuples to the formatter. If not try tolist() or tuple(A[0]).
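
One way to run that experiment, as a sketch only. The sample record and the comma-separated format string are my own choices, and the first field uses 'U1' instead of '|S1' because a bytes field formatted with %s shows up as b'...' in Python 3:

```python
import numpy as np

# Assumed dtype: same fields as the question, but unicode text for field 'a'.
dt = np.dtype([('a', 'U1'), ('b', 'f4'), ('c', 'i4'), ('d', 'f4'), ('e', 'f4')])
arr = np.array([('y', 1.5, 2, 3.5, 4.5)], dtype=dt)

# One % specifier per field, in dtype order.
fmt = '%s,%.2f,%d,%.2f,%.2f'

# A record converts to a tuple, which is what % formatting expects.
line_from_tuple = fmt % tuple(arr[0])

# tolist() on the array gives a list of plain Python tuples.
line_from_list = fmt % arr.tolist()[0]

print(line_from_tuple)  # y,1.50,2,3.50,4.50
print(line_from_list)   # y,1.50,2,3.50,4.50
```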

Here's an answer that is almost good enough to be a duplicate:

 ab = np.zeros(names.size, dtype=[('var1', 'S6'), ('var2', float)])
 np.savetxt('test.txt', ab, fmt="%10s %10.3f")


savetxt can only handle a 1d structured array, because of the tuple formatting.
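
Putting this together for the CSV case in the question, here is a small sketch. It assumes a 1d array, made-up values, 'U1' in place of '|S1' to avoid the b'...' prefix in Python 3, and an in-memory buffer standing in for the real file:

```python
import io
import numpy as np

# Assumed dtype: the question's five fields, with unicode text for field 'a'.
dt = np.dtype([('a', 'U1'), ('b', 'f4'), ('c', 'i4'), ('d', 'f4'), ('e', 'f4')])
arr = np.array([('x', 1.5, 2, 3.5, 4.5),
                ('y', 0.25, 7, 8.0, 9.5)], dtype=dt)

# A list of formats, one per field; savetxt joins them with the delimiter.
buf = io.StringIO()
np.savetxt(buf, arr, delimiter=',', fmt=['%s', '%.2f', '%d', '%.2f', '%.2f'])
text = buf.getvalue()

print(text)
# x,1.50,2,3.50,4.50
# y,0.25,7,8.00,9.50
```

If the array was created with shape (19,1), something like arr.ravel() should give the 1d view that savetxt needs.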