I can't post the data being imported, because it's too much. But, it has both number and string fields and is 5543 rows and 137 columns. I import data with this code (ndnames and ndtypes holds the column names and column datatypes):
npArray2 = np.genfromtxt(fileName,
pdArray = pd.read_csv(fileName,
npArray = pd.DataFrame.as_matrix(pdArray)
npArray2 is a 1d structured array, 5543 elements and 137 fields.
npArray2.dtype look like, or equivalently what is
ndtypes, because the
dtype is built from the types and names that you provided. "void7520" is a way of identifying a record of this array, but tells us little except the size (in bytes?).
If all fields of the dtype are numeric, even better yet if they are all the same numeric dtype (int, float), then it is fairly easy to convert it to a 2d array with 137 columns (2nd dim).
view can be used.
it has both number and string fields - you can't convert it to a 2d array of numbers; it could be an array of strings, but you can't do numeric math on strings.)
But if the dtypes are mixed then you can't convert it. All elements of the 2d array have be the same dtype. You have to use the structured array approach if you want mixed types. (well there is the
dtype=object, but let's not go there).
pandas is going the
object route. Evidently it thinks the only way to make an array from this data is to let each element be its own type. And the math of object arrays is severely limited. They are, in effect a glorified, or debased, list.