I have several different data files that I need to import using genfromtxt. Each data file has different content. For example, file 1 may have all floats, file 2 may have all strings, and file 3 may have a combination of floats and strings, etc. The number of columns also varies from file to file, and since there are hundreds of files, I don't know which columns are floats and which are strings in each file. However, all the entries in a given column are the same data type.
Is there a way to set up a converter for genfromtxt that will detect the type of data in each column and convert it to the right data type?
If you're able to use the Pandas library, pandas.read_csv is much more generally useful than np.genfromtxt, and it will automatically handle the kind of type inference mentioned in your question. The result will be a DataFrame, but you can get a NumPy array out of it in one of several ways, e.g.
```python
import pandas as pd

data = pd.read_csv(filename)

# Get a NumPy array; this will be an object array
# if the data has mixed/incompatible types
arr = data.values

# Get a record array; this is how NumPy handles
# mixed types in a single array
arr = data.to_records()
```
pd.read_csv has dozens of options for various forms of text inputs; see more in the pandas.read_csv documentation.
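To see the type inference in action, here is a minimal sketch that uses an in-memory file (io.StringIO) to stand in for one of your data files; the column names and values are made up for illustration:

```python
import io
import pandas as pd

# Hypothetical file contents: column "a" is all floats,
# column "b" is all strings, mirroring the mixed-type case.
csv_text = """a,b
1.5,x
2.5,y
"""

data = pd.read_csv(io.StringIO(csv_text))

# read_csv infers a float dtype for "a" and object (string) for "b"
print(data.dtypes)

# Record array preserving the per-column dtypes
arr = data.to_records(index=False)
print(arr)
```

The same call works on an actual path instead of a StringIO object, so you can loop it over your hundreds of files without knowing their column types in advance.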