Michael Jancen-Widmer - 1 month ago
Python Question

Python Numpy can't convert Strings to Integers from CSV file

I read a CSV file and everything works except converting the values to integers, since all the values in it are strings. I tried to convert column by column in a loop like this:

counter = 0
while counter < len(data):
    try:
        data[counter,0] = data[counter,0].astype(int) # ID
        data[counter,1] = data[counter,1].astype(int) # Survived
    except ValueError:
        pass
    counter = counter + 1


As you can see, it is the Titanic dataset I am trying to work with.

print (type(data[0,0]))


And printing the type of a value gives me
<class 'numpy.str_'>


How do I properly convert the columns to integers? Thanks in advance!

Answer

The problem is that you're converting one item at a time without changing the dtype of data. data.dtype tells you the element type of the ndarray, and you can't change it one cell at a time: the entire ndarray has a single dtype, so each integer you assign is cast straight back to a string when it is stored. Try this instead: data = data.astype(int). That converts all rows and all columns to integers at once, as long as every cell holds a valid integer string; otherwise it raises a ValueError.
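Here is a minimal sketch of the difference. The small array is a hypothetical stand-in for your CSV contents (not the real Titanic file), and the names still_str, as_int and first_col are only illustrative:

```python
import numpy as np

# Hypothetical stand-in for the loaded CSV: every cell is a string.
data = np.array([["1", "0"], ["2", "1"], ["3", "1"]])

# Assigning an int into a string array casts it straight back to a string,
# so the per-cell loop never changes the array's dtype.
data[0, 0] = int(data[0, 0])
still_str = type(data[0, 0])

# astype returns a NEW array with an integer dtype; rebind the name to keep it.
as_int = data.astype(int)

# If only some columns hold integers, convert just that slice.
first_col = data[:, 0].astype(int)

print(still_str, as_int.dtype, first_col)
```

If your real array also holds non-numeric columns (names, for example), converting only the numeric slice as in the last step is probably the safer route, since astype(int) on the whole array would raise a ValueError.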
