inja inja - 1 year ago 76
Python Question

Pandas DataFrame: Finding Index Value for a Column

I have a DataFrame that has columns such as ID, Name, Specification, Time.

I read the file like this:

mc = pd.read_csv("C:\\data.csv", sep = ",", header = 0, dtype = str)

When I checked the column values, I found that the ID column name had a weird character in front of it, like this:

['\ufeffID', 'Name', 'Specification', 'Time']

After this, I assigned the name ID to that column like this:

mc.columns.values[0] = "ID"

When I checked the column values again, I got this result:

array(['ID', 'Name', 'Specification', 'Time'])

Then, I checked with,

"ID" in mc.columns.values

which gave me True.

Then I tried to access the ID column, and I got this error:

KeyError: 'ID'

I want to get the values of the ID column and get rid of the weird character in front of the column name. Is there any way to solve this? Any help would be appreciated. Thank you in advance.

Answer Source

That's a UTF-16 BOM (byte order mark). Pass encoding='utf-16' to read_csv:

mc = pd.read_csv("C:\\data.csv", sep=",", header=0, dtype=str, encoding='utf-16')

The above should work. To be specific, FE FF is the BOM for UTF-16 big-endian.
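
Here is a minimal sketch of how the encoding argument affects the header, assuming pandas is installed. The sample row and the read_with_encoding helper are made up for illustration and are not from the original post. The second call uses encoding='utf-8-sig', which is an assumption on my part: if the file turns out to be UTF-8 with a BOM rather than UTF-16, that codec is the one that strips the mark.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the contents of data.csv
sample = "ID,Name,Specification,Time\n1,Widget,Small,10:00\n"


def read_with_encoding(raw_bytes, encoding):
    """Write raw_bytes to a temporary CSV file and read it back with pandas."""
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
        tmp.write(raw_bytes)
        path = tmp.name
    try:
        return pd.read_csv(path, sep=",", header=0, dtype=str, encoding=encoding)
    finally:
        os.remove(path)


# Python's "utf-16" codec writes a BOM when encoding and consumes it when decoding
utf16_df = read_with_encoding(sample.encode("utf-16"), "utf-16")

# "utf-8-sig" does the same for a UTF-8 file that starts with a BOM
utf8_df = read_with_encoding(sample.encode("utf-8-sig"), "utf-8-sig")

print(list(utf16_df.columns))
print(list(utf8_df.columns))
```

If the first column name still shows a stray character after trying one codec, the file is probably in the other encoding.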

Also, you should use rename rather than trying to overwrite the NumPy array value:

mc.rename(columns={mc.columns[0]: "ID"}, inplace=True)

This should work correctly.
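
As a rough illustration of the rename approach, here is a small self-contained sketch. The DataFrame is built by hand with a '\ufeffID' label to mimic the problem, so the data values are hypothetical. The second part, stripping the mark from every label with a list comprehension, is an alternative I'm adding in case you can't re-read the file; it is not part of the suggestion above.

```python
import pandas as pd

# Hypothetical frame whose first column label carries a leading BOM character
df = pd.DataFrame(
    [["1", "Widget", "Small", "10:00"]],
    columns=["\ufeffID", "Name", "Specification", "Time"],
)

# Keep an untouched copy for the alternative shown further down
stripped = df.copy()

# Rename through the pandas API so the column index is rebuilt properly
df.rename(columns={df.columns[0]: "ID"}, inplace=True)
print(df["ID"].tolist())

# Alternative: strip a leading BOM from every label by assigning a new list
stripped.columns = [str(label).lstrip("\ufeff") for label in stripped.columns]
print(list(stripped.columns))
```

Both variants give pandas a fresh set of labels, so looking up the ID column afterwards works as expected.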
