Rohan Bapat - 7 months ago
Python Question

Remove special characters from column headers

I have a dictionary (data_final) of dataframes (health, education, economy, ...). The dataframes contain data from one xlsx file. In one of the dataframes (economy), the column names have brackets and single quotes added to them.

data_final['economy'].columns =
Index([ ('Sr.No.',),
('DistrictName',),
('Agriculture',),
('Forestry& Logging',),
('Fishing',),
('Mining &Quarrying',),
('ManufacturingMFG.',),
('RegisteredMFG.',),
('Unregd. MFG.',),
('Electricity,Gas & W.supply',),
('Construction',),
('Trade,Hotels& Restaurants',),
('Railways',),
('Transportby other means',),
('Storage',),
('Communication',),
('Banking &Insurance',),
('Real, Ownership of Dwel. B.Ser.& Legal',),
('PublicAdministration',),
('OtherServices',),
('TotalDDP',),
('Population(In '00)',),
('Per CapitaIncome(Rs.)',)],
dtype='object')


I cannot reference any column. For example,

data_final['economy']['('Construction',)']


gives the error:

SyntaxError: invalid syntax


I tried to use replace to remove the brackets -

data_final['economy'].columns = pd.DataFrame(data_final['economy'].columns).replace("(","",regex=True)


But this does not fix the column names. How can I remove all these special characters from the column names?

Ed.
Answer

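It looks as though your column names are being imported or created as one-element tuples rather than strings, so the brackets and quotes are part of how each tuple is displayed, not characters inside the name. What happens if you reference them without the brackets but with a trailing comma, like so: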

data_final['economy']['Construction',]

or even with the brackets

data_final['economy'][('Construction',)]
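
If the names really are one-element tuples, one possible way to get plain string names is to unwrap each tuple. The following is only a minimal sketch under that assumption; `flatten_labels` is a hypothetical helper name, not a pandas function, and the sample labels are copied from the question's column listing.

```python
def flatten_labels(labels):
    """Return the labels with any one-element tuple, e.g. ('Construction',), unwrapped."""
    return [
        # Unwrap only tuples of length 1; leave strings and longer tuples untouched.
        label[0] if isinstance(label, tuple) and len(label) == 1 else label
        for label in labels
    ]


# A few labels in the same shape as the question's column listing.
tuple_labels = [("Sr.No.",), ("DistrictName",), ("Construction",)]
clean_labels = flatten_labels(tuple_labels)
print(clean_labels)
```

Assigning the result back, for example `data_final['economy'].columns = flatten_labels(data_final['economy'].columns)`, should then make `data_final['economy']['Construction']` work, assuming every column label is either a string or a one-element tuple.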