Mohit Vellanki - 2 months ago
Python Question

How do I convert data into the following format in Python?

I have some data in the following format in a CSV file.

Id Category
1 A
2 B
3 C
4 B
5 C
6 d


I'd like to convert it into the format below and save it to another CSV file.

Id A B C D E
1 1 0 0 0 0
2 0 1 0 0 0
3 0 0 1 0 0
4 0 1 0 0 0
5 0 0 1 0 0
6 0 0 0 1 0

Answer

Try pd.get_dummies():

>>> import pandas as pd
>>> df = pd.read_csv('<path_to_file>', sep=',', encoding='utf-8', header=0)

>>> df
   Id Category
0   1        A
1   2        B
2   3        C
3   4        B
4   5        C
5   6        d

>>> pd.get_dummies(df.Category)

This encodes Category and gives you one new column per distinct value:

A B C d

It will not 'fix' d -> D, and it will not create any columns that cannot be deduced from the values present in Category (such as E).

For that, I suggest you check the solution posted in the earlier comment.
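As a rough sketch of one way to cover both gaps (not necessarily the solution from the comment mentioned above): upper-case the values before encoding, then reindex the dummy columns against the full list of categories you expect. The list all_categories and the inlined sample data are assumptions made for illustration; in practice you would read your own file and supply your own category list.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch runs without a file.
raw = io.StringIO("Id,Category\n1,A\n2,B\n3,C\n4,B\n5,C\n6,d\n")
df = pd.read_csv(raw)

# Assumed full set of expected categories; E never occurs in the data.
all_categories = ["A", "B", "C", "D", "E"]

# Upper-case first so that 'd' is encoded under 'D'.
dummies = pd.get_dummies(df["Category"].str.upper(), dtype=int)

# Add any category that did not occur as an all-zero column,
# and put the columns in the expected order.
dummies = dummies.reindex(columns=all_categories, fill_value=0)

# Put the Id column back in front of the indicator columns.
result = pd.concat([df[["Id"]], dummies], axis=1)

# With no path, to_csv returns the CSV text; pass a file path
# (for example a hypothetical "converted.csv") to write a file instead.
csv_text = result.to_csv(index=False)
print(csv_text)
```

Reindexing against an explicit list is a deliberate choice here: get_dummies only knows about values it actually sees, so the expected categories have to come from somewhere else.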