I am using a GradientBoosting classifier to predict the gender of users. The data has many predictors, and one of them is the country. Each country has its own binary column, and in every row exactly one of the country columns is set to 1. This representation is very slow computationally. Is there a correct way to represent the country with a single column?
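You can replace each binary indicator with the actual country name and collapse all of these columns into a single column. Then use LabelEncoder on that column to create a proper integer variable and you should be all set. Keep in mind that the integer codes are arbitrary labels with no meaningful order; a tree-based model can still split on them.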
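Here is a minimal sketch of that approach, assuming the data lives in a pandas DataFrame and the country columns share a common prefix. The column names (`country_US`, `age`, and so on) and the toy values are made up for illustration, so adapt them to your own data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data: one binary column per country, exactly one 1 per row.
df = pd.DataFrame({
    "age": [23, 35, 41, 29],
    "country_US": [1, 0, 0, 1],
    "country_DE": [0, 1, 0, 0],
    "country_FR": [0, 0, 1, 0],
})

# Assumption: every country column starts with the prefix "country_".
country_cols = [c for c in df.columns if c.startswith("country_")]

# idxmax(axis=1) gives, for each row, the name of the column that holds the 1.
country_name = df[country_cols].idxmax(axis=1).str.replace("country_", "", regex=False)

# LabelEncoder maps each distinct country name to an integer code.
encoder = LabelEncoder()
df["country"] = encoder.fit_transform(country_name)

# The original binary columns are no longer needed.
df = df.drop(columns=country_cols)
```

After this, `df` has a single integer `country` column, and `encoder.classes_` lets you map the codes back to country names.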