I ran into a strange problem in R while using data.table.
When I try to recode every Province with a count under 500 as "Other", the Provinces with the highest counts are replaced by index numbers instead of keeping their names:
df <- fact_data[,.N,Province][N >= 500]$Province
fact_data[,Province := ifelse(Province %in% df, fact_data$Province, "Other")]
This is probably a side effect of ifelse, which drops the attributes of its yes and no arguments: if Province is a factor, its values come back as the underlying integer codes instead of the level labels. Try this instead:
fact_data[ !( Province %in% df ), Province := "Other" ]
Generally, I would recommend working with character vectors as data.table columns instead of factors whenever possible.
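Here is a minimal sketch of both the problem and the fix. The province names and the min_count threshold are made up for illustration (min_count stands in for the 500 in the question), and I am assuming Province is stored as a factor, which the question does not actually confirm:

```r
library(data.table)

# Hypothetical toy data; Province is assumed to be a factor.
fact_data <- data.table(
  Province = factor(c(rep("Ontario", 3), rep("Quebec", 2), "Yukon"))
)
min_count <- 2  # illustrative stand-in for the 500 threshold

# ifelse() drops the factor attributes, so the labels of the kept
# provinces are replaced by their integer codes.
codes <- ifelse(fact_data$Province == "Yukon", "Other", fact_data$Province)

# Convert to character first, then overwrite only the rare rows by reference.
fact_data[, Province := as.character(Province)]
keep <- fact_data[, .N, by = Province][N >= min_count]$Province
fact_data[!(Province %in% keep), Province := "Other"]
```

Converting to character before the update sidesteps the question of whether "Other" is already one of the factor levels, and the subset assignment only touches the rows that need to change.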