Brandon Bertelsen - 3 years ago
R Question

Recoding variables with R

Recoding variables in R seems to be my biggest headache. What functions, packages, and processes do you use to ensure the best result?

I've found very few useful examples on the Internet that give a one-size-fits-all solution to recoding and I'm interested to see what you guys and gals are using.

Note: This may be a community wiki topic.

Answer

Recoding can mean a lot of things, and is fundamentally complicated.

Changing the levels of a factor can be done using the levels function:

> #change the levels of a factor (veteran is a dataset in the survival package)
> library(survival)
> levels(veteran$celltype) <- c("s","sc","a","l")
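Because the positional form depends on the current order of the levels, it is worth checking levels() first. Here is a minimal base-R sketch of both the positional form and the named-list form, which maps old labels to new ones explicitly and can also merge levels. The two factors below are made-up illustration data, not part of the veteran dataset.

```r
# Hypothetical factor, used only for illustration
celltype <- factor(c("squamous", "smallcell", "adeno", "large", "adeno"),
                   levels = c("squamous", "smallcell", "adeno", "large"))

# Positional renaming: new labels must follow the order of levels(celltype)
levels(celltype) <- c("s", "sc", "a", "l")

# Named-list form: new label on the left, old label(s) on the right.
# Listing two old labels under one name merges them into a single level.
size <- factor(c("small", "medium", "large", "medium"))
levels(size) <- list(S = "small", ML = c("medium", "large"))
```

The named-list form is usually the safer choice when the level order is not obvious, since a wrong order in the positional form silently mislabels the data.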

Transforming a continuous variable simply involves the application of a vectorized function:

mtcars$mpg.log <- log(mtcars$mpg)
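The same pattern covers most numeric recodes, since arithmetic and functions like sqrt and ifelse work element by element. A small sketch on a copy of mtcars; the new column names and the threshold of 150 are arbitrary choices for illustration.

```r
# Work on a small copy so the built-in dataset is left untouched
df <- mtcars[, c("mpg", "hp")]

# Standardize mpg to mean 0 and standard deviation 1
df$mpg_z <- (df$mpg - mean(df$mpg)) / sd(df$mpg)

# Any vectorized function can be applied the same way
df$hp_sqrt <- sqrt(df$hp)

# ifelse() gives a simple two-way recode (150 is an arbitrary threshold)
df$high_hp <- ifelse(df$hp > 150, "yes", "no")
```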

For binning continuous data, look at cut and cut2 (in the Hmisc package). For example:

> library(Hmisc)
> #make 4 groups with equal sample sizes
> mtcars[['mpg.tr']] <- cut2(mtcars[['mpg']], g=4)
> #make 4 groups with equal bin width
> mtcars[['mpg.tr2']] <- cut(mtcars[['mpg']],4, include.lowest=TRUE)
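If you would rather not depend on Hmisc, roughly equal-sized groups can be approximated in base R by cutting at the quantiles. This is only a sketch of that idea, not a reimplementation of cut2: the bin edges and interval labels may differ from what cut2 produces, and ties at a cut point can make the group sizes slightly uneven.

```r
mpg <- mtcars$mpg  # mtcars ships with R in the datasets package

# Quartile cut points; unique() guards against duplicated quantiles,
# which would otherwise make cut() fail on repeated breaks
breaks <- unique(quantile(mpg, probs = seq(0, 1, by = 0.25)))

# include.lowest = TRUE keeps the minimum value inside the first bin
mpg_q <- cut(mpg, breaks = breaks, include.lowest = TRUE)

table(mpg_q)  # counts per bin
```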

For recoding continuous or factor variables into a categorical variable, there is recode in the car package and recode.variables in the Deducer package:

> mtcars[c("mpg.tr2")] <- recode.variables(mtcars[c("mpg")] , "Lo:14 -> 'low';14:24 -> 'mid';else -> 'high';")
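For comparison, here is a base-R sketch of the same kind of recode with no extra packages. It assumes the boundary values 14 and 24 are meant to fall into 'low' and 'mid' respectively, which is what cut does with its default right-closed intervals. The cylinder labels in the second half are invented for illustration.

```r
mpg <- mtcars$mpg

# Ranges to labels: (-Inf, 14] -> low, (14, 24] -> mid, (24, Inf) -> high
mpg_cat <- cut(mpg,
               breaks = c(-Inf, 14, 24, Inf),
               labels = c("low", "mid", "high"))

# Discrete values to labels with a named lookup vector
# (the labels here are hypothetical, chosen only for this example)
cyl_lookup <- c("4" = "small", "6" = "medium", "8" = "large")
cyl_cat <- factor(unname(cyl_lookup[as.character(mtcars$cyl)]),
                  levels = c("small", "medium", "large"))
```

One thing to watch with the lookup approach: any value missing from the lookup vector comes back as NA, so it is worth checking with anyNA() afterwards.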

If you are looking for a GUI, Deducer implements recoding with the Transform and Recode dialogs.
