Johannes Filter - 3 months ago
R Question

Clean Data With R: ifelse is changing value of data frame

I have a column age containing strings of the form "15 - 24", "25 - 34", etc. I want to recode some of my rows, and I use the following snippet.

d2$age <- ifelse(d2$age %in% c("35 - 44", "45 - 54", "55 - 64", "65 +"), "35 +", d2$age)


It works in the sense that it successfully substitutes the values in the rows matched by the condition. But it also changes other rows where the condition is false, so I think something is wrong with the else clause: "15 - 24" is changed to "2" and "25 - 34" is changed to "3". What did I do wrong?

Answer

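The likely reason is that the column is a factor. Within ifelse, the factor's attributes are dropped, so the values returned for the FALSE case are the underlying integer codes rather than the labels. One way to prevent this is to convert the column to character with as.character first.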

d2$age <- as.character(d2$age)
d2$age <- ifelse(d2$age %in% 
           c("35 - 44", "45 - 54", "55 - 64", "65 +"), "35 +", d2$age)
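To see the coercion in isolation, here is a minimal sketch on a small made-up factor (the vector `age` below is hypothetical toy data, not the asker's column). It compares ifelse applied directly to a factor with ifelse applied after as.character.

```r
# Hypothetical toy vector (an assumption for illustration only)
age <- factor(c("15 - 24", "25 - 34", "35 - 44"))

# On a factor, the "no" branch contributes the integer codes, not the labels
bad <- ifelse(age %in% "35 - 44", "35 +", age)
print(bad)   # "1" "2" "35 +"

# Converting to character first keeps the labels intact
good <- ifelse(age %in% "35 - 44", "35 +", as.character(age))
print(good)  # "15 - 24" "25 - 34" "35 +"
```

The digits in `bad` are the positions of the original labels in levels(age), which is why the values in the question turned into "2" and "3".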

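Or, instead of ifelse, we can use logical indexing. This relies on the as.character conversion above, because assigning a value that is not an existing level into a factor produces NA with a warning.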

i1 <- d2$age %in%  c("35 - 44", "45 - 54", "55 - 64", "65 +")
d2$age[i1] <- "35 +" 
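As a rough illustration of why the index assignment is best done on a character column, the sketch below runs the same assignment on a character vector and on a factor. The names `age_chr` and `age_fct` are hypothetical and exist only for this example.

```r
# Hypothetical toy vectors (assumptions for illustration only)
age_fct <- factor(c("15 - 24", "35 - 44", "45 - 54"))
age_chr <- as.character(age_fct)
i1 <- age_chr %in% c("35 - 44", "45 - 54")

# On a character vector the replacement simply works
age_chr[i1] <- "35 +"
print(age_chr)

# On a factor, "35 +" is not an existing level, so R warns and inserts NA
suppressWarnings(age_fct[i1] <- "35 +")
print(age_fct)
```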

Or, if we want to keep the column as a factor rather than converting it to character, we can modify the levels directly

i2 <- levels(d2$age) %in% c("35 - 44", "45 - 54", "55 - 64", "65 +")
levels(d2$age)[i2] <- "35 +"
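The effect of the levels approach can be checked on a small made-up factor (`age` below is hypothetical toy data). Assigning the same label to several levels merges them into one, and the vector stays a factor.

```r
# Hypothetical toy factor (an assumption for illustration only)
age <- factor(c("5 - 10", "35 - 44", "25 - 34", "45 - 54", "65 +"))

# Flag the levels (not the rows) that should be collapsed
i2 <- levels(age) %in% c("35 - 44", "45 - 54", "55 - 64", "65 +")

# Giving several levels the same label merges them into a single level
levels(age)[i2] <- "35 +"

print(levels(age))
print(age)
```

Because only the levels are touched, this does the same amount of work no matter how many rows the column has.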

data

set.seed(24)
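# stringsAsFactors = TRUE makes age a factor on R >= 4.0.0 too, where the default is FALSE
d2 <- data.frame(age = c("5 - 10", "35 - 44", "25 - 34", "45 - 54", "5 - 10", 
     "55 - 64", "25 - 34", "65 +"), val = rnorm(8), stringsAsFactors = TRUE)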