imprela - 1 year ago 123
R Question

# Loop to change outliers of multiple variables to 95% in R

I have some outliers in my dataset. The variables of interest are named `j_q3_1, j_q3_2, ..., j_q3_14` and `j_q4_1, j_q4_2, ..., j_q4_14`. I want to replace entries greater than the 95th percentile with the 95th percentile. Can I write a loop that varies both the question number (q3 to q4) and the number after the underscore (1 to 14)? Any suggestions will be greatly appreciated.

Example data (items only up to `_2`; only the q3 and q4 columns need capping):

```
test <- data.frame(hhid = c(1:5), j_q3_1 = c(1000,1500,2000,5000,10000), j_q4_1 = c(500,100,200,10000,200), j_q5_1 = c(200,300,400,203,100), j_q3_2 = c(300,10000,200,300,200), j_q4_2 = c(100,200,320,120,302), j_q5_2 = c(10000,120,1222,300,2333))
```

This code works for me, but I have to repeat it for every variable:

```
quantiles <- quantile(test$j_q3_1, c(0.95))
test$j_q3_1[test$j_q3_1 > quantiles[1]] <- quantiles[1]

quantiles <- quantile(test$j_q4_1, c(0.95))
test$j_q4_1[test$j_q4_1 > quantiles[1]] <- quantiles[1]

quantiles <- quantile(test$j_q3_2, c(0.95))
test$j_q3_2[test$j_q3_2 > quantiles[1]] <- quantiles[1]

quantiles <- quantile(test$j_q4_2, c(0.95))
test$j_q4_2[test$j_q4_2 > quantiles[1]] <- quantiles[1]
```

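You can build each column name with `paste0()` while looping over the question number `i` and the item number `j`:

```
# Loop over question number (i) and item number (j); use 1:14 for the full data
for (i in 3:4) for (j in 1:2) {
  cname <- paste0("j_q", i, "_", j)
  quantiles <- quantile(test[, cname], c(0.95))
  test[test[, cname] > quantiles[1], cname] <- quantiles[1]
}
```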

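If the columns contain NA values, pass `na.rm = TRUE` to `quantile()` and make sure the NA comparisons are dropped from the row index:

```
for (i in 3:4) for (j in 1:2) {
  cname <- paste0("j_q", i, "_", j)
  quantiles <- quantile(test[, cname], c(0.95), na.rm = TRUE)
  # which() drops NA comparisons, so only real values above the cap are replaced
  test[which(test[, cname] > quantiles[1]), cname] <- quantiles[1]
}
```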
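
As an alternative to the explicit loop, the same capping can be sketched without building names by hand. This is only a sketch under a few assumptions: `cap_at_quantile` is a hypothetical helper (not a base R function), and the regular expression assumes the columns follow the `j_q3_*` / `j_q4_*` naming from the question.

```r
# Example data from the question
test <- data.frame(
  hhid   = 1:5,
  j_q3_1 = c(1000, 1500, 2000, 5000, 10000),
  j_q4_1 = c(500, 100, 200, 10000, 200),
  j_q5_1 = c(200, 300, 400, 203, 100),
  j_q3_2 = c(300, 10000, 200, 300, 200),
  j_q4_2 = c(100, 200, 320, 120, 302),
  j_q5_2 = c(10000, 120, 1222, 300, 2333)
)

# Hypothetical helper: cap a numeric vector at one of its own quantiles
cap_at_quantile <- function(x, prob = 0.95) {
  cap <- quantile(x, probs = prob, na.rm = TRUE, names = FALSE)
  pmin(x, cap)  # values above the cap become the cap; NA entries stay NA
}

# Select only the q3 and q4 columns by name (assumed naming pattern)
cols <- grep("^j_q[34]_[0-9]+$", names(test), value = TRUE)

# Apply the helper to each selected column and write the results back
test[cols] <- lapply(test[cols], cap_at_quantile)
```

Selecting columns by pattern means the code does not need to change when the items run up to `_14`, and columns such as `j_q5_1` are left untouched because they do not match.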