mkt mkt - 3 months ago 8
R Question

Jitter points by different amounts based on condition

I have a dataset with discrete X-axis values and a large number of Y-values. I also have a separate vector with measures of uncertainty in the X-axis values; this uncertainty varies across the X axis. I would like to jitter my X-axis values by an amount proportional to this uncertainty measure. It's easy but cumbersome to do this with a loop; I am looking for an efficient solution to this.

Reproducible example:

#Create data frame with discrete X-axis values (a)
dat <- data.frame(a = c(rep(5, 5), rep(15, 5), rep(25, 5)),
                  b = c(runif(5, 1, 2), runif(5, 2, 3), runif(5, 3, 4)))

#Plot raw, unjittered data
plot(dat$b ~ dat$a, data = dat, col = as.factor(dat$a), pch = 20, cex = 2)
[Figure: unjittered data]

#vector of uncertainty estimates
wid_vec <- c(1,10,3)

#Ugly manual jittering, not feasible for large datasets but
#produces the desired result
dat$a_jit <- c(jitter(rep(5, 5), amount = 1),
               jitter(rep(15, 5), amount = 10),
               jitter(rep(25, 5), amount = 3))

plot(dat$b ~ dat$a_jit, col = as.factor(dat$a), pch = 20, cex = 2)
[Figure: manually jittered data]

#Ugly loop solution, also works

newdat <- data.frame()
a_s <- unique(dat$a)

for (i in seq_along(a_s)) {
  subdat <- dat[dat$a == a_s[i], ]
  subdat$a_jit <- jitter(subdat$a, amount = wid_vec[i])
  newdat <- rbind(newdat, subdat)
}

plot(newdat$b ~ newdat$a_jit, col = as.factor(newdat$a), pch = 20, cex = 2)

#Trying to make a vectorized solution, but this of course does not work.

jitter_custom <- function(x, wid) {
  j <- x + runif(length(x), -wid, wid)
  j
}

#runif() does not work this way, this is shown to indicate the direction
#I've been attempting


Basically, I need to split up dat by condition, look up the relevant entry in the wid_vec vector, then create a new column by modifying the dat entries based on that wid_vec value. It sounds like there ought to be an elegant dplyr solution for this, but it eludes me right now.

Appreciate all suggestions!

Answer

As an alternative to

set.seed(1)
dat$a_jit <- c(jitter(rep(5, 5), amount = 1), 
                jitter(rep(15, 5), amount = 10), 
                jitter(rep(25, 5), amount = 3))

you could do

set.seed(1)
x <- with(dat, jitter(a, amount=setNames(c(1,10,3), unique(a))[as.character(a)]))

The result is the same:

identical(x, dat$a_jit)
# [1] TRUE
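
To see what the lookup inside that call does, here is a small sketch that builds the same named vector on its own. The names `lookup` and `amount` are only illustrative and are not part of the answer above; `a` stands in for dat$a from the question.

```r
# Same x values as dat$a in the question
a <- rep(c(5, 15, 25), each = 5)

# setNames() coerces unique(a) to character, so the names are "5", "15", "25"
lookup <- setNames(c(1, 10, 3), unique(a))

# Indexing by name expands the three widths to one width per row
amount <- lookup[as.character(a)]

unname(amount)
# [1]  1  1  1  1  1 10 10 10 10 10  3  3  3  3  3
```

This relies on the widths being listed in the same order as unique(a); if the two are ordered differently, the names end up attached to the wrong widths.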

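The vectorized call produces a warning because jitter() tests amount == 0 inside an if(), which expects a single value. (In R 4.2.0 and later, an if() condition of length greater than one is an error rather than a warning, so there the vectorized call fails outright.) If you want the warning to vanish, you could wrap suppressWarnings() around jitter(...), which only helps on older R versions, or use something like with(dat, mapply(jitter, x=a, amount=setNames(c(1,10,3), unique(a))[as.character(a)])), which calls jitter() once per element with a scalar amount.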
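
The jitter_custom() idea from the question can also be made to work without jitter() at all: runif() recycles vector min and max arguments, so the only missing piece is one width per row rather than one per group. The following is a minimal sketch under two assumptions: wid_vec is ordered to match unique(dat$a), and the b column is a deterministic stand-in for the question's random values. `row_wid` is a name introduced here for illustration.

```r
# Stand-in for the question's data (b is deterministic here, for illustration only)
dat <- data.frame(a = rep(c(5, 15, 25), each = 5),
                  b = seq(1, 4, length.out = 15))
wid_vec <- c(1, 10, 3)  # one width per unique value of a, in order of appearance

# Vectorized jitter: wid must hold one width per element of x
jitter_custom <- function(x, wid) {
  x + runif(length(x), -wid, wid)
}

# match() gives each row's position in unique(a); use it to index wid_vec
row_wid <- wid_vec[match(dat$a, unique(dat$a))]

set.seed(1)
dat$a_jit <- jitter_custom(dat$a, row_wid)

# Sanity check: every point moved by no more than its own group's width
stopifnot(all(abs(dat$a_jit - dat$a) <= row_wid))

plot(b ~ a_jit, data = dat, col = as.factor(dat$a), pch = 20, cex = 2)
```

Because each row carries its own width, this needs no loop and no per-group call, and it sidesteps the scalar check inside jitter().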
