I have a time-series dataset with many columns. In each column, I want to replace every value that equals its lag with the same value plus a very small term (any kind of noise), say a fraction of the column's standard deviation.
I wrote the function below and applied it to every column with apply.
a <- c(1,2,2,3,4,5,6)
b <- c(4,5,6,7,8,8,9)
data <- data.frame(cbind(a,b))
repetitions <- function(x) {
x[x == lag(x) & !is.na(x) & !is.na(lag(x))] <- x+0.000001
x
}
datanew <- data.frame(apply(data, 2, repetitions ))
SOLVED
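The problem, as @cerpintax said, is a length mismatch: the left-hand side of the assignment selects only the positions that match the condition, while the right-hand side (x+0.000001) has the full length of x. It is enough to apply the same condition to the replacement values.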
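To make the length mismatch concrete, here is a minimal sketch using the `a` column from the question. It assumes `lag()` is meant as a one-step shift (as `dplyr::lag` does); `shift1` is a hypothetical base-R stand-in for it and is not part of the original code.

```r
# Hypothetical helper: one-step lag in base R
# (assumed to behave like dplyr::lag(x, 1))
shift1 <- function(x) c(NA, head(x, -1))

a <- c(1, 2, 2, 3, 4, 5, 6)

# TRUE only where the value and its lag are both present and equal
idx <- !is.na(a) & !is.na(shift1(a)) & a == shift1(a)

sum(idx)          # how many positions the left-hand side selects
length(a + 1e-6)  # how many values the original right-hand side supplies

# Indexing the right-hand side with the same condition
# makes both sides the same length
fixed <- a
fixed[idx] <- a[idx] + 1e-6
fixed
```

Comparing `sum(idx)` with `length(a + 1e-6)` shows why the original assignment cannot line up the replacement values with the selected positions.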
Thank you very much, @jason: your solution worked, but I found a bug. When I ran your code on the larger dataset, I got some NA values instead of the replacement (I don't know why).
Here's the working code, very simple! I just hate myself for spending so much time on this tiny bit.
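repetitions <- function(x) {
  # lag() is assumed to shift values by one step, as dplyr::lag() does
  idx <- x == lag(x) & !is.na(x) & !is.na(lag(x))
  # index both sides with the same condition so their lengths match
  x[idx] <- x[idx] + 0.0001 * sd(x, na.rm = TRUE)
  x
}
ITA_HD6 <- data.frame(apply(ITA_HD5, 2, repetitions))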
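For a version that runs without the `ITA_HD5` data, the sketch below applies the same idea to the toy data frame from the question. It rests on assumptions that are not in the original: `lag()` is taken to be a one-step shift, written here as a hypothetical base-R helper `shift1`; the `frac` argument is an illustrative parameter; and `lapply` is swapped in for `apply` so the data frame is not coerced to a matrix first.

```r
# Hypothetical helper: one-step lag in base R
# (assumed to behave like dplyr::lag(x, 1))
shift1 <- function(x) c(NA, head(x, -1))

# frac is an illustrative parameter: the fraction of the column's
# standard deviation added to each repeated value
repetitions <- function(x, frac = 0.0001) {
  prev <- shift1(x)
  # TRUE only where the value and its lag are both present and equal
  idx <- !is.na(x) & !is.na(prev) & x == prev
  x[idx] <- x[idx] + frac * sd(x, na.rm = TRUE)
  x
}

data <- data.frame(a = c(1, 2, 2, 3, 4, 5, 6),
                   b = c(4, 5, 6, 7, 8, 8, 9))

datanew <- data
datanew[] <- lapply(data, repetitions)  # keeps the data.frame structure
datanew
```

Assigning into `datanew[]` keeps the column names and types of the input, which matters if some columns are not numeric; in that case the function would need to skip them.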