Jack Jack - 1 year ago 84
R Question

Replace single values according to lagged value in column in R

I have a time series dataset with many columns. For each column, I want to replace every value that equals its lagged value with the same value plus a very small term (any kind of noise), say a fraction of the standard deviation.
I wrote a function and applied it to every column with apply.

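library(dplyr)  # assumption: lag() is dplyr::lag; stats::lag() does not shift a plain vector
a <- c(1,2,2,3,4,5,6)
b <- c(4,5,6,7,8,8,9)
data <- data.frame(a, b)
repetitions <- function(x) {
  # the right-hand side is the whole vector, not just the matched positions
  x[x == lag(x) & !is.na(x) & !is.na(lag(x))] <- x+0.000001
  x
}
datanew <- data.frame(apply(data, 2, repetitions))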

If I assign a single number (e.g. 1000) it works, but if I use x+0.000001 it returns wrong numbers.
I know the solution is probably not very difficult, but I've only found NA-related issues, and I'm pretty stuck at this point of the program.

Thank you very much for your help.

EDIT: I hope the MWE is correct; I'm new to this.

Answer Source


The problem, as @cerpintax said, is a length mismatch: the replacement on the right-hand side has to be subset with the same condition as the left-hand side, so that both sides have the same length.
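
To see the mismatch concretely, here is a minimal sketch in base R. The vector, the mask `dup`, and the increment 0.5 are made up for illustration and are not taken from the original data.

```r
# Illustrative vector and logical mask (not from the original dataset)
x <- c(1, 2, 2, 3, 4, 5, 6)
dup <- c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)

# Mismatched: one slot is selected but the right-hand side has length 7,
# so R fills the slot with the FIRST element of x + 0.5 (normally with a warning)
wrong <- x
suppressWarnings(wrong[dup] <- x + 0.5)

# Matched: subset the right-hand side with the same mask
right <- x
right[dup] <- x[dup] + 0.5
```

In the mismatched version the third element is built from `x[1]` rather than `x[3]`, which would explain the "wrong numbers" in the question.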

Thank you very much @jason: your solution worked, but I found a bug: when I used your code on the larger dataset, I got some NAs instead of the replacement (I don't know why).

Here's the working code, very simple! I just hate myself for spending so much time on this tiny bit.

repetitions <- function(x) {
  dup <- !is.na(x) & !is.na(lag(x)) & x == lag(x)  # lag() assumed to be dplyr::lag
  x[dup] <- x[dup] + 0.0001 * sd(x, na.rm = TRUE)  # same mask on both sides
  x
}
ITA_HD6 <- data.frame(apply(ITA_HD5, 2, repetitions))
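
For anyone who wants to try this without extra packages, the following is a rough base R sketch of the same idea, run on the toy data from the question. The helper name `jitter_repeats` and its `frac` parameter are hypothetical, and the shift is built by hand with `c(NA, head(x, -1))` instead of a `lag()` function.

```r
# Hypothetical helper: nudge every value that equals its predecessor
jitter_repeats <- function(x, frac = 1e-4) {
  prev <- c(NA, head(x, -1))                    # x shifted down by one position
  dup <- !is.na(x) & !is.na(prev) & x == prev   # TRUE where a value repeats
  x[dup] <- x[dup] + frac * sd(x, na.rm = TRUE) # same mask on both sides
  x
}

# Toy data from the question
data <- data.frame(a = c(1, 2, 2, 3, 4, 5, 6),
                   b = c(4, 5, 6, 7, 8, 8, 9))
datanew <- as.data.frame(lapply(data, jitter_repeats))
```

Note that in a run of three or more equal values, every repeated element gets the same increment, so the repeats would still equal each other afterwards. If that matters, a random noise term may be a better fit.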