Jack - 1 year ago 58

R Question

I have a ts dataset with many columns. For each column, I want to substitute the values which equal their lag with the same value plus a very small term (any kind of noise), let's say a fraction of the standard deviation.

I wrote a function and applied it to every column with `apply`.

```
a <- c(1,2,2,3,4,5,6)
b <- c(4,5,6,7,8,8,9)
data <- data.frame(cbind(a,b))

repetitions <- function(x) {
x[x == lag(x) & !is.na(x) & !is.na(lag(x))] <- x+0.000001
x
}

datanew <- data.frame(apply(data, 2, repetitions ))
```

If I use a single number it works e.g. 1000, while if I put

I know that the solution is not very difficult, but I've only found NA issues, and I'm pretty stuck at this point of the program.

Thank you very much for your help.

EDIT: I hope the MWE is correct; I'm new to this.

Answer Source

SOLVED

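The problem, as @cerpintax was saying, is a matter of different lengths: the left-hand side of the assignment selects only the elements that equal their lag, while the right-hand side `x+0.000001` is the whole vector. It is sufficient to apply the same condition to the replacement values in order to get it right.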
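To make the length mismatch concrete, here is a minimal sketch on column `a` from the question. It assumes a lag that shifts values by one position; the base-R expression `c(NA, head(x, -1))` is used as a stand-in so the snippet has no package dependencies, and the names `prev`, `dup`, `wrong` and `right` are only illustrative.

```r
x <- c(1, 2, 2, 3, 4, 5, 6)

# Stand-in for a value-shifting lag (assumption: previous element, NA first)
prev <- c(NA, head(x, -1))

# TRUE only where a non-missing value equals the previous non-missing value
dup <- !is.na(x) & !is.na(prev) & x == prev

# Unconditioned right-hand side: 7 candidate values for 1 selected slot.
# R warns about the length mismatch and uses the FIRST value, x[1] + 1e-6.
wrong <- x
suppressWarnings(wrong[dup] <- x + 0.000001)

# Conditioned right-hand side: one replacement per selected slot.
right <- x
right[dup] <- x[dup] + 0.000001

wrong[dup]   # based on x[1], not on the repeated value
right[dup]   # the repeated value plus the small term
```

This would also explain why a single number such as 1000 works on the right-hand side: a length-one value is recycled without any mismatch.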

Thank you very much, @jason: your solution worked, but I found a bug: when I used your code on the larger dataset, I got some NAs instead of the replacement (I don't know why).

Here's the working code, very simple! I just hate myself for spending so much time on this tiny bit.

```
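library(dplyr)  # assumption: lag() is dplyr::lag, which shifts values; stats::lag does not

repetitions <- function(x) {
  # TRUE where a non-missing value equals the previous non-missing value
  dup <- !is.na(x) & !is.na(lag(x)) & x == lag(x)
  # Index both sides with the same mask so the lengths agree
  x[dup] <- x[dup] + 0.0001 * sd(x, na.rm = TRUE)
  x
}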
ITA_HD6 <- data.frame(apply(ITA_HD5, 2, repetitions))
```
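For reference, a self-contained sketch of the same idea on the toy data from the question. Two things here are assumptions rather than part of the original answer: the helper `shift1` (a hypothetical name) replaces the lag with plain base R, and `lapply` with `datanew[] <-` is used instead of `apply` so that the result stays a data frame without converting through a matrix.

```r
# Hypothetical base-R helper: previous value, NA for the first element
shift1 <- function(x) c(NA, head(x, -1))

repetitions <- function(x) {
  prev <- shift1(x)
  # TRUE where a non-missing value equals the previous non-missing value
  dup <- !is.na(x) & !is.na(prev) & x == prev
  # Add a small fraction of the column's standard deviation to repeats only
  x[dup] <- x[dup] + 0.0001 * sd(x, na.rm = TRUE)
  x
}

data <- data.frame(a = c(1, 2, 2, 3, 4, 5, 6),
                   b = c(4, 5, 6, 7, 8, 8, 9))

datanew <- data
datanew[] <- lapply(data, repetitions)  # column-wise, keeps the data.frame
datanew
```

Note that in a run of three or more equal values, every repeat receives the same offset, so the second and third values would still equal each other; adding random noise instead of a fixed term would be one way around that.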