hfisch - 7 months ago 25

R Question

I'm trying to create a data frame where a column exists that holds values representing the length of runs of positive and negative numbers, like so:

    Time   V  Length
    0.5   -2  1.5
    1.0   -1  1.5
    1.5    0  0.0
    2.0    2  1.0
    2.5    0  0.0
    3.0    1  1.75
    3.5    2  1.75
    4.0    1  1.75
    4.5   -1  0.75
    5.0   -3  0.75

The `Length` column is `0` wherever `V` is zero.

I am trying to approximate the amount of time that these values spend either positive or negative. I've tried this with a `for` loop.

I've spent some time looking at `sign`, `diff`, `transform`, and `aggregate`, in particular `sign` and `diff`.

Any suggestions would be appreciated. Here is the sample dataset:

`dat <- data.frame(Time = seq(0.5, 5, 0.5), V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))`

Answer

First find the indices of "Time" that need to be interpolated, i.e. consecutive "V" values that switch between positive and negative without a zero in between; they have an `abs(diff(sign(V)))` larger than one.

```
id <- which(abs(c(0, diff(sign(dat$V)))) > 1)
```
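To see why the threshold is one, it may help to look at the intermediate vectors for the sample data. This is only an illustrative sketch; the names `s` and `jumps` are introduced here and are not part of the original answer.

```r
# Sample data from the question
dat <- data.frame(Time = seq(0.5, 5, 0.5),
                  V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))

s <- sign(dat$V)        # -1, 0 or 1 for each value of V
jumps <- c(0, diff(s))  # change in sign relative to the previous row

# A change of +/-1 means the series stepped onto or off a recorded zero.
# A change of +/-2 means the sign flipped with no zero in between,
# so a zero crossing has to be interpolated before that row.
id <- which(abs(jumps) > 1)
id            # 9
dat$Time[id]  # 4.5
```

For this dataset the only such flip is between Time 4.0 and 4.5, so a single interpolated crossing is needed.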

To the original data, add rows with V = 0 at Time = 0 and at the last time step (following the assumptions mentioned by @Gregor), plus a row with V = 0 at the midpoint of the two "Time" values on either side of each index found above. Then order by "Time".

```
d2 <- rbind(dat,
            data.frame(Time = c(0, max(dat$Time)), V = c(0, 0)),
            data.frame(Time = (dat$Time[id] + dat$Time[id - 1]) / 2, V = 0))
d2 <- d2[order(d2$Time), ]
```

Calculate the time differences between consecutive rows where "V" is zero, and replicate them across each run using "zero-group indices" (the cumulative count of zeros seen so far).

```
d2$Length <- diff(d2$Time[d2$V == 0])[cumsum(d2$V == 0)]
```
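The one-liner above packs three steps together. As a sketch, it can be unpacked as follows; `zero_times`, `run_len` and `grp` are names made up here for illustration, and the block rebuilds `d2` so it can be run on its own.

```r
dat <- data.frame(Time = seq(0.5, 5, 0.5),
                  V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))
id <- which(abs(c(0, diff(sign(dat$V)))) > 1)
d2 <- rbind(dat,
            data.frame(Time = c(0, max(dat$Time)), V = c(0, 0)),
            data.frame(Time = (dat$Time[id] + dat$Time[id - 1]) / 2, V = 0))
d2 <- d2[order(d2$Time), ]

zero_times <- d2$Time[d2$V == 0]  # 0.00 1.50 2.50 4.25 5.00
run_len <- diff(zero_times)       # 1.50 1.00 1.75 0.75
grp <- cumsum(d2$V == 0)          # 1 1 1 2 2 3 3 3 3 4 4 4 5

# Indexing run_len by grp repeats each interval length over its run.
# The closing zero forms group 5, which has no following interval,
# so it indexes past the end of run_len and yields NA.
d2$Length <- run_len[grp]
```

Each zero row is counted as the start of the run that follows it, which is why the zero rows still carry a non-zero "Length" at this stage.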

Add values to original data:

```
merge(dat, d2)
#    Time  V Length
# 1   0.5 -2   1.50
# 2   1.0 -1   1.50
# 3   1.5  0   1.00
# 4   2.0  2   1.00
# 5   2.5  0   1.75
# 6   3.0  1   1.75
# 7   3.5  2   1.75
# 8   4.0  1   1.75
# 9   4.5 -1   0.75
# 10  5.0 -3   0.75
```

Set "Length" to `0` where `V == 0`.
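That last step is described but not shown. One possible way to do it, repeating the earlier steps so the block runs on its own, might look like this. The name `res` is an assumption made here, and the explicit reordering by "Time" is an extra precaution rather than part of the original answer.

```r
dat <- data.frame(Time = seq(0.5, 5, 0.5),
                  V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))

# Steps from the answer: find sign flips, add zero rows, compute run lengths
id <- which(abs(c(0, diff(sign(dat$V)))) > 1)
d2 <- rbind(dat,
            data.frame(Time = c(0, max(dat$Time)), V = c(0, 0)),
            data.frame(Time = (dat$Time[id] + dat$Time[id - 1]) / 2, V = 0))
d2 <- d2[order(d2$Time), ]
d2$Length <- diff(d2$Time[d2$V == 0])[cumsum(d2$V == 0)]

# Keep only the original rows, reorder by Time explicitly rather than
# relying on merge()'s own sorting, then zero out rows where V is 0
res <- merge(dat, d2)
res <- res[order(res$Time), ]
res$Length[res$V == 0] <- 0

res$Length
# 1.50 1.50 0.00 1.00 0.00 1.75 1.75 1.75 0.75 0.75
```

The resulting "Length" column matches the desired output given in the question.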

Source (Stackoverflow)