hfisch - 1 year ago 56
R Question

# Assigning a value to each range of consecutive numbers with same sign in R

I'm trying to create a data frame with a column that holds the length of each run of positive or negative numbers, like so:

``````
Time  V  Length
0.5  -2  1.5
1.0  -1  1.5
1.5   0  0.0
2.0   2  1.0
2.5   0  0.0
3.0   1  1.75
3.5   2  1.75
4.0   1  1.75
4.5  -1  0.75
5.0  -3  0.75
``````

The `Length` column sums the length of time that the value has been positive or negative. Zeros are given a `0` since they are an inflection point. If there is no zero separating the sign change, the times on either side of the sign change are averaged to approximate where the crossing occurs.

I am trying to approximate the amount of time that these values are spending either positive or negative. I've tried this with a `for` loop with varying degrees of success, but I would like to avoid looping because I am working with extremely large data sets.

I've spent some time looking at `sign` and `diff` as they are used in this question about sign changes. I've also looked at this question that uses `transform` and `aggregate` to sum consecutive duplicate values. I feel like I could use this in combination with `sign` and/or `diff`, but I'm not sure how to retroactively assign these sums to the ranges that created them or how to deal with spots where I'm taking the average across the inflection.

Any suggestions would be appreciated. Here is the sample dataset:

``````
dat <- data.frame(Time = seq(0.5, 5, 0.5), V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))
``````

First, find the indices of "Time" that need to be interpolated, i.e. consecutive "V" values that change sign without a zero in between; for these, `abs(diff(sign(V)))` is larger than one.

``````
id <- which(abs(c(0, diff(sign(dat$V)))) > 1)
``````
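
To see why a threshold of one works, it can help to look at the intermediate values on the sample data. The following is a minimal sketch that redefines `dat` from the question so it runs on its own; the names `s` and `jumps` are only illustrative and are not part of the original answer.

```r
# Sample data from the question
dat <- data.frame(Time = seq(0.5, 5, 0.5),
                  V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))

s <- sign(dat$V)         # -1 for negative, 0 for zero, 1 for positive
jumps <- c(0, diff(s))   # pad with 0 so positions line up with rows of dat

# Passing through a zero changes the sign by 1 per step;
# flipping directly between positive and negative changes it by 2
id <- which(abs(jumps) > 1)

print(jumps)
print(id)                # row 9: V goes from 1 (row 8) straight to -1
```

Only row 9 is flagged, so a single interpolated zero is needed, halfway between the times of rows 8 and 9.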

To the original data, add rows with V = 0 at Time = 0 and at the last time step (following the assumptions mentioned by @Gregor), plus a row with V = 0 at the mean "Time" of each flagged index and the step before it. Then order by "Time".

``````
d2 <- rbind(dat,
            data.frame(Time = c(0, max(dat$Time)), V = c(0, 0)),
            data.frame(Time = (dat$Time[id] + dat$Time[id - 1]) / 2, V = 0))
d2 <- d2[order(d2$Time), ]
``````

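Calculate the time differences between consecutive rows where "V" is zero, and replicate each difference across the rows that follow its starting zero using "zero-group indices" (the cumulative count of zeros).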

``````
d2$Length <- diff(d2$Time[d2$V == 0])[cumsum(d2$V == 0)]
``````
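
The indexing trick in this step is compact, so here is a small sketch of it on a toy vector. The data and the names `time`, `v`, `is_zero`, `grp` and `gaps` are made up for illustration and do not come from the original answer.

```r
# Toy data: zeros mark the boundaries between runs
time <- c(0, 1, 2, 4, 5, 7)
v    <- c(0, 3, 1, 0, -2, 0)

is_zero <- v == 0
grp  <- cumsum(is_zero)       # group index: increases by one at every zero
gaps <- diff(time[is_zero])   # time between consecutive zeros

print(grp)                    # 1 1 1 2 2 3
print(gaps)                   # 4 3
print(gaps[grp])              # 4 4 4 3 3 NA
```

Each row picks up the gap that starts at the most recent zero. The final zero has no following gap, so it indexes past the end of `gaps` and gets `NA`; in the answer that row is the artificial zero added at the last time step, which is dropped again by the merge.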

Add values to original data:

``````
merge(dat, d2)

#    Time  V Length
# 1   0.5 -2   1.50
# 2   1.0 -1   1.50
# 3   1.5  0   1.00
# 4   2.0  2   1.00
# 5   2.5  0   1.75
# 6   3.0  1   1.75
# 7   3.5  2   1.75
# 8   4.0  1   1.75
# 9   4.5 -1   0.75
# 10  5.0 -3   0.75
``````

Set "Length" to `0` where `V == 0`.
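
The answer does not show code for this last step. One way to do it, sketched here with the whole pipeline repeated so the block runs on its own, is a logical-index assignment; the name `res` is an assumption introduced for illustration.

```r
# Sample data from the question
dat <- data.frame(Time = seq(0.5, 5, 0.5),
                  V = c(-2, -1, 0, 2, 0, 1, 2, 1, -1, -3))

# Steps from the answer above
id <- which(abs(c(0, diff(sign(dat$V)))) > 1)
d2 <- rbind(dat,
            data.frame(Time = c(0, max(dat$Time)), V = c(0, 0)),
            data.frame(Time = (dat$Time[id] + dat$Time[id - 1]) / 2, V = 0))
d2 <- d2[order(d2$Time), ]
d2$Length <- diff(d2$Time[d2$V == 0])[cumsum(d2$V == 0)]

# Keep only the original rows, then zero out the inflection points
res <- merge(dat, d2)
res$Length[res$V == 0] <- 0
res
```

On the sample data this gives a `Length` column of 1.5, 1.5, 0, 1, 0, 1.75, 1.75, 1.75, 0.75, 0.75, which matches the table in the question.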
