Rilcon42 - 3 months ago
R Question

counting lengths between alternating columns

I am trying to figure out how to count the number of rows from when one column says TRUE to when the other column says TRUE. I attempted to use run-length encoding (rle) but couldn't figure out how to get the alternating values from each column.

set.seed(42)
s <- sample(c(0, 1, 2, 3), 500, replace = TRUE)
isOverbought <- s == 1
isOverSold <- s == 0
head(cbind(isOverbought, isOverSold), 20)
res <- rle(isOverSold)
tt <- res$lengths[res$values]  # run lengths where isOverSold is TRUE

> head(cbind(isOverbought,isOverSold),20)
      isOverbought isOverSold
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] TRUE FALSE <-starting condition is overbought
[4,] FALSE FALSE
[5,] FALSE FALSE
[6,] FALSE FALSE
[7,] FALSE FALSE
[8,] FALSE TRUE <-is oversold. length from overbought to oversold = 5
[9,] FALSE FALSE
[10,] FALSE FALSE
[11,] TRUE FALSE <- is overbought. length from oversold to overbought = 3
[12,] FALSE FALSE
[13,] FALSE FALSE
[14,] TRUE FALSE
[15,] TRUE FALSE
[16,] FALSE FALSE
[17,] FALSE FALSE
[18,] FALSE TRUE <-is oversold. length from overbought to oversold = 7
[19,] TRUE FALSE <- is overbought. length from oversold to overbought = 1
[20,] FALSE FALSE


GOAL

overboughtTOoversold oversoldTOoverbought
5 3
7 1

Answer

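The two functions below are sufficient to solve your problem: `a2b` measures the lengths from a TRUE in `a` to the next TRUE in `b`, and `axb` combines both directions.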

## lengths from `a` to `b`
a2b <- function (a, b) {
  y <- which(b)    ## positions of `TRUE` in `b`
  z <- which(a | b)   ## positions of all `TRUE` in either vector
  end <- match(y, z)    ## index in `z` of each `TRUE` of `b` (end positions)
  start <- c(1L, end[-length(end)] + 1L)    ## index in `z` of the first event after the previous `b`
  valid <- end > start  ## drop cases with `end = start` (no `a` since the previous `b`)
  z[end[valid]] - z[start[valid]]
  }

## cross `a` and `b`
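axb <- function (a, b) {
  if (any(a & b))
    stop ("Invalid input! `a` and `b` can't have TRUE at the same time!")
  x <- a2b(a, b); y <- a2b(b, a)
  ## the direction that cannot occur first gets a leading NA
  if (which(a)[1L] < which(b)[1L]) y <- c(NA_integer_, y)
  else x <- c(NA_integer_, x)
  n <- max(length(x), length(y))
  length(x) <- n; length(y) <- n  ## pad the shorter column with NA instead of recycling
  cbind(a2b = x, b2a = y)
  }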
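
To see how the index arithmetic in `a2b` works, here is a minimal trace on the first 20 rows printed in the question. The vectors `a` and `b` are typed in by hand from that printout (an assumption made only for illustration), and the steps are inlined rather than calling the function.

```r
## first 20 rows from the question, entered by hand (illustrative input)
a <- rep(FALSE, 20); a[c(3, 11, 14, 15, 19)] <- TRUE   # isOverbought
b <- rep(FALSE, 20); b[c(8, 18)] <- TRUE               # isOverSold

y <- which(b)          # 8 18
z <- which(a | b)      # 3 8 11 14 15 18 19
end <- match(y, z)     # 2 6: where each TRUE of `b` sits inside `z`
start <- c(1L, end[-length(end)] + 1L)   # 1 3: first event after the previous `b`
valid <- end > start   # TRUE TRUE
a2b_len <- z[end[valid]] - z[start[valid]]
a2b_len                # 5 7
```

Swapping the roles of `a` and `b` in the same steps gives the lengths in the other direction (3 and 1 for these 20 rows).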

For your isOverbought and isOverSold, we obtain:

result <- axb(isOverbought, isOverSold)

head(result)
#     a2b b2a
#[1,]   5  NA
#[2,]   7   3
#[3,]   3   1
#[4,]   8   5
#[5,]   2   6
#[6,]  10   2

Since the first TRUE in isOverbought comes before the first TRUE in isOverSold, no oversold-to-overbought run precedes it, so the first element of the 2nd column (b2a) is NA.
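
If you prefer the layout of the GOAL table in the question, where each overbought-to-oversold length sits next to the oversold-to-overbought length that follows it, one option is to shift the `b2a` column up by one row. This is only a sketch and not part of the functions above: it assumes the first column fires first (so `b2a` starts with the NA), and `res20` is a hypothetical hand-typed matrix holding the lengths annotated in the question's first 20 rows, in the same two-column layout as `result`.

```r
## hand-typed stand-in with the same layout as `result` (illustrative values
## taken from the annotations in the question's 20-row printout)
res20 <- cbind(a2b = c(5L, 7L, NA), b2a = c(NA, 3L, 1L))

## shift `b2a` up one row so each row reads: a -> b, then the following b -> a
goal <- cbind(overboughtTOoversold = res20[, "a2b"],
              oversoldTOoverbought = c(res20[-1L, "b2a"], NA))

## drop rows where both entries are NA
goal <- goal[rowSums(!is.na(goal)) > 0, , drop = FALSE]
goal
```

For these 20 rows, `goal` has two rows, (5, 3) and (7, 1), matching the GOAL table. To apply the same reshaping to the full data, substitute `result` for `res20`.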