ZMacarozzi - 22 days ago
R Question

Find an element following a sequence across rows in a data frame

I have a data set with the structure shown below.

# example data set

a <- "a"
b <- "b"
d <- "d"

id1 <- c(a,a,a,a,b,b,d,d,a,a,d)
id2 <- c(b,d,d,d,a,a,a,a,b,b,d)
id3 <- c(b,d,d,a,a,a,a,d,b,d,d)

dat <- rbind(id1,id2,id3)
dat <- data.frame(dat)


For each row, I need to find the first run of repeated "a" elements and identify the element that immediately follows that run.

# desired results

dat$s3 <- c("b","b","d")
dat


I was able to break the problem into three steps and solve the first one, but my programming skills are quite limited, so I would appreciate any advice on how to approach steps 2 and 3. An idea that solves the problem in a different way would be just as helpful.

Here is what I have so far:

# Step 1: find the first occurrence of "a" in the first sequence
dat$s1 <- apply(dat, 1, function(x) match(a,x))

# Step 2: find the last occurrence in the first sequence

# Step 3: find the element following the last occurrence in the first sequence


Thanks in advance!

Answer

I'd use filter:

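fun <- function(x) {
  x <- as.character(x)
  isa <- (x == "a") # TRUE where the element is "a"

  # Trailing sum over the current and the two previous positions (sides = 1).
  # A sum of 2 with the current value FALSE means the two preceding
  # elements were both "a" and the current element is the first one after them.
  ids <- stats::filter(isa, c(1,1,1), sides = 1) == 2L & !isa

  # The first two positions have no full window and can yield NA; drop those
  # and keep the first element that follows a run of "a".
  na.omit(x[ids])[1]
}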

apply(dat, 1, fun)
#id1 id2 id3 
#"b" "b" "d" 
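
As an alternative sketch (not part of the answer above), the three steps from the question can also be followed more literally with rle(), which encodes each row as runs of equal values. The helper name next_after_run and its arguments target and min_len are made up for illustration. min_len = 2 assumes that "repeated" means at least two consecutive "a" values, which is also what the filter approach requires.

```r
# Example data from the question, rebuilt so the snippet runs on its own
id1 <- c("a","a","a","a","b","b","d","d","a","a","d")
id2 <- c("b","d","d","d","a","a","a","a","b","b","d")
id3 <- c("b","d","d","a","a","a","a","d","b","d","d")
dat <- data.frame(rbind(id1, id2, id3), stringsAsFactors = FALSE)

# Hypothetical helper: element right after the first run of `target`
# that is at least `min_len` long; NA if there is no such run or the
# run reaches the end of the row.
next_after_run <- function(x, target = "a", min_len = 2L) {
  x <- as.character(x)
  r <- rle(x)                     # run values and run lengths
  ends <- cumsum(r$lengths)       # last position of each run
  hit <- which(r$values == target & r$lengths >= min_len)
  if (length(hit) == 0L) return(NA_character_)
  last <- ends[hit[1]]            # Step 2: last position of the first run
  if (last == length(x)) return(NA_character_)
  x[last + 1L]                    # Step 3: element following the run
}

res <- apply(dat, 1, next_after_run)
res
# id1 id2 id3 
# "b" "b" "d" 
```

Setting min_len = 1 would also treat a single "a" as a sequence, which may or may not be what is wanted here.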