ZMacarozzi - 6 months ago

R Question

I have a data set with the structure shown below.

```
# example data set
a <- "a"
b <- "b"
d <- "d"
id1 <- c(a,a,a,a,b,b,d,d,a,a,d)
id2 <- c(b,d,d,d,a,a,a,a,b,b,d)
id3 <- c(b,d,d,a,a,a,a,d,b,d,d)
dat <- rbind(id1,id2,id3)
dat <- data.frame(dat)
```

I need to find, for each row, the element that immediately follows the first sequence of "a" values.

```
# desired results
dat$s3 <- c("b","b","d")
dat
```

I was able to break the problem into three steps and solve the first one, but as my programming skills are quite limited, I would appreciate any advice on how to approach steps 2 and 3. An idea that solves the problem in another way would be extremely helpful as well.

Here is what I have so far:

```
# Step 1: find the first occurrence of "a" in the first sequence
dat$s1 <- apply(dat, 1, function(x) match(a,x))
# Step 2: find the last occurrence in the first sequence
# Step 3: find the element following the last occurrence in the first sequence
```

Thanks in advance!

Answer

I'd use `filter` (from the stats package):

```
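fun <- function(x) {
  x <- as.character(x)
  isa <- (x == "a")  # TRUE where the element is "a"
  # Rolling sum over the current and the two preceding positions: a sum of 2
  # with the current value not "a" marks an element right after two "a"s,
  # so this only detects sequences of at least two "a" values
  ids <- stats::filter(isa, c(1, 1, 1), sides = 1) == 2L & !isa
  # First such element; the NAs come from the filter's undefined first two positions
  na.omit(x[ids])[1]
}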
apply(dat, 1, fun)
#id1 id2 id3
#"b" "b" "d"
```
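
For comparison, here is a minimal sketch of an alternative that follows the three steps from the question directly. It is not part of the answer above: `next_after_first_run` and its `target` argument are hypothetical names chosen for illustration. It assumes the rows contain no missing values, and unlike the `filter` approach it also treats a single "a" as a sequence.

```r
# Rebuild the example data from the question
id1 <- c("a","a","a","a","b","b","d","d","a","a","d")
id2 <- c("b","d","d","d","a","a","a","a","b","b","d")
id3 <- c("b","d","d","a","a","a","a","d","b","d","d")
dat <- data.frame(rbind(id1, id2, id3), stringsAsFactors = FALSE)

# Hypothetical helper: element following the first run of `target` in x
next_after_first_run <- function(x, target = "a") {
  x <- as.character(x)
  # Step 1: position of the first occurrence of the target
  first <- match(target, x)
  if (is.na(first)) return(NA_character_)   # target never occurs
  # Steps 2 and 3: the first non-target element from `first` onwards
  # is the one right after the last target of the first run
  tail_x <- x[first:length(x)]
  offset <- match(TRUE, tail_x != target)
  if (is.na(offset)) return(NA_character_)  # run extends to the end of the row
  x[first + offset - 1L]
}

res <- apply(dat, 1, next_after_first_run)
res
#id1 id2 id3
#"b" "b" "d"
```

If the first sequence runs to the end of a row, or a row contains no "a" at all, this sketch returns `NA` for that row.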