Sol Sol - 11 months ago 41
R Question

Find the first and last index of a group/zone of identical character strings in a vector

b <- c("true", "true", "true", "true", "true", "false", "false", "true","true", "true", "false", "false", "false","true", "true", "false", "true", "false", "true", "false")

I'm trying to write a function that takes the above vector as input and finds the indices of the first and last occurrences of the desired string (e.g. "true") in every 'zone', where a zone is a subvector of two or more consecutive identical elements. The desired output for the above would be a data frame such as:

x | y
1 | 5
8 | 10
14 | 15

I have successfully written a function (below) that does this, but it takes far too long for my Shiny app. It would be great if there were a cleaner and faster way of doing this.

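zone_identifier <- function(dataframe, zone_source_col_index, match_string){
  zones_df <- data.frame()
  zone_source_vector <- dataframe[, zone_source_col_index]

  for(i in 1:(length(zone_source_vector)-1)){
    zone_component_recorder <- vector()
    for(j in 1:(length(zone_source_vector)-i)){
      if(zone_source_vector[i]==match_string && zone_source_vector[i+j]==match_string){
        # position i continues an earlier zone, so it is not a zone start
        if(i>1 && zone_source_vector[i-1]==match_string){break}
        # extend the zone that starts at i through position i+j
        zone_component_recorder <- c(i, i+j)
      }
      else if(zone_source_vector[i]==match_string && zone_source_vector[i+j]!=match_string){break}
    }
    # append the zone found for this i, if any
    if(length(zone_component_recorder)==2){
      zones_df <- rbind(zones_df, data.frame(x=zone_component_recorder[1], y=zone_component_recorder[2]))
    }
  }
  zones_df
}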

Answer Source

You can use rle() to find the solution:

#use rle to find runs of same value in b
#find starting position of each true and false
#same for end position

#filter on true values
#   x  y
#1  1  5
#2  8 10
#3 14 15
#4 17 17
#5 19 19
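
The code that goes with the comments above is not shown here. The following is only a minimal sketch of the steps they describe, assuming `b` is the vector from the question. The names `runs`, `start_pos`, `end_pos`, `zones` and `zones_min2` are made up for illustration and are not taken from the original answer.

```r
b <- c("true", "true", "true", "true", "true", "false", "false", "true", "true", "true",
       "false", "false", "false", "true", "true", "false", "true", "false", "true", "false")

# use rle to find runs of the same value in b
runs <- rle(b)

# the end position of each run is the cumulative sum of the run lengths
end_pos <- cumsum(runs$lengths)
# the start position is the end position minus the run length, plus one
start_pos <- end_pos - runs$lengths + 1L

# filter on "true" values (assumed match string; swap in any other value)
keep <- runs$values == "true"
zones <- data.frame(x = start_pos[keep], y = end_pos[keep])
zones

# optional extra step: keep only runs of two or more elements,
# which is how the question defines a zone
zones_min2 <- zones[zones$y > zones$x, ]
zones_min2
```

Here `zones` includes the single-element runs at positions 17 and 19, as in the output above, while `zones_min2` drops them and leaves the three rows the question asks for. Because `rle()` and `cumsum()` each make one pass over the vector, this should avoid the nested loops that slow down the original function.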