Ben Carlson - 1 month ago
R Question

Expand data.frame rows by criteria

I'd like to know if it's possible to use dplyr to expand the rows of a data.frame based on criteria in each row. If it's not possible in dplyr I'd be happy for any solution!

Here is a sample of my data

df <- data.frame(plot=rep(c(6,7),each=4),
                 trans=rep(c("0,0","0,100","100,100","100,0"),2),
                 length_m=c(350,200,200,50,45,200,125,75) )

plot trans length_m
6 0,0 350
6 0,100 200
6 100,100 200
6 100,0 50
7 0,0 45
7 0,100 200
7 100,100 125
7 100,0 75


The data above represent two plots. In general, each of my plots has 1 to 4 transects, identified by 0,0; 0,100; 100,100; or 100,0 (the plots above both have all four possible transects). Each transect has a length given by length_m. What I'd like to do is divide each transect into sections of length L and make one row for each section. If the final section is shorter than L, its length should be added to the previous section.

So, if L = 100, the above dataset would look like this

plot trans length_m
6 0,0_0 100
6 0,0_100 100
6 0,0_200 150
6 0,100_0 100
6 0,100_100 100
6 100,100_0 100
6 100,100_100 100
6 100,0_0 50
7 0,0_0 45
7 0,100_0 100
7 0,100_100 100
7 100,100_0 125
7 100,0_0 75


Note that plot 6, transect 0,0, which was 350 meters long, is split into sections 0, 100 & 200, with lengths 100, 100 & 150, while plot 6, transect 100,0, which was 50 meters long, is just a single section 0 and is still 50 meters long.

I've tried a couple of different ways to make this work but nothing that is worth posting, so any help would be much appreciated!

Answer

Here's a data.table solution, assuming your original data is in a data frame df.

df$trans <- as.character(df$trans)   # need trans to be char, not factor
library(data.table)
dt <- data.table(df)         
L <- 100
f <- function(x) {                   # implements the partitioning
  if (x<L) return(x)
  y <- rep(L,as.integer(x/L))
  y[length(y)] <- y[length(y)]+x-sum(y)
  return(y)
}
result <- dt[,list(length_m=f(length_m)),by=list(plot,trans)]
result[,trans:=paste(trans,L*(0:(.N-1)),sep="_"),by=list(plot,trans)]
result
#     plot       trans length_m
#  1:    6       0,0_0      100
#  2:    6     0,0_100      100
#  3:    6     0,0_200      150
#  4:    6     0,100_0      100
#  5:    6   0,100_100      100
#  6:    6   100,100_0      100
#  7:    6 100,100_100      100
#  8:    6     100,0_0       50
#  9:    7       0,0_0       45
# 10:    7     0,100_0      100
# 11:    7   0,100_100      100
# 12:    7   100,100_0      125
# 13:    7     100,0_0       75
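
Since the question asked about dplyr, here is a rough sketch of the same idea using dplyr together with tidyr. It is only one possible way to do it, not part of the data.table solution above. The helper name split_length and the temporary column start are made up for illustration, and the sketch assumes a tidyr version where unnest() accepts the list column by name.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (trans kept as character, not factor)
df <- data.frame(plot = rep(c(6, 7), each = 4),
                 trans = rep(c("0,0", "0,100", "100,100", "100,0"), 2),
                 length_m = c(350, 200, 200, 50, 45, 200, 125, 75),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split one length x into sections of size L,
# with the last section absorbing whatever is left over.
split_length <- function(x, L) {
  n <- max(1, x %/% L)        # number of sections, at least one
  y <- rep(L, n)
  y[n] <- x - L * (n - 1)     # last section = remaining length
  y
}

L <- 100
result <- df %>%
  mutate(length_m = lapply(length_m, split_length, L = L)) %>%  # list column
  unnest(length_m) %>%                                          # one row per section
  group_by(plot, trans) %>%
  mutate(start = L * (row_number() - 1)) %>%                    # section offset
  ungroup() %>%
  mutate(trans = paste(trans, start, sep = "_")) %>%
  select(plot, trans, length_m)

result
```

Passing L as an argument to the helper, rather than reading it from the global environment, makes it easier to try other section lengths without redefining anything.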