Ben Carlson - 3 years ago 96
R Question

# Expand data.frame rows by criteria

I'd like to know if it's possible to use dplyr to expand the rows of a data.frame based on criteria in each row. If it's not possible in dplyr I'd be happy for any solution!

Here is a sample of my data:

``````
df <- data.frame(plot = rep(c(6, 7), each = 4),
                 trans = rep(c("0,0", "0,100", "100,100", "100,0"), 2),
                 length_m = c(350, 200, 200, 50, 45, 200, 125, 75))

# plot   trans length_m
#    6     0,0      350
#    6   0,100      200
#    6 100,100      200
#    6   100,0       50
#    7     0,0       45
#    7   0,100      200
#    7 100,100      125
#    7   100,0       75
``````

The data above represent two plots. In general, each of my plots has 1 to 4 transects, identified by 0,0; 0,100; 100,100; or 100,0 (the plots above both have all four possible transects). Each transect has a length given by length_m. What I'd like to do is divide each transect into sections of length L and make one row for each section. If the final section is shorter than L, its length should be added to the previous section.

So, if L = 100, the above dataset would look like this:

``````
plot        trans length_m
   6        0,0_0      100
   6      0,0_100      100
   6      0,0_200      150
   6      0,100_0      100
   6    0,100_100      100
   6    100,100_0      100
   6  100,100_100      100
   6      100,0_0       50
   7        0,0_0       45
   7      0,100_0      100
   7    0,100_100      100
   7    100,100_0      125
   7      100,0_0       75
``````

Note that plot 6, transect 0,0, which was 350 meters long, is split into sections 0, 100 and 200 with lengths 100, 100 and 150, while plot 6, transect 100,0, which was 50 meters long, stays a single section 0 that is still 50 meters long.
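To make the rule concrete, the arithmetic for a single transect can be sketched in base R. The function name `split_length` is made up for illustration, and this is only one reading of the rule: it assumes a transect shorter than L is kept whole and that any remainder is folded into the last full section.

```r
# Hypothetical helper: split one transect length x into sections of length L.
# Assumption: a remainder shorter than L is added to the last full section,
# and a transect shorter than L stays a single section.
split_length <- function(x, L) {
  n <- max(1, x %/% L)                 # number of sections (at least one)
  c(rep(L, n - 1), x - L * (n - 1))    # n - 1 full sections, then the rest
}

split_length(350, 100)   # 100 100 150
split_length(50, 100)    # 50
```

The section labels (0, 100, 200, ...) would then be `L * (seq_along(sections) - 1)`.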

I've tried a couple of different ways to make this work but nothing that is worth posting, so any help would be much appreciated!

Here's a `data.table` solution, assuming your original data is in a data frame `df`.

``````
df$trans <- as.character(df$trans)   # trans must be character, not factor
library(data.table)
dt <- data.table(df)
L <- 100
f <- function(x) {                   # implements the partitioning
  if (x < L) return(x)               # shorter than L: keep as a single section
  y <- rep(L, as.integer(x / L))     # as many full sections of length L as fit
  y[length(y)] <- y[length(y)] + x - sum(y)   # add the remainder to the last section
  return(y)
}
result <- dt[, list(length_m = f(length_m)), by = list(plot, trans)]
result[, trans := paste(trans, L * (0:(.N - 1)), sep = "_"), by = list(plot, trans)]
result
#     plot       trans length_m
#  1:    6       0,0_0      100
#  2:    6     0,0_100      100
#  3:    6     0,0_200      150
#  4:    6     0,100_0      100
#  5:    6   0,100_100      100
#  6:    6   100,100_0      100
#  7:    6 100,100_100      100
#  8:    6     100,0_0       50
#  9:    7       0,0_0       45
# 10:    7     0,100_0      100
# 11:    7   0,100_100      100
# 12:    7   100,100_0      125
# 13:    7     100,0_0       75
``````
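
Since the question asks about dplyr, the same expansion might also be written with dplyr and tidyr. This is a sketch rather than a drop-in replacement: it assumes both packages are installed and that the installed tidyr provides `uncount()` with its `.id` argument. The helper columns `n` and `piece` are names invented here.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(plot = rep(c(6, 7), each = 4),
                 trans = rep(c("0,0", "0,100", "100,100", "100,0"), 2),
                 length_m = c(350, 200, 200, 50, 45, 200, 125, 75),
                 stringsAsFactors = FALSE)
L <- 100

result <- df %>%
  mutate(trans = as.character(trans),
         n = pmax(1, length_m %/% L)) %>%            # sections per transect (assumed column name)
  uncount(n, .remove = FALSE, .id = "piece") %>%     # repeat each row n times; piece = 1..n
  mutate(length_m = ifelse(piece < n, L,             # full sections have length L
                           length_m - L * (n - 1)),  # last section takes the remainder
         trans = paste(trans, L * (piece - 1), sep = "_")) %>%
  select(plot, trans, length_m)

result
```

This version needs no grouping, because `uncount()` keeps the copies of each original row together and `piece` already numbers them within the transect.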