Brett Brett - 3 months ago 14
R Question

Using lapply to perform multiple operations on many elements of a list in R

I currently have a list of 150 elements, each a data frame with 5 columns and anywhere from 200 to 1000 rows. I want to perform a split on the data frame inside each element of my list. Essentially, I want to make a new list of the same length but with very different data frames inside it. I know what I want to do to each element, but I cannot find the correct way to apply it over the entire list. An example list is below:

>ex

$`66th & Center`

Bike CheckoutKioskName ReturnKioskName Checkout_date_time Return_date_time UserRole
24583 191 66th & Center 66th & Center 2013-02-28 15:08:58 2013-02-28 15:09:08 Maintenance
24584 191 66th & Center 66th & Center 2013-02-28 15:09:30 2013-02-28 15:09:54 Maintenance
24585 191 66th & Center 66th & Center 2013-02-28 15:09:51 2013-02-28 15:10:11 Maintenance
24586 191 66th & Center 66th & Center 2013-02-28 15:10:09 2013-02-28 15:10:25 Maintenance
24587 191 66th & Center 66th & Center 2013-02-28 15:10:24 2013-02-28 15:10:47 Maintenance
24588 191 66th & Center 66th & Center 2013-02-28 15:10:49 2013-02-28 15:11:16 Maintenance

$`67th & Frances`
Bike CheckoutKioskName ReturnKioskName Checkout_date_time Return_date_time UserRole
24598 173 67th & Frances 67th & Frances 2013-02-28 16:39:27 2013-02-28 16:39:27 Maintenance
24599 230 67th & Frances 67th & Frances 2013-02-28 16:39:43 2013-02-28 16:39:43 Maintenance
24600 279 67th & Frances 67th & Frances 2013-02-28 16:40:22 2013-02-28 16:40:22 Subscriber
24616 102 67th & Frances 67th & Frances 2013-03-09 13:38:20 2013-03-09 18:41:42 Subscriber
24617 59 67th & Frances 67th & Frances 2013-03-09 13:39:09 2013-03-09 18:41:41 Subscriber
24619 279 67th & Frances 67th & Frances 2013-03-12 15:03:56 2013-03-12 15:04:53 Member

$`67th & Pine`
Bike CheckoutKioskName ReturnKioskName Checkout_date_time Return_date_time UserRole
24601 258 67th & Pine 67th & Pine 2013-02-28 16:57:08 2013-02-28 21:40:22 Maintenance
24602 258 67th & Pine Aksarben Drive 2013-03-01 15:34:21 2013-03-01 20:36:37 Maintenance
24603 261 67th & Pine Aksarben Drive 2013-03-01 15:34:25 2013-03-01 20:36:50 Maintenance
24622 279 67th & Frances 67th & Pine 2013-03-12 17:23:16 2013-03-12 17:27:03 Subscriber
24623 59 67th & Frances 67th & Pine 2013-03-12 17:23:29 2013-03-12 18:53:52 Member
24624 116 Aksarben Drive 67th & Pine 2013-03-12 17:38:05 2013-03-12 18:51:46 Member


An example of what I want to do is below. I have just taken one element from the list to do my initial testing on:

tes <- ex$`66th & Center`

c.tes <- tes[tes$CheckoutKioskName == '66th & Center',c('CheckoutKioskName','Checkout_date_time')]
c.tes$event <- rep(-1,length(c.tes))
names(c.tes) <- c('Station','Time','Event')
r.tes <- tes[tes$ReturnKioskName == '66th & Center', c('ReturnKioskName','Return_date_time')]
r.tes$event <- rep(1,length(r.tes))
names(r.tes) <- c('Station','Time','Event')
c.r.tes <- rbind(c.tes,r.tes)
c.r.tes <- c.r.tes[with(c.r.tes,order(Time)),]
c.r.tes$Tlapsed <- c(NA,c.r.tes[2:nrow(c.r.tes),c('Time')] - c.r.tes[-nrow(c.r.tes),c('Time')])


Which returns:

c.r.tes
Station Time Event Tlapsed
24583 66th & Center 2013-02-28 15:08:58 -1 NA
245831 66th & Center 2013-02-28 15:09:08 1 10
24584 66th & Center 2013-02-28 15:09:30 -1 22
24585 66th & Center 2013-02-28 15:09:51 -1 21
245841 66th & Center 2013-02-28 15:09:54 1 3
24586 66th & Center 2013-02-28 15:10:09 -1 15
245851 66th & Center 2013-02-28 15:10:11 1 2
24587 66th & Center 2013-02-28 15:10:24 -1 13
245861 66th & Center 2013-02-28 15:10:25 1 1
245871 66th & Center 2013-02-28 15:10:47 1 22
24588 66th & Center 2013-02-28 15:10:49 -1 2
245881 66th & Center 2013-02-28 15:11:16 1 27


I want to do this exact same process but for every element of the list. I would like my final output to be something like ex.events, which would contain 150 elements, all of which would have a data.frame in the same format as my tes example.

I have attempted to do this myself using lapply, which I believe to be the most efficient way, but I cannot seem to get the errors to stop coming. Here is the syntax I have tried:

setNames(lapply(us, function(e){
c.e <- ex$e[ex$e$CheckoutKioskName == e ,c('CheckoutKioskName','Checkout_date_time')]
c.e$event <- rep(-1,length(c.e))
names(c.e) <- c('Station','Time','Event')
r.e <- ex$e[ex$e$ReturnKioskName == e , c('ReturnKioskName','Return_date_time')]
r.e$event <- rep(1,length(r.e))
names(r.e) <- c('Station','Time','Event')
c.r.e <- rbind(c.e,r.e)
c.r.e <- c.r.e[with(c.r.e,order(Time)),]
c.r.e$Tlapsed <- c(NA,c.r.e[2:nrow(c.r.e),c('Time')] - c.r.e[-nrow(c.r.e),c('Time')])
}),us)


Again, I just want the end result to be a list of the same length as the one I started with, but with the code above applied to each element.

I have really been struggling with this so I appreciate any help I can get.

Thank you in advance.

Answer

This is not a full answer, because a full answer would require the dput output of your input data and a description of what us is. However, it should give you some hints about your problem. Let's assume that your data is:

ex <- list(`66th & Center`=data.frame(CheckoutKioskName=c(1,2), ReturnKioskName=c(3,4)), `67th & Frances`=data.frame(CheckoutKioskName=c(5,6), ReturnKioskName=c(7,8)))

and us is (note that no backquotes are used):

us <- c("66th & Center","67th & Frances")

Then,

lapply(us, function(e) print(ex$e$CheckoutKioskName))
##NULL
##NULL
##[[1]]
##NULL
##
##[[2]]
##NULL

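results in NULLs, because $ does not evaluate its argument: ex$e looks for an element literally named e instead of using the string stored in e. However: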

lapply(us, function(e) print(ex[[e]]$CheckoutKioskName))
##[1] 1 2
##[1] 5 6
##[[1]]
##[1] 1 2
##
##[[2]]
##[1] 5 6

gives us what we want.
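Putting this together, here is a minimal sketch of how the whole per-station routine from the question might be applied over the list. It is not tested against the real data: the toy ex below is built from four rows copied from the question's printout, the helper names station_events and mk_time are hypothetical, and the time columns are assumed to be POSIXct. Two things differ from the attempt in the question besides the ex[[e]] lookup. First, the function ends by returning the data frame; a function whose last statement is an assignment returns only the assigned value, here the Tlapsed vector. Second, the event column is filled using nrow(), because length() of a data frame counts columns, not rows.

```r
# Toy stand-in for the real list, using four rows from the question's printout.
# mk_time and station_events are hypothetical helper names.
mk_time <- function(x) as.POSIXct(x, tz = "UTC")

ex <- list(
  `66th & Center` = data.frame(
    CheckoutKioskName  = c("66th & Center", "66th & Center"),
    ReturnKioskName    = c("66th & Center", "66th & Center"),
    Checkout_date_time = mk_time(c("2013-02-28 15:08:58", "2013-02-28 15:09:30")),
    Return_date_time   = mk_time(c("2013-02-28 15:09:08", "2013-02-28 15:09:54")),
    stringsAsFactors   = FALSE
  ),
  `67th & Pine` = data.frame(
    CheckoutKioskName  = c("67th & Pine", "67th & Frances"),
    ReturnKioskName    = c("Aksarben Drive", "67th & Pine"),
    Checkout_date_time = mk_time(c("2013-03-01 15:34:21", "2013-03-12 17:23:16")),
    Return_date_time   = mk_time(c("2013-03-01 20:36:37", "2013-03-12 17:27:03")),
    stringsAsFactors   = FALSE
  )
)

station_events <- function(df, station) {
  # Checkouts at this station are recorded as -1.
  out <- df[df$CheckoutKioskName == station,
            c("CheckoutKioskName", "Checkout_date_time")]
  names(out) <- c("Station", "Time")
  out$Event <- rep(-1, nrow(out))   # nrow(), not length()

  # Returns at this station are recorded as +1.
  ret <- df[df$ReturnKioskName == station,
            c("ReturnKioskName", "Return_date_time")]
  names(ret) <- c("Station", "Time")
  ret$Event <- rep(1, nrow(ret))

  ev <- rbind(out, ret)
  ev <- ev[order(ev$Time), ]

  # Seconds since the previous event; the first row has no predecessor.
  secs <- as.numeric(ev$Time)
  ev$Tlapsed <- c(NA, diff(secs))[seq_len(nrow(ev))]

  ev  # return the data frame explicitly
}

us <- names(ex)
ex.events <- setNames(lapply(us, function(e) station_events(ex[[e]], e)), us)

ex.events[["66th & Center"]]
```

If us is simply names(ex), Map(station_events, ex, names(ex)) should give the same named list without any lookup by name inside the function.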
