Mikael Rubin - 1 month ago
R Question

for loop with function that writes to 3 separate columns R or dplyr/reshape solution?

I'm a total beginner with for loops, so I apologize if there's already a clear answer to this question; I wasn't able to find anything that I understood how to apply here. I also tried a dplyr implementation (shown at the end) but couldn't figure that out either.

Here's my question: there's a function that derives 3 values from a vector. I'd like to write those 3 values to the same df as new columns. The function is timefit from the retimes library in R.
If I run it on the whole df:

a1 <- timefit(data$RT)
a1:
mu: 480.3346
sigma: 77.8531
tau: 376.7426


If I place the values into a df with df <- data.frame(a1@par):

a1.par
mu 480.33462
sigma 77.85305
tau 376.74257


I'd like to run it separately for each subID and each level of another variable, "location" (a factor with two levels), so that I end up with something like:

subID location mu sigma tau
1 0 500 50 400
1 0 500 50 400
1 1 376 50 410
1 1 376 50 410
2 0 400 60 400
2 0 400 60 400
2 1 410 60 410
2 1 410 60 410


I got started with

for (subID in data) {
  timefit(data$RT)
}


But I know that's not going to actually do what I need it to do. Values are extracted from the timefit model with @par into long format so I need to specify the function timefit to write to 3 separate column headers? Any suggestions?

Also, I thought about using ddply, but the last line below is tripping me up, because the output format is long and I need it to be wide. I've messed with reshape a bit, but I'm having trouble figuring it out:

data <- data %>%
  group_by(subID, location) %>%
  mutate(timefit_out = timefit(RT))


Thanks for your help!

Answer

You can use summarise instead of mutate here to generate a list-column holding one data.frame per (subID, location) group, built from that group's timefit result. Each of these data frames has mu, sigma, and tau as columns. Then use unnest to expand this list-column into the result you want.

library(retimes)
library(dplyr)
library(tidyr)
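result <- data %>%
  group_by(subID, location) %>%
  # one 1-row data.frame (mu, sigma, tau) per group, stored in a list-column
  summarise(timefit_out = list(data.frame(t(attr(timefit(RT), "par"))))) %>%
  # expand the list-column; recent tidyr versions expect the column to be named
  unnest(timefit_out)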

Note that we extract the "par" attribute (the same slot you accessed with @par) from the object returned by timefit and then transpose it with t, so that mu, sigma, and tau become columns rather than rows.
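To see what the transpose step does on its own, here is a small sketch that uses a hand-made named vector in place of a real fit. The name par_vec and its values (rounded from the numbers in your question) are just for illustration and do not come from calling timefit:

```r
# Stand-in for the named parameter vector stored in the fit's "par" slot
par_vec <- c(mu = 480.33, sigma = 77.85, tau = 376.74)

# Without t(): one column and three rows (the long layout from the question)
long <- data.frame(par_vec)

# With t(): a 1 x 3 matrix with column names mu, sigma, tau,
# which data.frame() turns into one row with three columns
wide <- data.frame(t(par_vec))

dim(long)
dim(wide)
names(wide)
```

The one-row data frame is what lets unnest produce one row per group with mu, sigma, and tau side by side.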

Here, we assume that your input data is a data frame with columns subID, location, and the numeric column of reaction times RT that is input to timefit. A simulated example of such a dataset is given by:

data <- structure(list(subID = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), 
location = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
RT = c(0.341764254728332, 0.775535081513226, 0.281827432336286, 
0.23970171622932, 0.00226009078323841, 0.385179498931393, 
0.645917195128277, 0.812101020244882, 0.183301427634433, 
0.981765420176089, 0.656369511503726, 0.824469136772677, 
0.923240559641272, 0.598261737963185, 0.309975759591907, 
0.778991278028116, 0.757012664806098, 0.869985132943839, 
0.439378245733678, 0.8420404586941, 0.643788777757436, 0.381316626211628, 
0.123881611274555, 0.540528740268201, 0.661961955949664, 
0.0592848095111549, 0.904047027230263, 0.190083365887403, 
0.963809312786907, 0.0925120878964663, 0.117538752267137, 
0.451085010776296, 0.703220259631053, 0.378451474476606, 
0.305718191433698, 0.70383172808215, 0.699415655340999, 0.740436099236831, 
0.429179352009669, 0.205358384409919)), .Names = c("subID", 
"location", "RT"), row.names = c(NA, 40L), class = "data.frame")
##   subID location          RT
##1      1        0 0.341764255
##2      1        0 0.775535082
##3      1        0 0.281827432
##4      1        0 0.239701716
##5      1        0 0.002260091
##6      1        0 0.385179499
##7      1        0 0.645917195
##8      1        0 0.812101020
##9      1        0 0.183301428
##10     1        0 0.981765420
##11     1        1 0.656369512
##12     1        1 0.824469137
##13     1        1 0.923240560
##14     1        1 0.598261738
##15     1        1 0.309975760
##16     1        1 0.778991278
##17     1        1 0.757012665
##18     1        1 0.869985133
##19     1        1 0.439378246
##20     1        1 0.842040459
##21     2        0 0.643788778
##22     2        0 0.381316626
##23     2        0 0.123881611
##24     2        0 0.540528740
##25     2        0 0.661961956
##26     2        0 0.059284810
##27     2        0 0.904047027
##28     2        0 0.190083366
##29     2        0 0.963809313
##30     2        0 0.092512088
##31     2        1 0.117538752
##32     2        1 0.451085011
##33     2        1 0.703220260
##34     2        1 0.378451474
##35     2        1 0.305718191
##36     2        1 0.703831728
##37     2        1 0.699415655
##38     2        1 0.740436099
##39     2        1 0.429179352
##40     2        1 0.205358384

The values for RT in this example are generated using runif, so they are between 0 and 1. Your values are on a very different scale, but that should not matter here.

Using this data, we get:

print(result)
##Source: local data frame [4 x 5]
##Groups: subID [2]
##
##  subID location        mu     sigma         tau
##  <int>    <int>     <dbl>     <dbl>       <dbl>
##1     1        0 0.5275058 0.2553621 0.007086207
##2     1        1 0.2609386 0.1583494 0.085449559
##3     2        0 0.5205647 0.1994942 0.027329115
##4     2        1 0.4632886 0.2881343 0.008026460
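Since you asked about a for loop: the same kind of table can also be built in base R by looping over the (subID, location) groups, and merge can then repeat the group values on every original row, which is the layout shown in your question. This is only a sketch under some assumptions. fit_three is a hypothetical stand-in I made up so the example runs without retimes (it returns the mean, sd, and median under the names mu, sigma, and tau, which is not an ex-Gaussian fit), and the toy data frame is invented. With retimes loaded you would return attr(timefit(rt), "par") from that function instead.

```r
# Hypothetical stand-in for timefit: returns a named vector of three values.
# NOT an ex-Gaussian fit; swap in attr(timefit(rt), "par") when retimes is available.
fit_three <- function(rt) {
  c(mu = mean(rt), sigma = sd(rt), tau = median(rt))
}

# Made-up toy data with the same columns as in the question
toy <- data.frame(
  subID    = rep(1:2, each = 4),
  location = rep(rep(0:1, each = 2), times = 2),
  RT       = c(500, 520, 380, 400, 410, 430, 450, 470)
)

# One entry per unique (subID, location) combination
groups <- unique(toy[, c("subID", "location")])
rows <- vector("list", nrow(groups))

for (i in seq_len(nrow(groups))) {
  # Reaction times belonging to the i-th group
  in_group <- toy$subID == groups$subID[i] & toy$location == groups$location[i]
  pars <- fit_three(toy$RT[in_group])
  # Group keys plus the three values as columns, in a single row
  rows[[i]] <- data.frame(groups[i, ], t(pars), row.names = NULL)
}

# Stack the per-group rows into one summary data frame (one row per group)
fits <- do.call(rbind, rows)

# Attach the group values to every original row
expanded <- merge(toy, fits, by = c("subID", "location"))
```

Here fits has one row per group, like result above, while expanded keeps every observation and repeats mu, sigma, and tau within each group.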