blazej - 9 months ago 75

R Question

I'm looking for a way to add a column to my data table that consists of `residuals` from `lm(a~b)`, computed separately for each level of `c`.

I've been suggested to look into

`sort_by(c)`

`lm(a~b)`

My working example data looks like this:

Columns subject, trial and rt are within a

`data.frame`

`Zre_SPSS`

`R`

I've tried `data %<>% group_by(subject) %>% mutate(Zre = residuals(lm(log(rt) ~ trial)))`, but it doesn't work: Zre gets computed, but for the entire data frame rather than within each subject separately.

Could anyone please help me? I'm a complete R (and coding in general) newbie, so please forgive me if this question is stupid or a duplicate; chances are I didn't understand the other solutions, or they were not the solutions I was looking for. Best regards.

As per Ben Bolker's request, here is R code to generate the data from the Excel screenshot:

`#generate data`

subject<-c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3)

subject<-factor(subject)

trial<-c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6)

rt<-c(300,305,290,315,320,320,350,355,330,365,370,370,560,565,570,575,560,570)

#Following variable is what I would get after using SPSS code

ZreSPSS<-c(0.4207,0.44871,-1.7779,0.47787,0.47958,-0.04897,0.45954,0.45487,-1.7962,0.43034,0.41075,0.0407,-0.6037,0.0113,0.61928,1.22038,-1.32533,0.07806)

#make data frame

sym<-data.frame(subject, trial, rt, ZreSPSS)

Answer

It looks like a bug in dplyr 0.5's `mutate`, where `lm` within a group will still try to use the full dataset. You can use `do` instead:

```
library(dplyr)

# Fit the model separately for each subject and append its residuals as column r
sym %>%
  group_by(subject) %>%
  do({
    r <- resid(lm(log(rt) ~ trial, data = .))
    data.frame(., r)
  })
```

This still doesn't match your SPSS column, but it's the correct result for the data you've given. You can verify this by fitting the model manually for each subject and checking the residuals.
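The manual check mentioned above could look roughly like the following base R sketch. It rebuilds the question's data without the SPSS column; the loop and the helper names `rows`, `fit` and `pooled` are illustrative assumptions, not part of the original code.

```r
# Rebuild the example data from the question (ZreSPSS column omitted)
sym <- data.frame(
  subject = factor(rep(1:3, each = 6)),
  trial   = rep(1:6, times = 3),
  rt      = c(300, 305, 290, 315, 320, 320,
              350, 355, 330, 365, 370, 370,
              560, 565, 570, 575, 560, 570)
)

# Fit log(rt) ~ trial separately for each subject and store the residuals
sym$Zre <- NA_real_
for (s in levels(sym$subject)) {
  rows <- sym$subject == s                          # rows of this subject
  fit  <- lm(log(rt) ~ trial, data = sym[rows, ])   # model on this subject only
  sym$Zre[rows] <- resid(fit)
}

# Residuals from a model with an intercept sum to zero within each fit,
# so these per-subject sums should be zero up to floating-point error
tapply(sym$Zre, sym$subject, sum)

# For contrast: a single model fitted to the whole data frame
pooled <- unname(resid(lm(log(rt) ~ trial, data = sym)))
isTRUE(all.equal(pooled, sym$Zre))  # FALSE: pooled and per-subject residuals differ
```

If the values in `Zre` agree with the `r` column produced by `do`, the grouped fit is doing what was intended.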

(Other flavours of residuals include `rstandard` for standardized and `rstudent` for studentized residuals. They still don't match your SPSS numbers, but might be what you're looking for.)
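As a rough sketch of how those other flavours could be computed per subject, the base R code below fits one model per subject and then fills two new columns. The column names `Zre_standardized` and `Zre_studentized` and the list `fits` are made up for illustration.

```r
# Rebuild the example data from the question (ZreSPSS column omitted)
sym <- data.frame(
  subject = factor(rep(1:3, each = 6)),
  trial   = rep(1:6, times = 3),
  rt      = c(300, 305, 290, 315, 320, 320,
              350, 355, 330, 365, 370, 370,
              560, 565, 570, 575, 560, 570)
)

# One lm fit per subject, kept in a list named by subject level
fits <- lapply(split(sym, sym$subject),
               function(d) lm(log(rt) ~ trial, data = d))

# Hypothetical column names for the two residual flavours
sym$Zre_standardized <- NA_real_
sym$Zre_studentized  <- NA_real_
for (s in names(fits)) {
  rows <- sym$subject == s
  sym$Zre_standardized[rows] <- rstandard(fits[[s]])  # internally studentized
  sym$Zre_studentized[rows]  <- rstudent(fits[[s]])   # externally studentized
}

sym
```

Both functions take the fitted `lm` object, so the per-subject fits only need to be computed once.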

Source (Stackoverflow)