Suchit Kumbhare - 3 months ago
R Question

Extracting the slope for individual observation

I'm a newbie in R. I have a data set in which each observation has 3 lung function measurements and 3 corresponding dates, shown below. I would like to extract the slope for each observation (the decline in lung function) using R and insert it into a new column.

1. How should I approach the problem?

2. Is my data set arranged in the right format?

ID      FEV1_Date11  FEV1_Date12  FEV1_Date13  DATE11      DATE12      DATE13
18105   1.35         1.25         1.04         6/9/1990    8/16/1991   8/27/1993
18200   0.87         0.85                      9/12/1991   3/11/1993
18303   0.79                                   4/23/1992
24204   4.05         3.95         3.99         6/8/1992    3/22/1993   11/5/1994
28102   1.19         1.04         0.96         10/31/1990  7/24/1991   6/27/1992
34104   1.03         1.16         1.15         7/25/1992   12/8/1993   12/7/1994
43108   0.92         0.83         0.79         6/23/1993   1/12/1994   1/11/1995
103114  2.43         2.28         2.16         6/5/1994    6/21/1995   4/7/1996
114101  0.73         0.59         0.6          6/25/1989   8/5/1990    8/24/1991


Example: for the 1st observation, slope = 0.0003
Thanks..

Answer

If I understand the question correctly, you want the slope between each pair of consecutive visits:

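library(dplyr)

# df is the data frame from the question; dates are assumed to be m/d/Y text
group_by(df, ID) %>% 
  # parse the DATE* columns into Date objects
  mutate_at(vars(starts_with("DATE")), ~ as.Date(., "%m/%d/%Y")) %>% 
  # per ID: change in FEV1 (columns 2:4) over elapsed days (columns 5:7)
  do(tibble(slope=diff(unlist(.[,2:4]))/diff(unlist(.[,5:7])),
            after_visit=1+(1:length(slope))))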

## Source: local data frame [18 x 3]
## Groups: ID [9]
## 
##        ID         slope after_visit
##     <int>         <dbl>       <dbl>
## 1   18105 -2.309469e-04           2
## 2   18105 -2.830189e-04           3
## 3   18200 -3.663004e-05           2
## 4   18200            NA           3
## 5   18303            NA           2
## 6   18303            NA           3
## 7   24204 -3.484321e-04           2
## 8   24204  6.745363e-05           3
## 9   28102 -5.639098e-04           2
## 10  28102 -2.359882e-04           3
## 11  34104  2.594810e-04           2
## 12  34104 -2.747253e-05           3
## 13  43108 -4.433498e-04           2
## 14  43108 -1.098901e-04           3
## 15 103114 -3.937008e-04           2
## 16 103114 -4.123711e-04           3
## 17 114101 -3.448276e-04           2
## 18 114101  2.604167e-05           3
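The first two rows of that output can be checked by hand. The sketch below uses base R only and rebuilds the first three rows of the question's table; writing the empty cells as NA and the helper names `fev_cols`, `date_cols` and `pairwise` are my own assumptions, not part of the original post. The slopes are in FEV1 units per day.

```r
# First three rows copied from the question; empty cells are written as NA here.
df <- read.table(text = "
ID FEV1_Date11 FEV1_Date12 FEV1_Date13 DATE11 DATE12 DATE13
18105 1.35 1.25 1.04 6/9/1990 8/16/1991 8/27/1993
18200 0.87 0.85 NA 9/12/1991 3/11/1993 NA
18303 0.79 NA NA 4/23/1992 NA NA
", header = TRUE, stringsAsFactors = FALSE)

fev_cols  <- c("FEV1_Date11", "FEV1_Date12", "FEV1_Date13")
date_cols <- c("DATE11", "DATE12", "DATE13")

# Parse the m/d/Y text into Date objects.
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%m/%d/%Y")

# Pairwise slope for the first ID: change in FEV1 divided by elapsed days.
fev  <- unlist(df[1, fev_cols])
days <- as.numeric(unlist(df[1, date_cols]))  # days since 1970-01-01

pairwise <- unname(diff(fev) / diff(days))
print(diff(days))  # 433 and 742 days between visits
print(pairwise)    # compare with the two slopes listed for ID 18105
```

For ID 18105 this is -0.10 / 433 and -0.21 / 742, which matches the two slopes listed for that ID.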

Alternate munging:

group_by(df, ID) %>% 
  mutate_at(vars(starts_with("DATE")), ~ as.Date(., "%m/%d/%Y")) %>% 
  # reshape to one observation per row, in case you want a tidier layout;
  # as.Date() with an origin restores the Date class that unlist() drops
  do(tibble(date=as.Date(unlist(.[,5:7]), origin="1970-01-01"),
            reading=unlist(.[,2:4]))) %>% 
  # slope between consecutive visits, in FEV1 units per day
  do(tibble(slope=diff(.$reading)/unclass(diff(.$date))))
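If what you actually want is a single slope per ID stored in a new column (the question's example gives one number for the first observation), one possible approach is to fit a least-squares line through each ID's available points. This is only a sketch in base R under my own assumptions: the empty cells are written as NA, IDs with fewer than two readings get NA, and `slope_one`, `fev_cols` and `date_cols` are hypothetical helper names.

```r
# Data copied from the question; empty cells are written as NA here.
df <- read.table(text = "
ID FEV1_Date11 FEV1_Date12 FEV1_Date13 DATE11 DATE12 DATE13
18105 1.35 1.25 1.04 6/9/1990 8/16/1991 8/27/1993
18200 0.87 0.85 NA 9/12/1991 3/11/1993 NA
18303 0.79 NA NA 4/23/1992 NA NA
24204 4.05 3.95 3.99 6/8/1992 3/22/1993 11/5/1994
28102 1.19 1.04 0.96 10/31/1990 7/24/1991 6/27/1992
34104 1.03 1.16 1.15 7/25/1992 12/8/1993 12/7/1994
43108 0.92 0.83 0.79 6/23/1993 1/12/1994 1/11/1995
103114 2.43 2.28 2.16 6/5/1994 6/21/1995 4/7/1996
114101 0.73 0.59 0.6 6/25/1989 8/5/1990 8/24/1991
", header = TRUE, stringsAsFactors = FALSE)

fev_cols  <- c("FEV1_Date11", "FEV1_Date12", "FEV1_Date13")
date_cols <- c("DATE11", "DATE12", "DATE13")

# Parse the m/d/Y text into Date objects.
df[date_cols] <- lapply(df[date_cols], as.Date, format = "%m/%d/%Y")

# Least-squares slope of FEV1 against time (in days) for one ID.
# Returns NA when fewer than two complete (reading, date) pairs exist.
slope_one <- function(fev, days) {
  ok <- !is.na(fev) & !is.na(days)
  if (sum(ok) < 2) return(NA_real_)
  y <- fev[ok]
  x <- days[ok]
  unname(coef(lm(y ~ x))[2])
}

# One slope per row, stored in a new column.
df$slope <- sapply(seq_len(nrow(df)), function(i) {
  slope_one(unlist(df[i, fev_cols]),
            as.numeric(unlist(df[i, date_cols])))
})

print(df[, c("ID", "slope")])
```

The slope is in FEV1 units per day; multiplying by 365.25 gives an approximate per-year decline. With exactly two readings the fitted line passes through both points, so the result equals the pairwise slope.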