Edu - 1 year ago 68
R Question

# Recoding missing data in longitudinal data frames with R

I have a data frame with a similar longitudinal structure as

`data`
:

``````data = data.frame (
ID = c("a","a","a","b","b","b","c","c", "c"),
period = c(1,2,3,1,2,3,1,2,3),
size = c(3,3,NA, NA, NA,1, 14,14, 14))
``````

The values of the variable
`size`
are fixed so that each period has the same value for
`size`
. Yet some observations have missing values. My aim consists of replacing these missing values
with the value of
`size`
associated with the periods where there is no missing (e.g. 3 for
`ID`
"a" and 1 for
`ID`
"b").

The desired data frame should look something similar to:

``````data.1

ID period value
a      1     3
a      2     3
a      3     3
b      1     1
b      2     1
b      3     1
c      1    14
c      2    14
c      3    14
``````

I have tried different combinations of the formula below but I don't get the result I am looking for.

``````library(dplyr)

data.1 = data %>% group_by(ID) %>%
mutate(new.size = ifelse(is.na(size), !is.na(size),
ifelse(!is.na(size), size, 0)))
``````

That yields the following:

``````data.1
Source: local data frame [9 x 4]
Groups: ID [3]

ID period  size new.size
(fctr)  (dbl) (dbl)    (dbl)
1      a      1     3        3
2      a      2     3        3
3      a      3    NA        0
4      b      1    NA        0
5      b      2    NA        0
6      b      3     1        1
7      c      1    14       14
8      c      2    14       14
9      c      3    14       14
``````

I would be grateful if someone could give me a hint on how to get the right solution.

here another solution using `dplyr` with `na.omit`

``````group_by(data, ID) %>%
mutate(value=na.omit(size)[1])
Source: local data frame [9 x 4]
Groups: ID [3]

ID period  size value
<fctr>  <dbl> <dbl> <dbl>
1      a      1     3     3
2      a      2     3     3
3      a      3    NA     3
4      b      1    NA     1
5      b      2    NA     1
6      b      3     1     1
7      c      1    14    14
8      c      2    14    14
9      c      3    14    14
``````

note that you can replace `na.omit` with `max(size, na.rm=TRUE)` if you are looking for maximum for example.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download