Edu - 2 months ago 6

R Question

I have a data frame with a similar longitudinal structure as

`data`

`data = data.frame (`

ID = c("a","a","a","b","b","b","c","c", "c"),

period = c(1,2,3,1,2,3,1,2,3),

size = c(3,3,NA, NA, NA,1, 14,14, 14))

The values of the variable

`size`

`size`

with the value of

`size`

`ID`

`ID`

The desired data frame should look something similar to:

`data.1`

ID period value

a 1 3

a 2 3

a 3 3

b 1 1

b 2 1

b 3 1

c 1 14

c 2 14

c 3 14

I have tried different combinations of the formula below but I don't get the result I am looking for.

`library(dplyr)`

data.1 = data %>% group_by(ID) %>%

mutate(new.size = ifelse(is.na(size), !is.na(size),

ifelse(!is.na(size), size, 0)))

That yields the following:

`data.1`

Source: local data frame [9 x 4]

Groups: ID [3]

ID period size new.size

(fctr) (dbl) (dbl) (dbl)

1 a 1 3 3

2 a 2 3 3

3 a 3 NA 0

4 b 1 NA 0

5 b 2 NA 0

6 b 3 1 1

7 c 1 14 14

8 c 2 14 14

9 c 3 14 14

I would be grateful if someone could give me a hint on how to get the right solution.

Answer

here another solution using `dplyr`

with `na.omit`

```
group_by(data, ID) %>%
mutate(value=na.omit(size)[1])
Source: local data frame [9 x 4]
Groups: ID [3]
ID period size value
<fctr> <dbl> <dbl> <dbl>
1 a 1 3 3
2 a 2 3 3
3 a 3 NA 3
4 b 1 NA 1
5 b 2 NA 1
6 b 3 1 1
7 c 1 14 14
8 c 2 14 14
9 c 3 14 14
```

note that you can replace `na.omit`

with `max(size, na.rm=TRUE)`

if you are looking for maximum for example.

Source (Stackoverflow)

Comments