waealu waealu - 2 days ago 3
R Question

R - Carry Forward Values

Here's my starting dataset:

> data
ID Record Value
A 1 100
A 3 200
A 4 300
B 1 800


For each ID, I want a record for each number 1 through 4. If a record is not available, create it using the most recent record.

The final dataset should look like this:

> newdata
ID Updt_Record Value
A 1 100
A 2 100
A 3 200
A 4 300
B 1 800
B 2 800
B 3 800
B 4 800


To do this, I am currently using
dplyr
:

library(dplyr)

data1 <- data %>% group_by(ID) %>% filter(Record <= 1) %>% filter(Record == max(Record)) %>% mutate(Updt_Record = 1)
data2 <- data %>% group_by(ID) %>% filter(Record <= 2) %>% filter(Record == max(Record)) %>% mutate(Updt_Record = 2)
data3 <- data %>% group_by(ID) %>% filter(Record <= 3) %>% filter(Record == max(Record)) %>% mutate(Updt_Record = 3)
data4 <- data %>% group_by(ID) %>% filter(Record <= 4) %>% filter(Record == max(Record)) %>% mutate(Updt_Record = 4)

newdata <- data1 %>%
bind_rows(data2) %>% bind_rows(data3) %>% bind_rows(data4) %>%
arrange(ID, Record) %>%
select(ID, Updt_Record, Value)


Is there a more efficient way of doing this? Thanks!

Answer
library(tidyr)
library(dplyr)
data %>%
  mutate(Record=factor(Record, 1:4)) %>%
  complete(ID, Record) %>%
  fill(Value) %>%
  mutate(Record=as.character(as.numeric(Record)))

# # A tibble: 8 x 3
#       ID Record Value
#   <fctr>  <dbl> <int>
# 1      A      1   100
# 2      A      2   100
# 3      A      3   200
# 4      A      4   300
# 5      B      1   800
# 6      B      2   800
# 7      B      3   800
# 8      B      4   800

Data

data <- structure(list(ID = structure(c(1L, 1L, 1L, 2L), .Label = c("A", 
"B"), class = "factor"), Record = c(1L, 3L, 4L, 1L), Value = c(100L, 
200L, 300L, 800L)), .Names = c("ID", "Record", "Value"), class = "data.frame", row.names = c(NA, 
-4L))