Milhouse Milhouse - 4 months ago 15
R Question

Dropping variables in R with long data

I'm working with longitudinal data in long format, and for what I want to do I'm essentially trying to transform it into a panel dataset. To give an idea of what I have at the moment:

ID CYRB VAR VALUE
1 1983 ATTEN98 1
1 1983 ATTEN00 1
1 1983 ATTEN02 0
1 1983 ATTEN04 0
2 1979 ATTEN98 1
2 1979 ATTEN00 0
2 1979 ATTEN02 0
2 1979 ATTEN04 0
....


Here ATTENXX is a dummy variable denoting whether individual i was attending school in the year of the interview (XX is the two-digit interview year). My plan is to keep only the variable for the interview at which the respondent was either 19 or 20. For example, for an individual born in 1983 this would mean keeping only the ATTEN02 variable. I've been trying to do it with a combination of filter (from dplyr) and ifelse, but I just can't get the syntax right and usually end up with an error.

Answer

Maybe something like this:

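library(dplyr)

dat %>% 
  # Two-digit year suffix of VAR, e.g. 2 from "ATTEN02"
  mutate(varnum = as.numeric(substr(VAR, 6, 7)),
         # Expand to a four-digit year: below 50 -> 2000s, otherwise 1900s
         varnum = ifelse(varnum < 50, varnum + 2000, varnum + 1900)) %>%
  # Keep rows where interview year minus birth year is 19 or 20
  filter((varnum - CYRB) %in% 19:20) %>%
  select(-varnum)

  ID CYRB     VAR VALUE
1  1 1983 ATTEN02     0
2  2 1979 ATTEN98     1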
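
If you want to try this without dplyr, here is a minimal base R sketch of the same idea. The data frame is rebuilt by hand from the eight rows shown in the question, and the names yy, interview_year and result are my own, not from the original post. It also assumes every VAR name is "ATTEN" followed by a two-digit year, and that suffixes below 50 belong to the 2000s.

```r
# Toy data copied from the rows in the question (long format:
# one row per respondent and interview wave)
dat <- data.frame(
  ID    = rep(1:2, each = 4),
  CYRB  = rep(c(1983, 1979), each = 4),
  VAR   = rep(c("ATTEN98", "ATTEN00", "ATTEN02", "ATTEN04"), times = 2),
  VALUE = c(1, 1, 0, 0, 1, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Two-digit year suffix of VAR (characters 6 and 7)
yy <- as.numeric(substr(dat$VAR, 6, 7))

# Assumed cutoff: suffixes below 50 are 20xx, the rest are 19xx
interview_year <- ifelse(yy < 50, yy + 2000, yy + 1900)

# Keep rows where interview year minus birth year is 19 or 20
result <- dat[(interview_year - dat$CYRB) %in% 19:20, ]
rownames(result) <- NULL

print(result)
```

Note that interview year minus birth year is only the age the respondent reaches during that calendar year, so someone interviewed before their birthday would actually be a year younger. If you have the interview date and birth month you could tighten the condition.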