L.R. - 3 months ago
R Question

Merge two data frames from a national survey with panel and non-panel individuals from two different years (in R)

I searched the website but didn't find an answer to my question; if one already exists, please post the link.

I have two data frames from a national survey: each year, some families have already been interviewed and others are new. I want to merge the data frames so that only the families present in both remain, matched so that each individual has the 2014 values in one row and the 2012 values in the next (for the sake of simplicity I omitted the other social variables present in the survey).

For example: df1 and df2

> df1 <- data.frame(nquest=c(173, 526, 1066, 1066), nord=c(1,1,1,2), year=c(2014, 2014, 2014, 2014))
> structure(df1)
nquest nord year
1 173 1 2014
2 526 1 2014
3 1066 1 2014
4 1066 2 2014

> df2 <- data.frame(nquest=c(173, 526, 3456, 3456), nord=c(1,1,1,2), year=c(2012, 2012, 2012, 2012))
> structure(df2)
nquest nord year
1 173 1 2012
2 526 1 2012
3 3456 1 2012
4 3456 2 2012


where nquest is the family number and nord is the member within the family (e.g. 1 = father, 2 = mother).

I want to merge them in this way:

> df <- data.frame(nquest=c(173, 173, 526,526), nord=c(1,1,1,1), year=c(2014, 2012, 2014, 2012))
> structure(df)
nquest nord year
1 173 1 2014
2 173 1 2012
3 526 1 2014
4 526 1 2012


I tried to merge them:

tot <- merge(df1, df2, by=c("nquest", "nord"))
structure(tot)
nquest nord year.x year.y
1 173 1 2014 2012
2 526 1 2014 2012


and I tried the rbind function:

> tot <- rbind(df1, df2)
> structure(tot)
nquest nord year
1 173 1 2014
2 526 1 2014
3 1066 1 2014
4 1066 2 2014
5 173 1 2012
6 526 1 2012
7 3456 1 2012
8 3456 2 2012


Thank you

Answer

This is an approach using "dplyr"; there is probably a better way to do the filtering, though.

library(dplyr)
# Stack both years, keep families present in both, newest year first
bind_rows(df1, df2) %>%
  filter(nquest %in% df1$nquest & nquest %in% df2$nquest) %>%
  arrange(nquest, desc(year))

The second argument to "arrange", desc(year), is not strictly necessary in this case because bind_rows already puts the 2014 rows first, but I am including it for completeness.
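
Note that the filter above matches on nquest only, so it works at the family level: a family member who appears in only one of the two years would still be kept as long as the family appears in both. If the match should be per individual, one possible alternative is to match on both nquest and nord. Below is a minimal base R sketch of that idea, not the approach from the answer above; the names keys and panel are made up for illustration, and it assumes each (nquest, nord) pair appears at most once per year.

```r
# Example data copied from the question
df1 <- data.frame(nquest = c(173, 526, 1066, 1066), nord = c(1, 1, 1, 2), year = 2014)
df2 <- data.frame(nquest = c(173, 526, 3456, 3456), nord = c(1, 1, 1, 2), year = 2012)

# Individuals (nquest + nord pairs) present in both years.
# merge() joins on the shared column names and keeps only matching rows.
keys <- merge(df1[c("nquest", "nord")], df2[c("nquest", "nord")])

# Stack both years, then keep only the rows whose key pair is in `keys`
panel <- merge(rbind(df1, df2), keys, by = c("nquest", "nord"))

# Order by family, then member, then most recent year first
panel <- panel[order(panel$nquest, panel$nord, -panel$year), ]
rownames(panel) <- NULL
panel
```

With the example data this should give the same four rows as the desired df in the question. Any other survey variables would be carried along unchanged, provided both data frames have the same columns so that rbind can stack them.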