Squeezie Squeezie - 2 months ago 14
R Question

Join 2 dataframes, while keeping exact left table

I want to join two tables (timeseries) while keeping every left entry (and NOT adding ones from the right) and adding things from the right only when they exist left and right.
Kind of a left inner join (which probably does not exist).

For example I wanna join A with B and want to get AB:

A=
1
2
3
4

B=
1 X
2 Y
2 Z
4 Z
5 ZZ

AB=
1 X
2 NaN
3 NaN
4 Z


2 and 3 are NaN because 2 exists multiple times and 3 does not exist on the right side

Thank you for the help

Answer

You mention 'time-series' data in your question. That type of data can come in a general table format or a specific R ts object. The ts class is a special type of vector or matrix given special properties to assist in time-series analysis. We will assume for this approach that you have the latter case, an object of class(A) [1] "data.frame".

This is less a merge than a conditional match. We can test if there is one and only one match of the A values in B:

indx <- sapply(A$col1, function(x) sum(B$col1 %in% x) == 1L)
data.frame(v1=A$col1, v2=ifelse(indx, B$col2, NaN), stringsAsFactors=FALSE)
#   v1  v2
# 1  1   X
# 2  2 NaN
# 3  3 NaN
# 4  4   Z

Note: Be sure to not fall into the factor trap. R automatically codes character values into class factor not into class character as you might expect at first. To avoid this issue use stringsAsFactors=FALSE,

#Data
A <- data.frame(col1=1:4L)
B <- data.frame(col1=c(1,2,2,4), col2=c("X", "Y", "Z", "Z"), stringsAsFactors=FALSE)
Comments