User23 - 3 months ago
R Question

Combine data frames in R unless entry already exists

I have a large data frame A that has sales figures for different items for only a few weeks and doesn't mention the weeks where no sales occurred. I therefore created a second data frame B where I have included all weeks with sales set to 0. I now want to add B to A but not for the weeks where A already mentions a sale. I was hoping to do this via an added combination variable but can't seem to figure out a fast way to do this.

So I have for example

A                                B
Week ID Sales Combination        Week ID Sales Combination
1    X  5     1_X                1    X  0     1_X
2    X  6     2_X                2    X  0     2_X
5    X  5     5_X                3    X  0     3_X
6    X  4     6_X                4    X  0     4_X
1    Y  2     1_Y                5    X  0     5_X
3    Y  2     3_Y                6    X  0     6_X
5    Y  2     5_Y                1    Y  0     1_Y
                                 2    Y  0     2_Y
                                 3    Y  0     3_Y
                                 4    Y  0     4_Y
                                 5    Y  0     5_Y


And what I want is this

Week ID Sales Combination
1    X  5     1_X
2    X  6     2_X
3    X  0     3_X
4    X  0     4_X
5    X  5     5_X
6    X  4     6_X
1    Y  2     1_Y
2    Y  0     2_Y
3    Y  2     3_Y
4    Y  0     4_Y
5    Y  2     5_Y


Hope this makes it more or less clear.

Answer

Let dfA be the first data.frame and dfB the second one. You could do:

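# Keep every row of dfA, plus only the rows of dfB whose Combination
# does not already appear in dfA. (Comparing with != would compare the two
# columns position by position, with recycling, which is not what we want.)
new_df = rbind(dfA, dfB[!(dfB$Combination %in% dfA$Combination), ])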

# Order the data frame by ID, then by Week, to match the desired output
# (sorting on Combination would give 1_X, 1_Y, 2_X, ... instead)
new_df = new_df[order(new_df$ID, new_df$Week), ]

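Alternatively, you could start from dfB as the new data frame and then use match to find the rows that also exist in dfA and copy their Sales values into the right places. Since dfB already lists every week in order, no sorting is needed afterwards.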
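
A minimal sketch of the match approach, assuming the example data from the question and that each Combination value appears at most once in dfA. The helper names idx and found are illustrative and not part of the original answer.

```r
# Rebuild the example frames from the question.
# stringsAsFactors = FALSE keeps ID and Combination as plain character vectors.
dfA <- data.frame(
  Week  = c(1, 2, 5, 6, 1, 3, 5),
  ID    = c("X", "X", "X", "X", "Y", "Y", "Y"),
  Sales = c(5, 6, 5, 4, 2, 2, 2),
  stringsAsFactors = FALSE
)
dfA$Combination <- paste(dfA$Week, dfA$ID, sep = "_")

# dfB: every week for every ID, with Sales set to 0
dfB <- data.frame(
  Week  = c(1:6, 1:5),
  ID    = c(rep("X", 6), rep("Y", 5)),
  Sales = 0,
  stringsAsFactors = FALSE
)
dfB$Combination <- paste(dfB$Week, dfB$ID, sep = "_")

# Start from the complete grid and overwrite Sales where dfA has an entry.
new_df <- dfB
idx <- match(new_df$Combination, dfA$Combination)  # NA where dfA has no such row
found <- !is.na(idx)                               # assumed helper name
new_df$Sales[found] <- dfA$Sales[idx[found]]

print(new_df)
```

Because new_df keeps the row order of dfB, the result comes out sorted by ID and Week without a separate ordering step. If dfA could contain the same Combination more than once, match would only pick up the first occurrence, so the duplicates would need to be aggregated first.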