Jake Jake - 4 months ago 7
R Question

Merge by column with multiple rows

I have to separate data frame with a common column named col_id.

My problem is the simple merge is not the ideal for my case.

Here is an example structure of df1 col_id

col_id,stock, ch2
id_100,stock 2, yes
id_100002,stock 2, no
id_100003,stock 2, no


And for the second df

col_id,num, cat1
id_100,num 2, 0
id_100,num 2, 1
id_100,num 2, 0
id_100002,num 2, 1
id_100002,num 2, 1
id_100002,num 2, 1
id_100003,num 2, 1
id_100003,num 2, 1


The output i would like to have is to fill all cells of the second df with the same value of df one. Example of the output

col_id,num, cat1,stock, ch2
id_100,num 2, 0,stock 2, yes
id_100,num 2, 1,stock 2, yes
id_100,num 2, 0,stock 2, yes
id_100002,num 2, 1,stock 2, no
id_100002,num 2, 1,stock 2, no
id_100002,num 2, 1,stock 2, no
id_100003,num 2, 1,stock 2, no
id_100003,num 2, 1,stock 2, no

Answer

Try:

install.packages('dplyr')
library(dplyr)

mytext1 = "col_id,stock, ch2
id_100,stock 2, yes
id_100002,stock 2, no
id_100003,stock 2, no"
mydf1 <- read.table(text=mytext1, header=T, sep=",")

mytext2 = "col_id,num, cat1
id_100,num 2, 0
id_100,num 2, 1
id_100,num 2, 0
id_100002,num 2, 1
id_100002,num 2, 1
id_100002,num 2, 1
id_100003,num 2, 1
id_100003,num 2, 1"

mydf2 <- read.table(text=mytext2, header=T, sep=",")
output_df <- left_join(mydf2,mydf1, by="col_id")

  col_id    num    cat1  stock   ch2
 id_100    num 2    0   stock 2  yes
 id_100    num 2    1   stock 2  yes
 id_100    num 2    0   stock 2  yes
 id_100002 num 2    1   stock 2   no
 id_100002 num 2    1   stock 2   no
 id_100002 num 2    1   stock 2   no
 id_100003 num 2    1   stock 2   no
 id_100003 num 2    1   stock 2   no
Comments