Khalil Ardat - 6 months ago 46

R Question

I have two tables with different row numbers. I would like to merge the tables based on the content of two columns. However, the catch is I don't want the order of the variables to matter when merging. Example:

`Gene1 Gene2 p-value`

TP53 ARID1A 0.001

ATM ATR 0.0005

`Gene1 Gene2 p-value`

ARID1A TP53 0.0007

ATM ATR 0.004

I tried:

`merge(Table1, Table2, by = c("Gene1", "Gene2"), all.x = TRUE)`

But the problem is that it will only merge 'ATM' and 'ATR' but not 'TP53' and 'ARID1A' because they are not in the same order.

Is there a way to merge the two tables irrespective of the column order?

Answer

Using `sqldf`

:

```
library(sqldf)
sqldf("
SELECT df1.*,
df2.`p.value`
FROM df1, df2
WHERE (df1.Gene1 = df2.Gene1 AND
df1.Gene2 = df2.Gene2) OR
(df1.Gene1 = df2.Gene2 AND
df1.Gene2 = df2.Gene1)")
# Gene1 Gene2 p.value p.value
# 1 TP53 ARID1A 1e-03 7e-04
# 2 ATM ATR 5e-04 4e-03
```