Yi Zhou - 3 months ago
R Question

R removing extra rows in a matrix compared with another matrix

I'll illustrate my question with an example:

a1
# x y
#Adam 1 a
#Mike 2 b
#Mary 3 c

a2
# i j
#Adam 4 e
#Mary 5 f


And what I want is this:

a3
# x y
#Adam 1 a
#Mary 3 c


where the values in a1 don't change, and rows that don't appear in a2 are removed (by rowname only). I searched for quite a while and none of the packages, such as compare or data.frame, worked for me.

I'm totally new to R, and working on quite a large dataset, so please help me with a solution that runs quick :)

Thanks!

Answer

Posting my comment as an answer:

a3 = a1[rownames(a1) %in% rownames(a2), ]

This is pretty simple. We look at which row names from a1 are also in a2, and we keep those rows. It should work whether your data is in a matrix or a data.frame.
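Here is a small runnable sketch of that one-liner on the example from the question. The data frames are rebuilt by hand from the values shown there, and `drop = FALSE` is my own addition rather than part of the original line: if you have a matrix and only one row matches, it should stop R from collapsing the result into a plain vector.

```r
# Rebuild the question's example (values copied from the question).
a1 <- data.frame(x = 1:3, y = c("a", "b", "c"),
                 row.names = c("Adam", "Mike", "Mary"),
                 stringsAsFactors = FALSE)
a2 <- data.frame(i = 4:5, j = c("e", "f"),
                 row.names = c("Adam", "Mary"),
                 stringsAsFactors = FALSE)

# TRUE for each row of a1 whose row name also occurs in a2.
keep <- rownames(a1) %in% rownames(a2)

# Keep those rows; drop = FALSE (an addition here) preserves the 2-D shape.
a3 <- a1[keep, , drop = FALSE]
a3
```

The rows come back in a1's original order, and the values in a1 are untouched.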

As I commented on the other answer, this can also be viewed as an inner join. You can do an inner join with base::merge or, if you really need speed, with the data.table or dplyr packages. To use those, though, you will need to convert your matrices to data frames and add the row names as their own column. If your data really is in a matrix, the rownames() approach shown above will probably be efficient enough.
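For comparison, a rough sketch of the inner-join route with base::merge, using the same example data. The column name `name` is just something I picked for illustration. Note that, unlike the row-name filter, the join also brings along a2's columns, so they are dropped again at the end.

```r
# Same example data as in the question.
a1 <- data.frame(x = 1:3, y = c("a", "b", "c"),
                 row.names = c("Adam", "Mike", "Mary"),
                 stringsAsFactors = FALSE)
a2 <- data.frame(i = 4:5, j = c("e", "f"),
                 row.names = c("Adam", "Mary"),
                 stringsAsFactors = FALSE)

# Move the row names into an ordinary column ("name" is a made-up column name).
a1$name <- rownames(a1)
a2$name <- rownames(a2)

# merge() does an inner join by default (all = FALSE): only names present in both survive.
joined <- merge(a1, a2, by = "name")

# Keep only the columns that came from a1.
a3 <- joined[, c("name", "x", "y")]
a3
```

merge sorts the result by the join column by default, so the row order can differ from a1's; the rownames() filter keeps a1's order as it is.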
