pbc pbc - 23 days ago 7
R Question

Find pairwise matches between two datasets by row (every cell)

I have two tables (females and males) with presence-absence data, and I'd like to compare them pairwise, row by row, to count the cells shared by each female-male pair (cell content = 1 in both rows).

I've seen similar questions here on SO, but most of them look for differences between full rows rather than comparing cell by cell. I believe what I need is similar to this post, but I wasn't able to adapt it to my specific case.

Here is an example (my actual data are .csv tables exported from Excel):

females <- as.data.frame(matrix(c(0,0,0,1,1,0,1,0,1,0,1,0,1,0,1,1,1,0,1,1,1,0,1,1,1), nrow = 5, byrow = TRUE))
males <- as.data.frame(matrix(c(1,0,0,1,1,0,1,0,1,1,1,0,1,0,1,1,1,0,1,1,1,0,1,0,1), nrow = 5, byrow = TRUE))
rownames(females) <- c("female_1", "female_2", "female_3", "female_4", "female_5")
rownames(males) <- c("male_1", "male_2", "male_3", "male_4", "male_5")
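
Since the real data come from .csv files, loading one of them might look roughly like the sketch below. The file contents, the `id` column, and the names `females_csv`, `females_in` and `f_in` are all made up for illustration; the sketch assumes the first column of each file holds the row labels.

```r
# Hypothetical stand-in for a CSV exported from Excel (written to a temp file
# here only so that the sketch runs on its own)
females_csv <- tempfile(fileext = ".csv")
writeLines(c("id,V1,V2,V3",
             "female_1,0,1,1",
             "female_2,1,0,1"), females_csv)

# row.names = 1 uses the first column as row names, so only 0/1 columns remain
females_in <- read.csv(females_csv, row.names = 1)

# Check that the table really is a numeric 0/1 matrix without missing values
f_in <- as.matrix(females_in)
stopifnot(is.numeric(f_in), !anyNA(f_in), all(f_in %in% c(0, 1)))
```

The males table would be read the same way.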


My ultimate goal is a new dataset with females on the rows and males on the columns, holding the number of shared cells for every possible pair (I think I can get this part done with reshape once I figure out the other part).

         male_1 male_2 male_3 male_4 male_5
female_1      2      2      1      2      1
.
.
.
.


I appreciate any help!

Answer

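What you want is the matrix product of females with the transpose of males (base R's tcrossprod() computes the same thing). Each entry is the dot product of one female row and one male row, and with 0/1 data that is exactly the number of cells that are 1 in both. This should do it: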

as.matrix(females) %*% t(as.matrix(males))



         male_1 male_2 male_3 male_4 male_5
female_1      2      2      1      2      1
female_2      1      2      0      2      0
female_3      2      1      3      2      3
female_4      3      3      2      4      2
female_5      3      2      3      3      3
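
As a sanity check, the same counts can be reproduced by explicitly comparing every female row with every male row. This is only a sketch on the example data from the question; the names `f`, `m`, `shared_matrix` and `shared_explicit` are made up here.

```r
females <- as.data.frame(matrix(c(0,0,0,1,1, 0,1,0,1,0, 1,0,1,0,1, 1,1,0,1,1, 1,0,1,1,1),
                                nrow = 5, byrow = TRUE))
males <- as.data.frame(matrix(c(1,0,0,1,1, 0,1,0,1,1, 1,0,1,0,1, 1,1,0,1,1, 1,0,1,0,1),
                              nrow = 5, byrow = TRUE))
rownames(females) <- paste0("female_", 1:5)
rownames(males) <- paste0("male_", 1:5)

f <- as.matrix(females)
m <- as.matrix(males)

# Matrix product: entry [i, j] is the dot product of female row i and male row j
shared_matrix <- f %*% t(m)

# Explicit version: for each pair, count the columns where both rows contain 1
shared_explicit <- sapply(rownames(m), function(j)
  sapply(rownames(f), function(i) sum(f[i, ] == 1 & m[j, ] == 1)))

# Both approaches should agree; tcrossprod(f, m) is a shorthand for f %*% t(m)
all.equal(shared_matrix, shared_explicit, check.attributes = FALSE)
all.equal(shared_matrix, tcrossprod(f, m))
```

The explicit loop is much slower on large tables, so it is only useful for checking; the matrix product is the version to keep. If a data frame is needed afterwards, as.data.frame(shared_matrix) keeps the female and male labels.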