pbc - 7 months ago 42

R Question

I have two tables in R (females and males) with presence-absence data. I'd like to do pairwise, row-by-row comparisons between them to find the number of cells not shared by each pair (i.e., the number of cells equal to 1 in the female but not in the male, and vice versa).

I know that the cross product (%*%) does the opposite of what I need: it creates a new matrix containing the number of cells shared by each female-male pair (i.e., the number of cells equal to 1 in both).

Here is an example dataset:

```
females <- as.data.frame(matrix(c(0,0,0,1,1,0,1,0,1,0,1,0,1,0,1,1,1,0,1,1,1,0,1,1,1), nrow=5, byrow=TRUE))
males <- as.data.frame(matrix(c(1,0,0,1,1,0,1,0,1,1,1,0,1,0,1,1,1,0,1,1,1,0,1,0,1), nrow=5, byrow=TRUE))
rownames(females) <- c("female_1","female_2","female_3","female_4","female_5")
rownames(males) <- c("male_1","male_2","male_3","male_4","male_5")
```

So, if I do the cross product

`as.matrix(females) %*% t(as.matrix(males))`

I get this:

```
         male_1 male_2 male_3 male_4 male_5
female_1      2      2      1      2      1
female_2      1      2      0      2      0
female_3      2      1      3      2      3
female_4      3      3      2      4      2
female_5      3      2      3      3      3
```

But I need this (only the first row is shown):

```
         male_1 male_2 male_3 male_4 male_5
female_1      1      1      3      2      3
...
```

In reality, my dataset is not square: I have 47 females and 32 males, so the result should be a 47 x 32 matrix.

Thanks for any help!!!

Answer

Set up an object to receive results:

```
# One row per female, one column per male
xy <- matrix(NA, nrow(females), nrow(males))
for (x in seq_len(nrow(females))) {
  for (y in seq_len(nrow(males))) {
    # Count the cells where female x and male y differ
    xy[x, y] <- sum(females[x, ] != males[y, ])
  }
}
```
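As a hedged sketch, the loop above can be wrapped in a small function so that it is reusable when the two tables have different numbers of rows, as in the 47 x 32 case. The name `count_mismatches` and the toy data are made up for illustration; they are not part of the original answer. Converting to matrices first is an assumption about what is convenient, since row extraction from a matrix gives a plain vector.

```r
# Hypothetical helper, not from the original answer.
count_mismatches <- function(a, b) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  stopifnot(ncol(a) == ncol(b))  # both tables must describe the same cells
  out <- matrix(NA_integer_, nrow(a), nrow(b),
                dimnames = list(rownames(a), rownames(b)))
  for (i in seq_len(nrow(a))) {
    for (j in seq_len(nrow(b))) {
      out[i, j] <- sum(a[i, ] != b[j, ])  # cells where the two rows differ
    }
  }
  out
}

# Made-up toy data: 3 females, 2 males, 3 cells each
f <- matrix(c(0, 0, 1,
              1, 0, 1,
              1, 1, 1), nrow = 3, byrow = TRUE,
            dimnames = list(paste0("female_", 1:3), NULL))
m <- matrix(c(0, 0, 1,
              1, 1, 0), nrow = 2, byrow = TRUE,
            dimnames = list(paste0("male_", 1:2), NULL))

res <- count_mismatches(f, m)
dim(res)  # 3 rows (females) by 2 columns (males)
```

The result always has one row per row of the first argument and one column per row of the second, so no square input is required.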

This could also have been done with nested `sapply` calls, which is a bit cleaner since no separate setup step is needed (but only a little cleaner and, contrary to popular myth, not any faster):

```
xy <- sapply(seq_len(nrow(females)), function(x)
  sapply(seq_len(nrow(males)), function(y)
    sum(females[x, ] != males[y, ])))
xy
#-----
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    3    2    1    1
# [2,]    1    1    4    1    3
# [3,]    3    5    0    3    1
# [4,]    2    2    3    0    2
# [5,]    3    5    0    3    1
# Each outer sapply() call fills one column, so rows are males here;
# transpose before labelling so that rows are females, as in the loop.
xy <- t(xy)
dimnames(xy) <- list(rownames(females), rownames(males))
```
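A different route, not taken in the answer above, builds on the matrix product from the question. For 0/1 data, the number of differing cells between two rows equals the ones in the first row plus the ones in the second row minus twice the ones they share. The sketch below assumes the data are strictly 0/1 and stored as numeric matrices; the names `shared` and `mismatch` are illustrative only.

```r
# Example data from the question, stored as plain matrices
females <- matrix(c(0,0,0,1,1, 0,1,0,1,0, 1,0,1,0,1, 1,1,0,1,1, 1,0,1,1,1),
                  nrow = 5, byrow = TRUE,
                  dimnames = list(paste0("female_", 1:5), NULL))
males <- matrix(c(1,0,0,1,1, 0,1,0,1,1, 1,0,1,0,1, 1,1,0,1,1, 1,0,1,0,1),
                nrow = 5, byrow = TRUE,
                dimnames = list(paste0("male_", 1:5), NULL))

# shared[i, j]: cells equal to 1 in both female i and male j
shared <- females %*% t(males)

# mismatches = ones in female + ones in male - 2 * shared ones
mismatch <- outer(rowSums(females), rowSums(males), "+") - 2 * shared
dimnames(mismatch) <- list(rownames(females), rownames(males))

mismatch["female_1", ]
# male_1 male_2 male_3 male_4 male_5
#      1      1      3      2      3
```

Because this uses only matrix operations, it involves no explicit loops and works unchanged when the two tables have different numbers of rows, as long as they have the same number of columns.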

Source (Stackoverflow)