mce - 8 months ago

R Question

I was wondering whether there is an easy way to compute the maximum number of identical elements (same value in the same row) between any two columns of a matrix in R.

For example, I have a matrix

`test <- replicate(10, sample((0:3), 10, replace = TRUE))`

test

          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    3    0    1    0    2    2    1    0    2     0
     [2,]    1    1    3    2    0    2    3    0    2     2
     [3,]    2    3    0    0    1    2    0    3    0     2
     [4,]    2    2    1    1    2    0    0    1    1     0
     [5,]    2    0    1    2    0    1    1    1    0     0
     [6,]    1    0    1    3    2    3    3    1    3     2
     [7,]    0    1    3    2    1    0    1    2    1     1
     [8,]    0    3    1    3    0    2    3    1    1     1
     [9,]    2    3    1    3    0    1    0    1    3     2
    [10,]    3    2    1    0    2    1    3    2    3     1

To compare columns 1 and 2, I use

`table(test[,1] == test[,2])`

    FALSE  TRUE
        8     2

So there are two identical elements between these two columns.

I could repeat this for all pairs of columns using two nested for loops and then take the maximum number of TRUE values, but that does not look nice. Can anyone think of a better way?

Cheers,

Maik

Answer

Try:

```
max(combn(split(test, col(test)), 2, function(x) sum(x[[1]] == x[[2]])))
```
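To see what the one-liner does, here is a sketch that unpacks it into steps. `split(test, col(test))` turns the matrix into a list of column vectors, `combn(..., 2, FUN)` applies the function to every unordered pair of columns, and `max()` takes the largest count. The small matrix `m` is a made-up example, not the data from the question.

```r
# Hypothetical 3 x 3 example; matrix() fills column by column,
# so the columns are (1,2,3), (1,2,0) and (0,2,3)
m <- matrix(c(1, 2, 3,
              1, 2, 0,
              0, 2, 3), nrow = 3)

# One vector per column of m
cols <- split(m, col(m))

# Count of matching positions for each pair of columns;
# pairs are visited in the order (1,2), (1,3), (2,3)
matches <- combn(cols, 2, function(x) sum(x[[1]] == x[[2]]))

matches       # 2 2 1
max(matches)  # 2
```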

If you also want to know which pair of columns has the greatest number of equal elements, it's a little more complicated, because you have to keep track of the column indices as well as the counts.
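One possible way to get the pair itself, as a rough sketch rather than the only approach: run `combn()` on the column indices instead of the columns, so the indices stay available next to the counts. The names `pairs`, `counts` and `best` are made up for this example, and the `set.seed(1)` call is only an assumption to make the random matrix reproducible.

```r
# Reproducible stand-in for the random matrix in the question
# (the seed is an assumption, not part of the original post)
set.seed(1)
test <- replicate(10, sample(0:3, 10, replace = TRUE))

# All unordered pairs of column indices, one pair per column of `pairs`
pairs <- combn(ncol(test), 2)

# Number of matching positions for each pair
counts <- apply(pairs, 2, function(p) sum(test[, p[1]] == test[, p[2]]))

# Column indices of every pair that reaches the maximum (ties are kept)
best <- pairs[, counts == max(counts), drop = FALSE]

best
max(counts)
```

Each column of `best` is one pair of column indices of `test`. If several pairs tie for the maximum, they are all returned.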