sp2 sp2 - 2 months ago 7
R Question

Comparing multiple rows and creating a matrix in R or in Excel

I have a file containing, multiple rows as follows

In file1:

a 8|2|3|4 4
b 2|3|5|6|7 5
c 8|5|6|7|9 5


a to a has 4 overlaps, similarly a to b had 2 overlaps, so to check the overlaps between various entity,I need to generate a matrix with the above details, and the output should be a matrix like

a b c
a 4 2 1
b 2 5 3
c 1 3 5


Please give me a suggestion, how to do this? this there anyway to do this using excel or using a shell script or using R? I have written this following code but since I am not a good coder, cudn't get the output printed in a right format.

setwd('C:\\Users\\Desktop\\')
newmet1<-file("file.txt")
newmet2<-strsplit(readLines(newmet1),"\t")
Newmet<-sapply(newmet2, function(x) x[2:length(x)], simplify=F )

for (i in 1:length(Newmet))
{
for (j in 1:length(Newmet)
{
c <- ((intersect(Newmet[[i]], Newmet[[j]]))
print (length(c))
}
}

Answer

If the numbers are in separate cells starting in Sheet1!A1, try

=SUM(--ISNUMBER(MATCH(Sheet1!$A3:$E3,INDEX(Sheet1!$A$1:$E$3,COLUMN(),0),0)))

starting at Sheet2!A1.

Must be entered as an array formula using CtrlShiftEnter

Alternative formula that doesn't have to start at Sheet2!A1

SUM(--ISNUMBER(MATCH(Sheet1!$A1:$E1,INDEX(Sheet1!$A$1:$E$3,COLUMNS($A:A),0),0)))

enter image description here