WAW WAW - 1 month ago 23
R Question

How to calculate Jaccard similarity between two data frames in R

I have two binary data frames (values in c(0,1)), and I couldn't find any method that calculates the Jaccard similarity coefficient between the two data frames. I have only seen methods that do this calculation between the columns of a single data frame.

Let's say DF1 is:


DF1 <- data.frame(a = c(0, 0, 1, 0),
                  b = c(1, 0, 1, 0),
                  c = c(1, 1, 1, 1))


and DF2 is:

DF2 <- data.frame(a = c(0, 0, 0, 0),
                  b = c(1, 0, 1, 0),
                  c = c(1, 0, 1, 1))


What I am looking for is a single Jaccard similarity coefficient between the two data frames (not column by column).

Could you help me with this?

Answer

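You can use dist with the "binary" method, which computes the Jaccard distance between the two flattened data frames: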

dist(t(cbind(unlist(DF1), unlist(DF2))), "binary")
# 0.2857143

Since this is a distance, the Jaccard similarity is 1 - 0.2857143 = 0.7142857. The distance would be 1 for DF2 <- as.data.frame(xor(DF1, 1) + 0L) and 0 for DF2 <- DF1.
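
As a cross-check on the dist result, the similarity can also be computed directly from the definition: the number of cells that are 1 in both data frames, divided by the number of cells that are 1 in at least one. This is a minimal sketch, assuming both data frames have the same dimensions and contain only 0/1 values; jaccard_similarity is a hypothetical helper name, not a base R function, and returning NA when both inputs are all zeros is a choice made here for illustration.

```r
# Hypothetical helper (not part of the original answer): Jaccard similarity
# between two binary data frames of the same shape, treated as flat vectors.
jaccard_similarity <- function(df1, df2) {
  stopifnot(identical(dim(df1), dim(df2)))
  x <- unlist(df1) == 1          # cells that are "on" in df1
  y <- unlist(df2) == 1          # cells that are "on" in df2
  union_n <- sum(x | y)          # cells on in at least one data frame
  if (union_n == 0) {
    return(NA_real_)             # assumption: undefined when nothing is on
  }
  sum(x & y) / union_n           # cells on in both / cells on in either
}

DF1 <- data.frame(a = c(0, 0, 1, 0), b = c(1, 0, 1, 0), c = c(1, 1, 1, 1))
DF2 <- data.frame(a = c(0, 0, 0, 0), b = c(1, 0, 1, 0), c = c(1, 0, 1, 1))

sim <- jaccard_similarity(DF1, DF2)
sim       # 5/7, about 0.7142857
1 - sim   # 2/7, about 0.2857143, matching the dist(..., "binary") value
```

Here 5 cells are 1 in both data frames and 7 cells are 1 in at least one, which gives 5/7.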
