Ozgur Alptekın - 1 year ago
R Question

# How can I calculate the cosine similarity between the first row of my matrix and each of the other rows in R?

This is `my_matrix`:

``````
   ui 194635691 194153563 177382028 177382031 195129144 196972549 196258704   194907960 196950156 194139014 153444738 192982501 192891196
1 237      0.00      0.00      0.00      0.00      0.00      0.00         0      0.01         0         0         0         0         0
2 261      0.01      0.00      0.00      0.00      0.00      0.00         0      0.00         0         0         0         0         0
3 290      0.00      0.00      0.01      0.01      0.00      0.00         0      0.00         0         0         0         0         0
4 483      0.00      0.00      0.00      0.00      0.00      0.01         0      0.00         0         0         0         0         0
5 533      0.00      0.01      0.00      0.00      0.00      0.00         0      0.00         0         0         0         0         0
6 534      0.00      0.00      0.00      0.00      0.01      0.00         0      0.00         0         0         0         0         0
``````

This is my code:

``````
b=my_matrix[1,2:length(my_matrix)]

for (i in nrow(my_matrix)) {
res[i]=cosine(b,my_matrix[i,2:length(my_matrix)])
}
``````

I used the "lsa" package. I want to compute the cosine similarity between the vector `b` and every other row of the matrix, but my code throws an error that says:

``````
argument mismatch. Either one matrix or two vectors needed as input.
``````

What should I do to fix my problem?

Package "lsa", which is not available for R version 3.2.2, is not really necessary. Just compute it yourself, using the definition of cosine similarity:

``````
my_matrix <- as.matrix(my_matrix)  # Make sure that "my_matrix" is indeed a "matrix".
v <- as.vector(my_matrix[1,-1])
M <- my_matrix[-1,-1]
cosSim <- ( M %*% v ) / sqrt( sum(v*v) * rowSums(M*M) )
``````

The first line is only necessary if `my_matrix` is not yet a `matrix` but a `data.frame`.
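As a quick sanity check, the sketch below applies the same formula to a small made-up matrix and compares it with a row-by-row computation. The values and column names are illustrative only, not the data from the question.

```r
# Toy data: the first column is an id ("ui"), the rest are feature columns.
# All values are made up for illustration.
my_matrix <- rbind(
  c(237, 1, 0, 2),
  c(261, 2, 0, 4),   # parallel to row 1
  c(290, 0, 3, 0),   # orthogonal to row 1
  c(483, 1, 1, 0)
)
colnames(my_matrix) <- c("ui", "a", "b", "c")

v <- as.vector(my_matrix[1, -1])       # first row without the id column
M <- my_matrix[-1, -1, drop = FALSE]   # remaining rows without the id column

# Vectorised form: dot products divided by the product of the norms.
cosSim <- (M %*% v) / sqrt(sum(v * v) * rowSums(M * M))

# The same quantity computed one row at a time, as a cross-check.
cosSimLoop <- apply(M, 1, function(r) sum(r * v) / sqrt(sum(r * r) * sum(v * v)))

print(cbind(vectorised = as.vector(cosSim), loop = cosSimLoop))
```

Note that a row consisting only of zeros has norm 0, so its similarity comes out as `NaN` (0/0) with either variant.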

A possible explanation for the original error message shown in the question:

I guess that the object `my_matrix` used in the question's code, which caused the error message

`argument mismatch. Either one matrix or two vectors needed as input.`

was a `data.frame`, not a `matrix`. If so, the arguments `b` and `my_matrix[i,2:length(my_matrix)]` in the call of the `cosine` function are again data.frames, not the two vectors that the function expects.
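The difference shows up on a small made-up example: indexing a single row of a `data.frame` yields another `data.frame`, while the same index on a `matrix` yields a plain numeric vector. The data and the names `df`, `m`, `row_df`, `row_m` and `row_flat` are illustrative only.

```r
# Illustrative data, not the data from the question.
df <- data.frame(ui = c(237, 261), x = c(0.01, 0), y = c(0, 0.01))
m  <- as.matrix(df)

row_df <- df[1, -1]   # one row of a data.frame: still a data.frame
row_m  <- m[1, -1]    # one row of a matrix: dimensions are dropped

print(class(row_df))        # "data.frame"
print(is.vector(row_m))     # TRUE

# If the data has to stay a data.frame, the row can be flattened explicitly.
row_flat <- unlist(df[1, -1])
print(is.vector(row_flat))  # TRUE
```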

As an aside:

Even if `my_matrix` is coerced to a `matrix`, the code in the question will throw an error message, since `length(my_matrix)` is then the total number of elements, which is larger than the number of columns, and hence `my_matrix[i,2:length(my_matrix)]` selects undefined columns. The `i`-th row of `my_matrix` without the first column is `my_matrix[i,2:ncol(my_matrix)]` or, shorter, `my_matrix[i,-1]`.
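To make this concrete, here is a minimal sketch on made-up data. It also shows a second issue in the question's loop: `for (i in nrow(my_matrix))` runs only once, for the last row, whereas `seq_len(nrow(my_matrix))` visits every row. The helper `cos_sim` is a hypothetical stand-in for `lsa::cosine`, so that the sketch runs without extra packages.

```r
# Illustrative data, not the data from the question.
my_matrix <- rbind(
  c(237, 1, 0, 2),
  c(261, 2, 0, 4),
  c(290, 0, 3, 0)
)

print(length(my_matrix))  # 12: number of cells (rows * columns)
print(ncol(my_matrix))    # 4:  number of columns

# Hypothetical helper standing in for lsa::cosine on two vectors.
cos_sim <- function(x, y) sum(x * y) / sqrt(sum(x * x) * sum(y * y))

b   <- my_matrix[1, -1]
res <- numeric(nrow(my_matrix))          # pre-allocate the result vector
for (i in seq_len(nrow(my_matrix))) {    # visits rows 1, 2, 3
  res[i] <- cos_sim(b, my_matrix[i, -1])
}
print(res)
```

For a `data.frame`, by contrast, `length()` returns the number of columns, which is why `2:length(my_matrix)` only becomes a problem after the coercion to a `matrix`.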
