Manus - 2 months ago
R Question

How to create an adjacency matrix from raw data which is non-numeric in nature

An example of the input I am working on is given below:

User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1


The above data is a record of music listened to by users of an app.

I would like to generate an adjacency matrix like the example below:

          ARTIST 1 ARTIST 2 ARTIST 3 ARTIST 4 ARTIST 5 ARTIST 6 ARTIST 7
USER ID 1        0        1        0        0        1        0        0
USER ID 2        1        0        0        0        0        0        0
USER ID 3        1        0        0        0        0        0        1
USER ID 4        0        1        0        0        0        0        0
USER ID 5        0        0        1        0        0        0        0


How can this be done in R? Any tips or pointers would be most appreciated.

Thank you in advance for your time and help.

Answer

This works:

# Parse the raw records into a two-column data frame (V1 = user, V2 = artist)
ContingencyTable <- read.table(text = gsub(pattern = " --- ", replacement = ",", "User ID 1 --- Artist 5
User ID 2 --- Artist 1
User ID 3 --- Artist 7
User ID 4 --- Artist 2
User ID 5 --- Artist 3
User ID 1 --- Artist 2
User ID 3 --- Artist 1"), sep = ",", stringsAsFactors = FALSE)
# Add an indicator column: 1 for every observed user/artist pair
ContingencyTable$Val <- 1
# More or less lifted from Arun's answer linked by @Hong Ooi, above
# (requires the reshape2 package to be installed)
adjMat <- reshape2::dcast(ContingencyTable, V1 ~ V2, value.var = "Val", fill = 0)
# Move the user labels from the first column into the row names
rownames(adjMat) <- adjMat[, 1]
adjMat <- adjMat[, -1]

adjMat
          Artist 1 Artist 2 Artist 3 Artist 5 Artist 7
User ID 1        0        1        0        1        0
User ID 2        1        0        0        0        0
User ID 3        1        0        0        0        1
User ID 4        0        1        0        0        0
User ID 5        0        0        1        0        0
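
Note that the result above only has columns for artists that actually appear in the data, so Artist 4 and Artist 6 from the desired output are missing. If you need those empty columns too, one possible approach is a base R sketch using table() with explicit factor levels. This is not part of the linked answer; the names records, counts and adjMat2 are made up for illustration, and the catalogue "Artist 1" to "Artist 7" is an assumption taken from the example in the question.

```r
# Sample records from the question, entered directly as a data frame
# (hypothetical object name; the original answer parses the raw text instead).
records <- data.frame(
  user   = paste("User ID", c(1, 2, 3, 4, 5, 1, 3)),
  artist = paste("Artist", c(5, 1, 7, 2, 3, 2, 1)),
  stringsAsFactors = FALSE
)

# Assumption: the full artist list is Artist 1..7, as in the question.
# Declaring it as factor levels keeps columns for artists nobody listened to.
records$artist <- factor(records$artist, levels = paste("Artist", 1:7))

# table() counts each user/artist pair; unclass() gives a plain integer matrix.
counts <- table(records$user, records$artist)

# Collapse counts to 0/1 so repeated plays of the same artist still give 1.
adjMat2 <- 1L * (unclass(counts) > 0)

adjMat2
```

Since adjMat2 is a plain matrix rather than a data frame, it can be passed straight to matrix functions such as crossprod() if you later want artist-by-artist co-occurrence counts.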