disposedtrolley - 1 year ago 62

R Question

I have a dataset similar to the following:

    AuthorID ThreadID
    1        A
    2        A
    1        A
    2        A
    2        C
    3        B
    1        C
    4        B
    4        C
    4        C

where each row records an `AuthorID` posting in a `ThreadID`.

I'm after a weighted adjacency matrix in R which I can use with igraph, that shows the number of times a particular `AuthorID` appears with another `AuthorID` in the same `ThreadID`, with rows and columns indexed by `AuthorID`:

      1 2 3 4
    1 . 3 0 1
    2 . . 0 1
    3 . . . 1
    4 . . . .

Thanks in advance!

Answer Source

Here's a solution using base R functions. First, your sample data in an easily copy/paste-able format:

```
# Read the sample data from an inline string; the first line is the header
dd <- read.table(text = "AuthorID ThreadID
1 A
2 A
1 A
2 A
2 C
3 B
1 C
4 B
4 C
4 C
", header = TRUE)
```

Then you can do

```
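# Thread-by-author table; unique() keeps one row per author/thread pair
x <- xtabs(~ ThreadID + AuthorID, unique(dd))
# t(x) %*% x gives the author-by-author co-occurrence counts
mm <- crossprod(x)
# Blank out the diagonal and the lower triangle
mm[lower.tri(mm, diag = TRUE)] <- NA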
```

to get

```
        AuthorID
AuthorID  1  2  3  4
       1 NA  2  0  1
       2 NA NA  0  1
       3 NA NA NA  1
       4 NA NA NA NA
```
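To see what each step of the snippet above produces, here is a small sketch that prints the intermediate tables. It rebuilds the sample data with `data.frame` so it can run on its own; the names `shared` and `raw` are only illustrative and do not appear in the original answer.

```r
# Same sample data as above, built directly so the block is self-contained.
dd <- data.frame(
  AuthorID = c(1, 2, 1, 2, 2, 3, 1, 4, 4, 4),
  ThreadID = c("A", "A", "A", "A", "C", "B", "C", "B", "C", "C")
)

# Thread-by-author incidence table: 1 if the author posted in the thread.
x <- xtabs(~ ThreadID + AuthorID, unique(dd))
print(x)

# t(x) %*% x: entry [i, j] is the number of threads shared by authors i and j.
shared <- crossprod(x)
print(shared)

# For comparison (illustrative only): skipping unique() lets repeated posts
# inflate the counts.
raw <- crossprod(xtabs(~ ThreadID + AuthorID, dd))
print(raw["1", "2"])
```

On the sample data, authors 1 and 2 both appear in threads A and C, so `shared["1", "2"]` is 2, while the count without `unique` for the same pair is 5.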

We use `xtabs` to count occurrences. We make sure to use `unique` so we don't count an author on a thread twice (to agree with your desired output). Then we use `crossprod` to get the author-author frequencies from the author-thread table. Finally we use `lower.tri` to get rid of the lower triangle, as per your desired output.
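Since the question mentions igraph, here is a rough sketch of how the counts might be handed over. It assumes the igraph package is installed and uses its `graph_from_adjacency_matrix` constructor; this step is not part of the original answer. It also assumes the full count matrix with zeros is a better input than the `NA`-masked one, so the masking step is skipped. The names `adj` and `g` are illustrative.

```r
# Same sample data as above, rebuilt so the block is self-contained.
dd <- data.frame(
  AuthorID = c(1, 2, 1, 2, 2, 3, 1, 4, 4, 4),
  ThreadID = c("A", "A", "A", "A", "C", "B", "C", "B", "C", "C")
)

# Author-by-author co-occurrence counts (symmetric, zeros where no overlap).
x <- xtabs(~ ThreadID + AuthorID, unique(dd))
adj <- crossprod(x)

# Convert to a plain numeric matrix with simple row/column names.
adj <- matrix(as.numeric(adj), nrow = nrow(adj),
              dimnames = list(colnames(x), colnames(x)))

# Assumption: igraph is available. Only the upper triangle is read,
# the diagonal is ignored, and non-zero counts become edge weights.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_adjacency_matrix(
    adj, mode = "upper", weighted = TRUE, diag = FALSE
  )
  print(igraph::E(g)$weight)
}
```

The `requireNamespace` guard just keeps the sketch from failing where igraph is missing; in a real script you would likely call `library(igraph)` instead.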