ewhai - 4 months ago 67

R Question

I made a plot of the correlations between terms in a text-mining project.

I would like to put the correlation value beside each line, like in the image below.

What should I add to the `plot()` call? `text()`? Or is there some other option for this?

`freq.terms<-findFreqTerms(dtm, lowfreq=500)[1:25]`

`plot(dtm, term=freq.terms, corThreshold=0.25, weighting=T)`

Answer

Here's where I'm at. The main idea is to make a list of edge attributes that we can pass into `plot`.

```
# Install Rgraphviz from Bioconductor (one-time setup). The old biocLite.R
# script has been retired; BiocManager is its replacement.
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("Rgraphviz")

library(tm)
library(graph)
library(igraph)

data("acq")
dtm <- DocumentTermMatrix(acq,
                          control = list(weighting = function(x) weightTfIdf(x, normalize = FALSE),
                                         stopwords = TRUE))
freq.terms <- findFreqTerms(dtm, lowfreq = 10)[1:25]
# Not used below; handy for inspecting the associations directly
assocs <- findAssocs(dtm, term = freq.terms, corlimit = 0.25)

# Recreate edges, using code from plot.DocumentTermMatrix
corThreshold <- 0.25
m <- as.matrix(dtm[, freq.terms])
cor_mat <- cor(m)                     # avoid naming this `c`, which shadows base::c
cor_mat[cor_mat < corThreshold] <- 0  # drop weak correlations
cor_mat[is.na(cor_mat)] <- 0
diag(cor_mat) <- 0                    # no self-loops
ig <- graph_from_adjacency_matrix(cor_mat, mode = "undirected", weighted = TRUE)
g1 <- as_graphnel(ig)

# Make edge labels: one weight per edge, named by edge ("a~b")
ew <- as.character(unlist(edgeWeights(g1)))
ew <- ew[setdiff(seq(along = ew), Rgraphviz::removedEdges(g1))]
names(ew) <- edgeNames(g1)

eAttrs <- list()
elabs <- paste(" ", round(as.numeric(ew), 2)) # so it doesn't print on top of the edge
names(elabs) <- names(ew)
eAttrs$label <- elabs

fontsizes <- rep(7, length(elabs))
names(fontsizes) <- names(ew)
eAttrs$fontsize <- fontsizes

plot(dtm, term = freq.terms, corThreshold = 0.25, weighting = TRUE,
     edgeAttrs = eAttrs)
```
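To see what the thresholding step and the named label vector look like in isolation, here is a minimal base-R sketch on a made-up three-term matrix. The toy data and the names `cor_threshold`, `edge_names`, and `edge_labels` are assumptions for illustration only. The `"a~b"` naming mimics what `edgeNames()` returns for a `graphNEL`, but the node order inside each name may differ from what the graph package produces.

```r
# Toy document-term matrix: 6 documents (rows) by 3 terms (columns).
# The values are invented purely for illustration.
m <- cbind(
  oil   = c(1, 2, 3, 4, 5, 6),
  price = c(2, 4, 5, 8, 9, 12),
  bank  = c(6, 1, 4, 2, 5, 3)
)

cor_threshold <- 0.25
cor_mat <- cor(m)
cor_mat[cor_mat < cor_threshold] <- 0  # weak or negative correlations become "no edge"
cor_mat[is.na(cor_mat)] <- 0
diag(cor_mat) <- 0                     # a term is not linked to itself

# In an undirected graph each edge appears once, so read the upper triangle only.
idx <- which(upper.tri(cor_mat) & cor_mat > 0, arr.ind = TRUE)
edge_names <- paste(rownames(cor_mat)[idx[, "row"]],
                    colnames(cor_mat)[idx[, "col"]],
                    sep = "~")

# Named character vector: names identify the edge, values are the padded labels.
edge_labels <- setNames(paste(" ", round(cor_mat[idx], 2)), edge_names)
print(edge_labels)
```

Only the `oil`/`price` pair survives the threshold here, so the result is a single label keyed by that edge. The same shape (one named entry per edge) is what the `label` and `fontsize` elements of the attribute list need.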

The main remaining problem is that the plot prints the edge labels twice: once apparently using default settings, and once using the font size that we specified in `eAttrs`.

**Edit.** It seems that in order to get the labels to render correctly, we can't use `plot` directly. Using `renderGraph` (which `plot` calls) seems to work. We make a numeric vector of the edge weights and pass it to `edgeRenderInfo` as the `lwd` argument. You'll have to adjust the manual offset for the labels (`elabs <- paste(" ", ...)`) so that the labels sit at the right distance from the edges.

```
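library(Rgraphviz)  # provides layoutGraph() and renderGraph()

# Numeric edge weights, named by edge, to drive the line widths
weights <- as.numeric(ew)
names(weights) <- names(ew)

edgeRenderInfo(g1) <- list(label=elabs, fontsize=fontsizes, lwd=weights*5)
nodeRenderInfo(g1) <- list(shape="box", fontsize=20)

# Compute the layout first, then draw
g1 <- layoutGraph(g1)
renderGraph(g1)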
```
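Since the label offset and the `weights*5` scaling both need manual tuning, one option is to wrap them in a small helper. The sketch below is only a suggestion: `make_edge_style` and its `pad`, `digits`, and `lwd_range` parameters are hypothetical names, not part of tm, graph, or Rgraphviz. Rescaling the widths to a fixed range is also a swap for the plain `weights*5` multiplier used above, not the original method.

```r
# Hypothetical helper (not from any package): build padded labels and
# rescaled line widths from a named numeric vector of edge weights.
make_edge_style <- function(weights, pad = 2, digits = 2, lwd_range = c(1, 5)) {
  stopifnot(is.numeric(weights), !is.null(names(weights)))

  # Leading spaces push the label away from the edge; tune `pad` by eye.
  labels <- paste0(strrep(" ", pad), round(weights, digits))
  names(labels) <- names(weights)

  # Map the weights linearly onto lwd_range; fall back to the midpoint
  # when all weights are equal to avoid dividing by zero.
  span <- diff(range(weights))
  if (span == 0) {
    lwd <- rep(mean(lwd_range), length(weights))
  } else {
    lwd <- lwd_range[1] + (weights - min(weights)) / span * diff(lwd_range)
  }
  names(lwd) <- names(weights)

  list(label = labels, lwd = lwd)
}

# Usage with made-up weights; the edge names follow the "a~b" convention.
style <- make_edge_style(c("oil~price" = 0.31, "oil~bank" = 0.87))
print(style)

# With a real graph this could then be applied along the lines of:
# edgeRenderInfo(g1) <- c(style, list(fontsize = fontsizes))
```

Keeping the names on both vectors matters, because the render info is matched to edges by name rather than by position.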