Dambo Dambo - 2 months ago 11
R Question

How to create a network chart?

I am trying to use networkD3::forceNetwork to create a chart of employers and colleges from which employers hire employees.

Right now, I have something like this:

forceNetwork(Links= Links, Nodes= netDf ,
Source = 'collegeName', Target = 'organizationName', Value='count',
NodeID = 'collegeName', Group = 'organizationName')


But the output doesn't look as expected. What I would like to have is:

  1. One bubble for each college

  2. One bubble for each employer

  3. Colleges connected to employers, with the number of employees (count) mapped to the width of the connection lines.

Colleges are never connected to each other, and the same holds for employers.

This is the dataset I am using, netDf:

structure(list(collegeName = c("college1", "college1", "college2",
"college3", "college3", "college3", "college4", "college5", "college5",
"college6", "college6", "college6", "college7", "college7", "college7",
"college8", "college9", "college10", "college10", "college11"
), organizationName = c("employer2", "employer3", "employer2",
"employer1", "employer2", "employer3", "employer2", "employer2",
"employer3", "employer1", "employer2", "employer3", "employer1",
"employer2", "employer3", "employer2", "employer2", "employer2",
"employer3", "employer2"), count = c(858, 176, 461, 201, 2266,
495, 430, 1992, 290, 127, 1754, 549, 136, 2839, 686, 638, 275,
1388, 387, 188), group = c(2, 3, 2, 1, 2, 3, 2, 2, 3, 1, 2, 3,
1, 2, 3, 2, 2, 2, 3, 2)), .Names = c("collegeName", "organizationName",
"count", "group"), row.names = c(NA, -20L), class = "data.frame")


And this is the Links dataset:

structure(list(collegeName = c(0, 0, 1, 2, 2, 2, 3, 4, 4, 5,
5, 5, 6, 6, 6, 7, 8, 9, 9, 10), organizationName = c(1, 2, 1,
0, 1, 2, 1, 1, 2, 0, 1, 2, 0, 1, 2, 1, 1, 1, 2, 1), count = c(858,
176, 461, 201, 2266, 495, 430, 1992, 290, 127, 1754, 549, 136,
2839, 686, 638, 275, 1388, 387, 188), group = c(2, 3, 2, 1, 2,
3, 2, 2, 3, 1, 2, 3, 1, 2, 3, 2, 2, 2, 3, 2)), .Names = c("collegeName",
"organizationName", "count", "group"), row.names = c(NA, -20L
), class = "data.frame")


Also, would it be possible to map a 4th variable to the bubble size? Say, for instance, that I want to map count to the size of the bubbles for the employers. How can I do that?

Answer

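I think your Links and Nodes data frames do not meet the requirements specified in ?forceNetwork. Nodes needs one row per node, and the Source and Target columns of Links need to be zero-based row positions in Nodes. In your data, netDf has one row per college-employer pair, and the college and employer indices in Links both start at 0, so they point at the same rows. Instead, you could do: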

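library(networkD3)
set.seed(1)

# One row per node: every college and every employer appears exactly once
nodes <- data.frame(Label = unique(c(netDf[,1], netDf[,2])))
# Group by the first three letters of the label ("col" or "emp")
nodes$Group <- as.factor(substr(nodes$Label, 1, 3))
# Attach the total count per employer; colleges have no match and get NA.
# merge() re-sorts the rows by Label, so the link indices are computed afterwards.
nodes <- merge(
  nodes, 
  aggregate(count~organizationName, netDf, sum), 
  by.x="Label", by.y="organizationName", 
  all.x=TRUE
)
# Give colleges a size of 1, so Math.log(d.nodesize) is 0 for them
nodes$count[is.na(nodes$count)] <- 1

# Zero-based row positions of each college and employer in nodes
links <- transform(netDf, 
  Source = match(netDf$collegeName, nodes$Label)-1, 
  Target = match(netDf$organizationName, nodes$Label)-1
)

forceNetwork(
  # Rescale so the smallest link has value 1
  Links = transform(links, count = count/min(count)), 
  Nodes = nodes, 
  Source = 'Source', 
  Target = 'Target', 
  Value='count',
  NodeID = 'Label', 
  Group = "Group", 
  Nodesize = "count",
  legend = TRUE, 
  opacity = 1,
  # Log scale keeps the largest employer from dwarfing the other bubbles
  radiusCalculation = JS("Math.log(d.nodesize)+6")
)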
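
As a quick sanity check before plotting, something like the following could confirm that a links/nodes pair has the shape forceNetwork needs. It is only a sketch: check_links() is a hypothetical helper (not part of networkD3), the "college" prefix test relies on the naming used in this dataset, and the data frame is a four-row excerpt of netDf so the snippet runs on its own.

```r
# Four-row excerpt of netDf from the question, so the sketch is self-contained
netDf <- data.frame(
  collegeName      = c("college1", "college1", "college2", "college3"),
  organizationName = c("employer2", "employer3", "employer2", "employer1"),
  count            = c(858, 176, 461, 201),
  stringsAsFactors = FALSE
)

# One row per node; links hold zero-based row positions into this table
nodes <- data.frame(
  Label = unique(c(netDf$collegeName, netDf$organizationName)),
  stringsAsFactors = FALSE
)
links <- data.frame(
  Source = match(netDf$collegeName, nodes$Label) - 1,
  Target = match(netDf$organizationName, nodes$Label) - 1,
  count  = netDf$count
)

# check_links() is a hypothetical helper, not a networkD3 function
check_links <- function(links, nodes) {
  idx <- c(links$Source, links$Target)
  is_college <- grepl("^college", nodes$Label)
  stopifnot(
    anyDuplicated(nodes$Label) == 0,     # each node is listed once
    !anyNA(idx),                         # every endpoint was found in nodes
    all(idx >= 0 & idx < nrow(nodes)),   # indices are zero-based and in range
    all(is_college[links$Source + 1]),   # sources are colleges
    !any(is_college[links$Target + 1])   # targets are employers
  )
  invisible(TRUE)
}

check_links(links, nodes)
```

stopifnot() raises an error that names the first condition that fails, which makes it easier to see which requirement a data frame is missing.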

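
To get a feel for the sizes the call above produces, the scaling can be reproduced in plain R. This is a rough sketch under two assumptions: that linkWidth is left at what I understand to be its default in forceNetwork, the square root of the link value, and that four rows of netDf are enough to illustrate the arithmetic. The radius formula is the one passed to radiusCalculation.

```r
# Four-row excerpt of netDf from the question
netDf <- data.frame(
  collegeName      = c("college1", "college1", "college2", "college3"),
  organizationName = c("employer2", "employer3", "employer2", "employer1"),
  count            = c(858, 176, 461, 201),
  stringsAsFactors = FALSE
)

# Link value as passed to forceNetwork: the smallest count becomes 1
value <- netDf$count / min(netDf$count)
# Assumed default linkWidth, Math.sqrt(d.value), reproduced in R
width <- sqrt(value)

# Node size: summed count per employer; a college is set to 1
nodesize <- c(tapply(netDf$count, netDf$organizationName, sum), college = 1)
# R equivalent of radiusCalculation = JS("Math.log(d.nodesize)+6")
radius <- log(nodesize) + 6

round(width, 2)
round(radius, 2)
```

Because log(1) is 0, every college ends up with a radius of 6, while the employers grow only slowly with their total count. Without the log, the largest employer would be far bigger than everything else on the chart.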
