user54101 user54101 - 19 days ago 7
R Question

R igraph: shortest path extraction

This is the first time I am working with graphs and

R
igraph
package and I need some help with processing graph objects.

What I want to achieve:

From a given contact matrix extract shortest confident path between nodes. By confident I mean that edge weights are higher then neighbouring edges.

Examples:

A

m <- read.table(row.names = 1, header = TRUE, text =
" A B C D E F
A 0 1 1 1 1 5
B 1 0 1 1e2 1e2 1
C 1 1 0 1 1 1
D 1 1e2 1 0 1e2 1
E 1 1e2 1 1e2 0 1
F 5 1 1 1 1 0")
m <- as.matrix(m)
ig <- graph.adjacency(m, mode = "undirected", weighted = TRUE, diag = FALSE)
sp <- shortest.paths(ig, algorithm = "dijkstra")


In matrix
m
there is one cluster (clique?) between
B-D-E
(ie., egde weights between those nodes are high). However, as there is weight between
A
and
F
I am also getting cluster there, even though edge weight is low (only 5).

Question A: How to extract only those clusters that have high edge weight? I can transform those contacts to 0 with
m[which(m <= 5)] <- 0
, but I hope that there is more "mathy" solution for this in
igraph
package.

B

m <- read.table(row.names = 1, header = TRUE, text =
" A B C D E F
A 0 1 1 5 1 1
B 1 0 1 1e2 1e2 1
C 1 1 0 1 1 1
D 5 1e2 1 0 1e2 1
E 1 1e2 1 1e2 0 1
F 1 1 1 1 1 0")
m <- as.matrix(m)
ig <- graph.adjacency(m, mode = "undirected", weighted = TRUE, diag = FALSE)
sp <- shortest.paths(ig, algorithm = "dijkstra")


In matrix
m
there is cluster between
B-D-E
, but as there is low weight between
A
and
B
-
A
is also connected to this cluster.

Question B: How to not assign nodes to a cluster if edge weight is low?

This is my first question here, if you need clarification or better examples I will improve my questions.

Answer

First, it is good to know that when looking up paths, igraph understands weights as costs, i.e. on edges with higher weight it costs more to travel, so it will consider shorter the paths with lower sum weight. It is easy to turn this into the opposite, just take the reciprocal of your weights (1 / E(ig)$weight). Between 2 vertices there might be only one shortest path, but sometimes there are more equally short paths. You can look up all of them (all_shortest_paths), or tell igraph to return only one of the shortests for each pairs of vertices (shortest_paths). Each call of these methods returns the paths from one selected vertex, to have the paths between all pairs, you need to call these once for each vertex (ok, at an undirected graph, it is enough to call for half of the vertices). To formulate what I explained until this point:

spaths <- lapply(V(ig),
                 function(v){
                     all_shortest_paths(ig, v,
                                        weight = 1 / E(ig)$weight
                     )
                 }
           )

Here spaths will be a list of lists, access the paths from C to all the vertices like this:

spaths$C$res

[[1]]
+ 2/6 vertices, named:
[1] C A
[[2]]
+ 2/6 vertices, named:
[1] C B
[[3]]
+ 1/6 vertex, named:
[1] C
[[4]]
+ 2/6 vertices, named:
[1] C D
[[5]]
+ 2/6 vertices, named:
[1] C E
[[6]]
+ 2/6 vertices, named:
[1] C F

spaths$C$res[[2]] # this is the path from `C` to `B`,
                  # a vector of 2 vertices

Note, the third element is actually from C to itself, you can either ignore it, or provide a vector of all other vertices to the to parameter of all_shortest_paths. Also, in your example all shortest paths will be of length 1, but if I set for example the weight of B--E to 1 instead of 100, we see that the method works, and from B to E the shortest path will be B-D-E.

Regarding your second question, here it is not completely clear what do you want to achieve, especially how do you get these clusters? If you want to find communities, i.e. more closely connected group of vertices, taking into account also the edge weights, there are many methods for this, all those named cluster_[...] or community.[...] in igraph. For example, if we run the fastgreedy method on your graph, it will detect the cluster you mentioned:

fg <- fastgreedy.community(ig, weights = E(ig)$weight)

IGRAPH clustering fast greedy, groups: 2, mod: 0.059
+ groups:
$`1`
[1] "A" "C" "F"
$`2`
[1] "B" "D" "E"

So here we have the B, D, E cluster, what is connected with higher weight edges. If we run the same method without weights, all the vertices will belong to one group (fastgreedy.community(ig, weights = NULL)). Note, at community detection, igraph understands weights as strength, so vertices connected with higher weight edges more likely to cluster together, this is kind of opposite like how it works at calculating paths.