I have created a graph in python lib NetorwkX and I want to implement a modularity algorithm in order to cluster the nodes of my graph. I came across the following code:
import matplotlib.pyplot as plt
import networkx as nx
G = nx.Graph()
G = nx.read_weighted_edgelist('graphs/fashionGraph_1.edgelist')
# Find modularity
part = community.best_partition(G)
mod = community.modularity(part,G)
# Plot, color nodes using community structure
values = [part.get(node) for node in G.nodes()]
nx.draw_spring(G, cmap=plt.get_cmap('jet'), node_color = values, node_size=30, with_labels=False)
From the documentation:
Node color. Can be a single color format string, or a sequence of colors with the same length as nodelist. If numeric values are specified they will be mapped to colors using the cmap and vmin,vmax parameters. See matplotlib.scatter for more details.
part = community.best_partition(G) assigns a community to each node -
part is a dict, and
part[node] is the community the node belongs to (each is assigned an integer). Later
values = [part.get(node) for node in G.nodes()] creates a list with the community number for each node in the order the nodes appear in
Then in the plotting command, it will use those community numbers to determine the color of the nodes. All nodes which have been assigned to the same community will be given the same color.
The physical locations of the nodes are assigned by the spring layout. You can see that the spring layout seems to be putting the nodes into positions that suggest some communities that are different from what
community.best_partition finds. This is perhaps mildly surprising, but certainly nothing prevents it. It does make me think that the algorithm you've used doesn't appropriately account for all of the structure in the network. The documentation for
best_partition gives some explanation of the underlying algorithm.