snake plissken - 3 months ago 70
Python Question

# Graph modularity in python networkx

I have created a graph in python lib NetorwkX and I want to implement a modularity algorithm in order to cluster the nodes of my graph. I came across the following code:

``````import community
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()

nx.transitivity(G)

# Find modularity
part = community.best_partition(G)
mod = community.modularity(part,G)

# Plot, color nodes using community structure
values = [part.get(node) for node in G.nodes()]
nx.draw_spring(G, cmap=plt.get_cmap('jet'), node_color = values, node_size=30, with_labels=False)
plt.show()
``````

My graph has 4267 and 3692 edges. The resulting plot is this:

I am a bit confused on how the nodes of the graph is clustered. Which are exactly the logic of the colors?

`part = community.best_partition(G)` assigns a community to each node - `part` is a dict, and `part[node]` is the community the node belongs to (each is assigned an integer). Later `values = [part.get(node) for node in G.nodes()]` creates a list with the community number for each node in the order the nodes appear in `G.nodes()`.
The physical locations of the nodes are assigned by the spring layout. You can see that the spring layout seems to be putting the nodes into positions that suggest some communities that are different from what `community.best_partition` finds. This is perhaps mildly surprising, but certainly nothing prevents it. It does make me think that the algorithm you've used doesn't appropriately account for all of the structure in the network. The documentation for `best_partition` gives some explanation of the underlying algorithm.