I'm new to networkx and could do with some help please.
I have a set of data which I've processed to generate the nodes and edges. There are around 5000 groups of nodes that have more than 2 links within them (up to 10 nodes in the group in total). But the problem is that there are also several thousand pairs of nodes that only have 1 edge between them, i.e node a is linked to node b but neither are linked to any other node.
I want to remove these paired nodes from the chart.
Is there a way to filter these out?
So our goal is to remove all nodes from components with less than 3 nodes (this includes isolated nodes if they exist).
for component in list(nx.connected_components(G)): if len(component)<3: for node in component: G.remove_node(node)
A small warning is in order when using
nx.connected_components. It returns a generator of components. If I didn't put
list around it, it would generate one at a time, and then perform the steps for the given component. Once all that is done, it would generate the next component. But because
G has been modified, python can't be sure that this behaves well. So it would die (complaining that a dictionary changed size --- the number of nodes in
G changed). By turning it into a list, the components are all found before it starts the loop. So the graph won't be changing while the components are being found.