pyrookie - 5 months ago 37

Python Question

I have a >10k list of (unordered) pairs of numbers. I'd like to classify them into sets of connected pairs either directly or indirectly. I think this corresponds to undirected graph. I'm using python, and tried something like this to represent this structure.

In order to know all the numbers connected to

`i`

I can examine whether there is a path from

`i`

`j`

`j`

`i`

Answer

It sounds as though you are interested in computing the connected components of a graph. I would suggest looking into the networkx package and its tools for computing components.

For example, suppose our data is a list of pairs of numbers, each pair representing an edge in the graph:

```
pairs = [
(1, 2),
(2, 4),
(3, 5),
(2, 5),
(7, 9),
(9, 10),
(8, 7)
]
```

In the graph represented by these edges, there is a path between any pair of nodes in the set `{1, 2, 3, 4, 5}`

, and there is also a path between any pair of nodes in `{6, 7, 8, 9, 10}`

. But there is no path, say, from `5`

to `7`

. This is to say that there are two connected components in the graph.

To discover these components, we first import `networkx`

and create a graph:

```
>>> import networkx as nx
>>> graph = nx.from_edgelist(pairs)
```

Computing the components is as simple as

```
>>> list(nx.connected_components(graph))
>>> [{1, 2, 3, 4, 5}, {6, 7, 8, 9, 10}]
```

`nx.connected_components`

is a generator, and so here we converted the result into a list in order to show all of the connected components.

We can also find the connected component containing a given node:

```
>>> nx.node_connected_component(graph, 3)
{1, 2, 3, 4, 5}
```

We can also quickly count the number of connected components:

```
>>> nx.number_connected_components(graph)
2
```