LanceHAOH LanceHAOH - 3 months ago 6
C++ Question

Why get minimum vertex in Prim's MST algorithm?

From what I understand, Prim's MST algorithm will traverse all vertices in the graph choosing ONE best edge to travel to each vertex. Hence,each iteration will choose an optimal cost for each neighbouring vertex. Hence, no matter which vertex is used first, the end result should be the same, as the optimal costs are chosen even before the selection of the next vertex.

As such, I do not understand why the algorithm has to choose the vertex which has the least cost in every iteration. To make my description clearer, I have included sample code and diagram from geeksforgeeks.org:

Source: www.geeksforgeeks.org

// A C / C++ program for Prim's Minimum Spanning Tree (MST) algorithm.
// The program is for adjacency matrix representation of the graph

#include <stdio.h>
#include <limits.h>

// Number of vertices in the graph
#define V 5

// A utility function to find the vertex with minimum key value, from
// the set of vertices not yet included in MST
int minKey(int key[], bool mstSet[])
{
// Initialize min value
int min = INT_MAX, min_index;

for (int v = 0; v < V; v++)
if (mstSet[v] == false && key[v] < min)
min = key[v], min_index = v;

return min_index;
}

// A utility function to print the constructed MST stored in parent[]
int printMST(int parent[], int n, int graph[V][V])
{
printf("Edge Weight\n");
for (int i = 1; i < V; i++)
printf("%d - %d %d \n", parent[i], i, graph[i][parent[i]]);
}

// Function to construct and print MST for a graph represented using adjacency
// matrix representation
void primMST(int graph[V][V])
{
int parent[V]; // Array to store constructed MST
int key[V]; // Key values used to pick minimum weight edge in cut
bool mstSet[V]; // To represent set of vertices not yet included in MST

// Initialize all keys as INFINITE
for (int i = 0; i < V; i++)
key[i] = INT_MAX, mstSet[i] = false;

// Always include first 1st vertex in MST.
key[0] = 0; // Make key 0 so that this vertex is picked as first vertex
parent[0] = -1; // First node is always root of MST

// The MST will have V vertices
for (int count = 0; count < V-1; count++)
{
// Pick thd minimum key vertex from the set of vertices
// not yet included in MST
int u = minKey(key, mstSet);

// Add the picked vertex to the MST Set
mstSet[u] = true;

// Update key value and parent index of the adjacent vertices of
// the picked vertex. Consider only those vertices which are not yet
// included in MST
for (int v = 0; v < V; v++)

// graph[u][v] is non zero only for adjacent vertices of m
// mstSet[v] is false for vertices not yet included in MST
// Update the key only if graph[u][v] is smaller than key[v]
if (graph[u][v] && mstSet[v] == false && graph[u][v] < key[v])
parent[v] = u, key[v] = graph[u][v];
}

// print the constructed MST
printMST(parent, V, graph);
}


// driver program to test above function
int main()
{
/* Let us create the following graph
2 3
(0)--(1)--(2)
| / \ |
6| 8/ \5 |7
| / \ |
(3)-------(4)
9 */
int graph[V][V] = {{0, 2, 0, 6, 0},
{2, 0, 3, 8, 5},
{0, 3, 0, 0, 7},
{6, 8, 0, 0, 9},
{0, 5, 7, 9, 0},
};

// Print the solution
primMST(graph);

return 0;
}


We can see from the following code block that every iteration will choose optimal weight for each neighbour (in this case there is only one edge connecting any two vertices):

if (graph[u][v] && mstSet[v] == false && graph[u][v] < key[v])
parent[v] = u, key[v] = graph[u][v];


Let's take vertex 7 as example, it has 3 edges. edge 6-7 will eventually be selected from edges 0-7, 8-7 and 6-7 as its weight is minimum, regardless of whether we evaluate vertex 0-7, 8-7 or 6-7 first because the optimum weight = min(weight of all adjacent edges). Hence, it seems redundant to choose the minimum weight vertex every iteration.

Could someone pls explain to me what is the purpose of selecting the minimum weight vertex for every iteration as in the following code block?

// Pick thd minimum key vertex from the set of vertices
// not yet included in MST
int u = minKey(key, mstSet);

Answer

Let's take vertex 7 as example, it has 3 edges. edge 6-7 will eventually be selected from edges 0-7, 8-7 and 6-7 as its weight is minimum, regardless of whether we evaluate vertex 0-7, 8-7 or 6-7 first because the optimum weight = min(weight of all adjacent edges). Hence, it seems redundant to choose the minimum weight vertex every iteration.

Looks like you're confusing purposes of the two loops. The inner loop does not select anything. It just calculates weights. It's the outer loop that does the selection.

Look at it this way. Imagine just one loop for a start, and that we have already picked up a random starting vertex. Now this loop needs to pick up an edge. Of course it goes over the adjacent edges and picks up the minimum. It doesn't matter how it does it: whether the weights are assigned to vertices or it just loops over the edges and picks up the best one.

However, things get complicated when there are more and more vertices. Once you add yet another vertex, you may have discovered a better way to get a new one. Suppose you start at vertex 2. You add vertex 8. Now you can go to 1 from 2 (weight 8), 3 from 2 (weight 7), 6 from 8 (weight 6), 7 from 8 (weight 7). However, as soon as you go to 6 from 8, you now have a better way to go to 7 from 6 with weight just 1, as opposed to weight 7 (edge 8–7). So you need to update your notion of the best path.

One way to do it would be to simply iterate over all edges adjacent to every vertex already included in the MST set on each iteration and pick the best one. This introduces two inner loops to find the minimum: one over the vertices in the MST set, another one over the edges adjacent to each such vertex. For a nearly-complete graph with n vertices and about n^2 edges you'll get O(n^3) in total.

So what this algorithm does instead, it loops over just vertices instead of edges. Each vertex keeps the weight of the easiest way to go there from the current MST set. This allows us to split the inner loop in two. One finds the best next vertex by iterating over all vertices. It picks the best adjacent one because non-adjacent have infinite weights. That's exactly what the confusing line does. It's O(n).

The other inner loop updates the weights. However, since updates may be only caused by the addition of the new vertex, only edges adjacent to that particular vertices need to be considered! That's O(n) again (assuming a nearly complete graph). Thus you bring the complexity down to O(n^2) for all three loops.

As a matter of fact, if you use adjacency matrix, it doesn't even matter whether the graph complete or not. To get rid of that minKey part, you'd need to make three nested loops: the first inner one would iterate through all vertices in the MST set, the innermost one would iterate through its adjacent edges including non-existent ones. The minKey trick allows to have only this:

    // Update key value and parent index of the adjacent vertices of
    // the picked vertex. Consider only those vertices which are not yet
    // included in MST
    for (int v = 0; v < V; v++) // note there is no loop over `u`!