Kruskal Algorithm

Palavras-chave:

Publicado em: 31/08/2025

Kruskal's Algorithm: Finding the Minimum Spanning Tree

Kruskal's algorithm is a greedy algorithm used to find the Minimum Spanning Tree (MST) for a weighted, undirected graph. An MST connects all vertices in the graph with the minimum possible total edge weight, without forming any cycles. This article provides a detailed explanation and implementation of Kruskal's algorithm.

Fundamental Concepts / Prerequisites

To understand Kruskal's algorithm, familiarity with the following concepts is helpful:

Graph Theory: Understanding of graphs, vertices, edges, weighted graphs, and spanning trees.
Greedy Algorithms: The basic principle of making locally optimal choices at each step to find a global optimum.
Disjoint Set Data Structure (Union-Find): A data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It supports efficient operations for finding which subset an element belongs to and merging subsets together.

Core Implementation

The following C++ code implements Kruskal's algorithm using the Disjoint Set data structure.


#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

// Structure to represent an edge
struct Edge {
    int src, dest, weight;
};

// Structure to represent a Disjoint Set (Union-Find)
struct DisjointSets {
    int *parent, *rnk;
    int n;

    // Constructor
    DisjointSets(int n) {
        this->n = n;
        parent = new int[n+1];
        rnk = new int[n+1];

        // Initially, all vertices are in different sets
        for (int i = 0; i <= n; i++) {
            rnk[i] = 0;
            parent[i] = i;
        }
    }

    // Find the parent of a node (path compression)
    int find(int u) {
        if (u != parent[u])
            parent[u] = find(parent[u]); // Path compression
        return parent[u];
    }

    // Union by rank
    void unite(int x, int y) {
        x = find(x);
        y = find(y);

        if (rnk[x] > rnk[y])
            parent[y] = x;
        else
            parent[x] = y;

        if (rnk[x] == rnk[y])
            rnk[y]++;
    }
};

// Function to compare edges based on weight
bool compareEdges(const Edge& a, const Edge& b) {
    return a.weight < b.weight;
}

// Kruskal's algorithm implementation
int kruskalMST(vector<Edge>& edges, int numVertices) {
    int mstWeight = 0; // Weight of the MST

    // Sort edges in ascending order by weight
    sort(edges.begin(), edges.end(), compareEdges);

    // Create DisjointSets data structure
    DisjointSets ds(numVertices);

    // Iterate through the sorted edges
    for (const Edge& edge : edges) {
        int src = edge.src;
        int dest = edge.dest;
        int weight = edge.weight;

        // If including this edge doesn't cause a cycle, include it in MST
        if (ds.find(src) != ds.find(dest)) {
            mstWeight += weight;
            ds.unite(src, dest);
        }
    }

    return mstWeight;
}

int main() {
    int numVertices = 4;
    vector<Edge> edges = {
        {0, 1, 10},
        {0, 2, 6},
        {0, 3, 5},
        {1, 3, 15},
        {2, 3, 4}
    };

    int mstWeight = kruskalMST(edges, numVertices);
    cout << "Weight of the Minimum Spanning Tree: " << mstWeight << endl;

    return 0;
}

Code Explanation

The code consists of the following parts:

Edge Structure: Represents an edge in the graph, containing the source vertex (src), destination vertex (dest), and the edge weight (weight).

DisjointSets Structure: Implements the Union-Find data structure. parent array stores the parent of each node in its set. rnk array stores the rank (approximate height) of each tree in the forest. find() finds the representative (root) of the set an element belongs to, and utilizes path compression for optimization. unite() merges two sets together using union by rank, which helps keep the trees relatively flat, resulting in faster find operations.

compareEdges Function: A comparison function used to sort the edges in ascending order based on their weights.

kruskalMST Function:

Initializes the MST weight to 0.
Sorts the edges in ascending order of their weights using the compareEdges function.
Creates a DisjointSets object to track connected components.
Iterates through the sorted edges. For each edge:
- If adding the edge does not create a cycle (checked using ds.find(src) != ds.find(dest)), the edge is added to the MST.
- The weight of the edge is added to the mstWeight.
- The two vertices connected by the edge are merged into the same set using ds.unite(src, dest).
Returns the total weight of the MST.

main Function: Creates a sample graph with edges and their weights. Calls kruskalMST() to compute and print the MST's weight.

Complexity Analysis

Time Complexity:

Sorting the edges: O(E log E), where E is the number of edges.
Iterating through the edges: O(E).
For each edge, performing find and unite operations using Disjoint Sets: Almost O(1) on average, due to path compression and union by rank. More precisely, it's O(α(V)), where α is the very slowly growing inverse Ackermann function, and V is the number of vertices. For all practical purposes, α(V) is considered a constant (less than 5 for any realistically sized graph).

Therefore, the overall time complexity is dominated by the sorting step: O(E log E). Since E can be at most V², O(E log E) is the same as O(E log V).

Space Complexity:

Storing the edges: O(E).
Disjoint Sets data structure (parent and rank arrays): O(V).

Therefore, the overall space complexity is O(E + V).

Alternative Approaches

Prim's Algorithm: Another greedy algorithm for finding the MST. It starts with a single vertex and iteratively adds the minimum-weight edge that connects the current tree to a new vertex. Prim's algorithm generally performs better (especially with dense graphs) when implemented using a binary heap or Fibonacci heap. Its time complexity is O(E log V) with a binary heap and O(E + V log V) with a Fibonacci heap. The main difference from Kruskal's is that Prim's algorithm builds the MST from a single vertex, always maintaining a connected tree, while Kruskal's considers edges in increasing order of weight, possibly building several disconnected trees until the end.

Conclusion

Kruskal's algorithm provides an efficient and intuitive way to find the Minimum Spanning Tree of a graph. It leverages the Disjoint Set data structure to efficiently detect and avoid cycles. Understanding Kruskal's algorithm is fundamental for solving various network optimization problems.