Chapter 78 — Minimum Spanning Tree: Prim

How Prim Thinks

Imagine you're building a road network connecting several villages. You start at any village and mark it as "connected." Now look at all roads from your connected villages to unconnected ones — pick the cheapest road and build it, bringing that village into the network. Repeat until every village is connected.

At every step, the algorithm maintains two sets of nodes: those already in the MST (the "tree" set) and those outside it. It considers every edge that crosses from the tree set to the outside set, and picks the cheapest one.

💡 The core invariant

At every step, the tree-in-progress is always a subtree of some minimum spanning tree. The PQ doesn't store all edges — it only stores edges that connect the current tree to nodes outside it. When an edge is popped, if its endpoint is already in the tree, we skip it (stale entry). Otherwise, we add it and push all its outgoing edges that lead outside the tree.

Walkthrough: A Concrete Example

Consider this graph with 5 nodes. Edge weights are shown on the connections. We'll run Prim starting at node 0.

Step 0:  MST = {0}.   PQ = [(1→3 at cost 1), (0→1 at cost 4), (0→2 at cost 2)]
Step 1:  Pop (0→3 at cost 1). Add 3.  MST = {0,3}.
         Push 3's edges:  (3→2 at cost 5).
         PQ = [(0→2 at cost 2), (0→1 at cost 4), (3→2 at cost 5)]
Step 2:  Pop (0→2 at cost 2). Add 2.  MST = {0,2,3}.
         Push 2's edges:  (2→1 at cost 3).
         PQ = [(2→1 at cost 3), (0→1 at cost 4), (3→2 ✗ stale because 2 is in tree)]
Step 3:  Pop (2→1 at cost 3). Add 1.  MST = {0,1,2,3}.
         All 4 nodes added. Total weight = 1 + 2 + 3 = 6.
         That's the MST!

📐 Stale entries don't hurt correctness

Notice edge 3→2 (cost 5) was still in the PQ when node 2 was added via a cheaper path (0→2 at cost 2). When (3→2) is eventually popped, we check in_mst[2], see it's already true, and skip it. This is exactly the same stale-entry pattern as Dijkstra — the PQ may hold outdated entries, but the in_mst check filters them safely.

Why Greedy Works: The Cut Property

A cut is a partition of the graph's nodes into two sets. Any edge that connects a node in one set to a node in the other is a crossing edge of that cut. The cut property states:

The Cut Property

For any cut, the minimum-weight crossing edge belongs to every minimum spanning tree.

Prim's algorithm maintains a cut where one side is "nodes in the MST so far" and the other is "nodes not yet in the MST." At each step, Prim picks the cheapest crossing edge — and by the cut property, that edge must be in the MST. This is why greedy works: every choice is forced to be correct by the cut property.

📐 Proof sketch (exchange argument)

Suppose an MST exists that does not include the cheapest crossing edge e for the current cut. Add e to that MST — this creates a cycle. The cycle must include another crossing edge f (because any cycle crossing a cut crosses it an even number of times). Since f is more expensive than e, replacing f with e gives a lighter spanning tree. Contradiction. So every MST must include e.

Prim vs Dijkstra: Spot the Difference

The code looks almost identical. Here's what makes them different:

Dijkstra tracks distance from source

PQ stores (dist_from_source, node). When you pop, you check if d != dist[u] (stale distance). You relax by potentially decreasing the neighbour's stored distance.

Prim tracks whether a node is in the tree

PQ stores (edge_weight, node). When you pop, you check if in_mst[u] (stale node). You add all outgoing edges to the PQ — there's no "decrease-key" because you never need to update a weight, you just push a new entry and let the stale one be ignored later.

// Dijkstra checks:   if (d != dist[u]) continue;
// Prim checks:        if (in_mst[u]) continue;

// Dijkstra relaxes:   dist[v] = min(dist[v], dist[u] + w)
// Prim adds edges:    pq.push({w, v}) — no min-check needed

Implementation

The code is short. Make sure you understand the in_mst skip and why we push every outgoing edge from newly added nodes.

C++ — Prim's MST

int prim(vector>>& adj) {
    int V = adj.size();
    vector in_mst(V, false);
    priority_queue, vector>, greater<>> pq;
    pq.push({0, 0});  // (weight, node) — start at node 0

    int total = 0, added = 0;

    while (!pq.empty() && added < V) {
        auto [w, u] = pq.top(); pq.pop();
        if (in_mst[u]) continue;        // stale entry — skip

        in_mst[u] = true;               // add to tree
        total += w;
        added++;

        for (auto [v, w2] : adj[u])
            if (!in_mst[v])
                pq.push({w2, v});       // no decrease-key needed
    }
    return total;
}

Rust — Prim's

use std::collections::BinaryHeap;
use std::cmp::Reverse;

fn prim(adj: &[Vec<(usize, i32)>]) -> i32 {
    let v = adj.len();
    let mut in_mst = vec![false; v];
    let mut pq = BinaryHeap::new();
    pq.push(Reverse((0, 0)));

    let mut total = 0;
    let mut added = 0;

    while let Some(Reverse((w, u))) = pq.pop() {
        if in_mst[u] { continue; }
        in_mst[u] = true;
        total += w;
        added += 1;
        if added == v { break; }
        for &(v, w2) in &adj[u] {
            if !in_mst[v] { pq.push(Reverse((w2, v))); }
        }
    }
    total
}

Prim vs Kruskal

💡 When to use which

Prim is better for dense graphs (E ≈ V²) because it processes nodes, not edges — O((V+E) log V). If the graph is given as an adjacency list, Prim is natural. Kruskal is simpler for sparse graphs and when edges are given as a flat list — O(E log V) dominated by sorting. Both produce the correct MST. Pick whichever fits the input format.

Key Takeaways

→Prim grows the tree greedily: always add the cheapest crossing edge.
→The cut property guarantees correctness — the cheapest crossing edge must be in every MST.
→Same PQ pattern as Dijkstra, but with in_mst[] instead of dist[].
→O((V+E) log V). Better for dense graphs. Kruskal for sparse.