Chapter 78
HomePrim's algorithm builds a minimum spanning tree by growing it from a single starting node, always adding the cheapest edge that connects the growing tree to a node not yet in it. It's like Dijkstra, but instead of minimising distance from source, you minimise edge weight to the tree.
Imagine you're building a road network connecting several villages. You start at any village and mark it as "connected." Now look at all roads from your connected villages to unconnected ones — pick the cheapest road and build it, bringing that village into the network. Repeat until every village is connected.
At every step, the algorithm maintains two sets of nodes: those already in the MST (the "tree" set) and those outside it. It considers every edge that crosses from the tree set to the outside set, and picks the cheapest one.
💡 The core invariant
At every step, the tree-in-progress is always a subtree of some minimum spanning tree. The PQ doesn't store all edges — it only stores edges that connect the current tree to nodes outside it. When an edge is popped, if its endpoint is already in the tree, we skip it (stale entry). Otherwise, we add it and push all its outgoing edges that lead outside the tree.
Consider this graph with 5 nodes. Edge weights are shown on the connections. We'll run Prim starting at node 0.
Step 0: MST = {0}. PQ = [(1→3 at cost 1), (0→1 at cost 4), (0→2 at cost 2)]
Step 1: Pop (0→3 at cost 1). Add 3. MST = {0,3}.
Push 3's edges: (3→2 at cost 5).
PQ = [(0→2 at cost 2), (0→1 at cost 4), (3→2 at cost 5)]
Step 2: Pop (0→2 at cost 2). Add 2. MST = {0,2,3}.
Push 2's edges: (2→1 at cost 3).
PQ = [(2→1 at cost 3), (0→1 at cost 4), (3→2 ✗ stale because 2 is in tree)]
Step 3: Pop (2→1 at cost 3). Add 1. MST = {0,1,2,3}.
All 4 nodes added. Total weight = 1 + 2 + 3 = 6.
That's the MST!
📐 Stale entries don't hurt correctness
Notice edge 3→2 (cost 5) was still in the PQ when node 2 was added via a
cheaper path (0→2 at cost 2). When (3→2) is eventually popped, we check
in_mst[2], see it's already true, and skip it. This is exactly
the same stale-entry pattern as Dijkstra — the PQ may hold outdated entries,
but the in_mst check filters them safely.
A cut is a partition of the graph's nodes into two sets. Any edge that connects a node in one set to a node in the other is a crossing edge of that cut. The cut property states:
The Cut Property
For any cut, the minimum-weight crossing edge belongs to every minimum spanning tree.
Prim's algorithm maintains a cut where one side is "nodes in the MST so far" and the other is "nodes not yet in the MST." At each step, Prim picks the cheapest crossing edge — and by the cut property, that edge must be in the MST. This is why greedy works: every choice is forced to be correct by the cut property.
📐 Proof sketch (exchange argument)
Suppose an MST exists that does not include the cheapest crossing edge e for the current cut. Add e to that MST — this creates a cycle. The cycle must include another crossing edge f (because any cycle crossing a cut crosses it an even number of times). Since f is more expensive than e, replacing f with e gives a lighter spanning tree. Contradiction. So every MST must include e.
The code looks almost identical. Here's what makes them different:
Dijkstra tracks distance from source
PQ stores (dist_from_source, node). When you pop, you check
if d != dist[u] (stale distance). You relax by potentially
decreasing the neighbour's stored distance.
Prim tracks whether a node is in the tree
PQ stores (edge_weight, node). When you pop, you check
if in_mst[u] (stale node). You add all outgoing edges to the
PQ — there's no "decrease-key" because you never need to update a weight,
you just push a new entry and let the stale one be ignored later.
// Dijkstra checks: if (d != dist[u]) continue;
// Prim checks: if (in_mst[u]) continue;
// Dijkstra relaxes: dist[v] = min(dist[v], dist[u] + w)
// Prim adds edges: pq.push({w, v}) — no min-check needed
The code is short. Make sure you understand the in_mst skip
and why we push every outgoing edge from newly added nodes.
int prim(vector>>& adj) { int V = adj.size(); vector in_mst(V, false); priority_queue , vector >, greater<>> pq; pq.push({0, 0}); // (weight, node) — start at node 0 int total = 0, added = 0; while (!pq.empty() && added < V) { auto [w, u] = pq.top(); pq.pop(); if (in_mst[u]) continue; // stale entry — skip in_mst[u] = true; // add to tree total += w; added++; for (auto [v, w2] : adj[u]) if (!in_mst[v]) pq.push({w2, v}); // no decrease-key needed } return total; }
use std::collections::BinaryHeap;
use std::cmp::Reverse;
fn prim(adj: &[Vec<(usize, i32)>]) -> i32 {
let v = adj.len();
let mut in_mst = vec![false; v];
let mut pq = BinaryHeap::new();
pq.push(Reverse((0, 0)));
let mut total = 0;
let mut added = 0;
while let Some(Reverse((w, u))) = pq.pop() {
if in_mst[u] { continue; }
in_mst[u] = true;
total += w;
added += 1;
if added == v { break; }
for &(v, w2) in &adj[u] {
if !in_mst[v] { pq.push(Reverse((w2, v))); }
}
}
total
}
💡 When to use which
Prim is better for dense graphs (E ≈ V²) because it processes nodes, not edges — O((V+E) log V). If the graph is given as an adjacency list, Prim is natural. Kruskal is simpler for sparse graphs and when edges are given as a flat list — O(E log V) dominated by sorting. Both produce the correct MST. Pick whichever fits the input format.
in_mst[] instead of dist[].