Chapter 35
HomeAfter each number in a stream, output the median so far. The naive approach re-sorts every time — O(N²). With two heaps, it's O(N log N) and one of the most elegant patterns in competitive programming.
Given a stream of numbers, after each insertion, output the current median. If there's an even count, output the lower median (the smaller of the two middle values).
Stream: 5 15 1 3 After 5: median = 5 After 15: median = 5 (sorted: [5, 15]) After 1: median = 5 (sorted: [1, 5, 15]) After 3: median = 3 (sorted: [1, 3, 5, 15], even → lower = 3)
The naive approach: maintain a sorted list. Each insertion requires finding the right position (O(N) binary search) then shifting elements (O(N)). That's O(N²) total.
vector<double> running_median_naive(vector<int>& stream) {
vector<int> sorted;
vector<double> medians;
for (int x : stream) {
// Find insertion point (O(log N) — binary search)
auto pos = lower_bound(sorted.begin(), sorted.end(), x);
// Shift elements right (O(N))
sorted.insert(pos, x);
int n = sorted.size();
if (n % 2 == 1) {
medians.push_back(sorted[n / 2]);
} else {
medians.push_back(sorted[n / 2 - 1]); // lower median
}
}
return medians;
}
fn running_median_naive(stream: &[i32]) -> Vec<f64> {
let mut sorted: Vec<i32> = Vec::new();
let mut medians = Vec::new();
for &x in stream {
let pos = sorted.binary_search(&x).unwrap_or_else(|e| e);
sorted.insert(pos, x); // O(N) shift
let n = sorted.len();
if n % 2 == 1 {
medians.push(sorted[n / 2] as f64);
} else {
medians.push(sorted[n / 2 - 1] as f64); // lower
}
}
medians
}
🚨 O(N²) is the cost of shifting
vector::insert / Vec::insert shifts every element
after the insertion point right by one. For a stream of 10⁵ elements, that's
~5 × 10⁹ memory moves. A heap-based approach avoids shifting entirely.
The insight: the median divides the data into two halves. A max-heap stores the lower half (largest on top), and a min-heap stores the upper half (smallest on top). The median is always at the top of one of the heaps.
vector<double> running_median(vector<int>& stream) {
priority_queue<int> lower; // max-heap
priority_queue<int, vector<int>, greater<int>> upper; // min-heap
vector<double> medians;
for (int x : stream) {
// Insert: always push to lower first
lower.push(x);
// Balance: lower's largest must ≤ upper's smallest
if (!upper.empty() && lower.top() > upper.top()) {
upper.push(lower.top());
lower.pop();
}
// Size balance: lower can have at most one extra element
if (lower.size() > upper.size() + 1) {
upper.push(lower.top());
lower.pop();
} else if (upper.size() > lower.size()) {
lower.push(upper.top());
upper.pop();
}
// Median is top of lower (or average if equal size — we output lower)
medians.push_back(lower.top());
}
return medians;
}
use std::collections::BinaryHeap;
use std::cmp::Reverse;
fn running_median(stream: &[i32]) -> Vec<f64> {
let mut lower: BinaryHeap<i32> = BinaryHeap::new(); // max-heap
let mut upper: BinaryHeap<Reverse<i32>> = BinaryHeap::new(); // min-heap
let mut medians = Vec::new();
for &x in stream {
lower.push(x);
// Ensure lower's top ≤ upper's top
if let Some(&Reverse(top_upper)) = upper.peek() {
if lower.peek().unwrap() > &top_upper {
upper.push(Reverse(*lower.peek().unwrap()));
lower.pop();
}
}
// Balance sizes
if lower.len() > upper.len() + 1 {
upper.push(Reverse(*lower.peek().unwrap()));
lower.pop();
} else if upper.len() > lower.len() {
lower.push(upper.peek().unwrap().0);
upper.pop();
}
medians.push(*lower.peek().unwrap() as f64);
}
medians
}
💡 Why this works
The lower heap contains the smaller half, the upper heap the larger half.
lower.top() is the largest element of the lower half — the median
when counts are odd, or the lower median when even. Balancing the sizes ensures
the median is always accessible in O(1).