Chapter 106

Home

Refactoring: Z-Algorithm

KMP uses an LPS array computed from the pattern alone. The Z-algorithm computes a Z-array from the combined string pattern + text (with a separator). Z[i] = longest common prefix between s[i..] and s. O(N) — simpler to implement than KMP.

Before: KMP (separate LPS + matching)

C++ — two-pass KMP

auto lps = build_lps(pat);
auto matches = kmp_search(text, pat);

After: Z-Algorithm (single pass on combined string)

C++ — Z-algorithm

vector z_function(string& s) {
    int n = s.size();
    vector z(n, 0);
    for (int i = 1, l = 0, r = 0; i < n; i++) {
        if (i <= r) z[i] = min(r - i + 1, z[i - l]);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) z[i]++;
        if (i + z[i] - 1 > r) { l = i; r = i + z[i] - 1; }
    }
    return z;
}

// Pattern matching: concat pat + '$' + text
// Then find all i where z[i] == |pat|
string combined = pat + '$' + text;
auto z = z_function(combined);
for (int i = pat.size() + 1; i < combined.size(); i++)
    if (z[i] == pat.size()) matches.push_back(i - pat.size() - 1);

Rust — Z-algorithm

fn z_function(s: &str) -> Vec {
    let s = s.as_bytes();
    let n = s.len();
    let mut z = vec![0; n];
    let (mut l, mut r) = (0, 0);
    for i in 1..n {
        if i <= r { z[i] = z[i - l].min(r - i + 1); }
        while i + z[i] < n && s[z[i]] == s[i + z[i]] { z[i] += 1; }
        if i + z[i] - 1 > r { l = i; r = i + z[i] - 1; }
    }
    z
}

Key Takeaways