Chapter 107

Home

Rabin-Karp Rolling Hash

Rabin-Karp uses a rolling hash to compare substrings in O(1). Hash the pattern, then slide a window over the text, updating the hash incrementally. O(N+M) average, O(NM) worst case (hash collisions).

Rabin-Karp Algorithm

C++ — Rabin-Karp (single hash)

const long long BASE = 911382323, MOD = 1e9 + 7;

vector rabin_karp(string& text, string& pat) {
    int n = text.size(), m = pat.size();
    if (m > n) return {};

    // Precompute powers
    vector pow(n + 1, 1);
    for (int i = 1; i <= n; i++) pow[i] = pow[i - 1] * BASE % MOD;

    // Hash the pattern and text prefix
    long long pat_hash = 0, txt_hash = 0;
    for (int i = 0; i < m; i++) {
        pat_hash = (pat_hash * BASE + pat[i]) % MOD;
        txt_hash = (txt_hash * BASE + text[i]) % MOD;
    }

    vector matches;
    if (txt_hash == pat_hash) matches.push_back(0);

    for (int i = m; i < n; i++) {
        // Remove leftmost char, add new rightmost char
        txt_hash = (txt_hash - text[i - m] * pow[m - 1] % MOD + MOD) % MOD;
        txt_hash = (txt_hash * BASE + text[i]) % MOD;

        if (txt_hash == pat_hash) matches.push_back(i - m + 1);
    }
    return matches;
}

// O(N+M) average, O(NM) worst (collisions).
// Use double hash or KMP if collisions are a concern.

Key Takeaways