Chapter 91
HomeNaive matrix multiplication is O(K³). For Fibonacci (K=2) that's fine. For K=100, it's 10⁶ operations per multiply. Cache-efficient loop ordering (i-k-j) gives 2-3× speedup. Skip zeros for sparse matrices.
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
for (int k = 0; k < n; k++) // B[k][j] is not contiguousMatrix mul(Matrix& A, Matrix& B) {
int n = A.size();
Matrix C(n, vector(n, 0));
for (int i = 0; i < n; i++) {
for (int k = 0; k < n; k++) {
if (A[i][k] == 0) continue; // skip zeros
for (int j = 0; j < n; j++)
C[i][j] = (C[i][j] + A[i][k] * B[k][j]) % MOD;
}
}
return C;
}
// i-k-j: B[k][j] accessed row-wise → cache hits
// Zero-skip: avoids unnecessary multiplications