Algorithm-Aware Side-Channel Analysis Framework for CRYSTALS-Kyber
Aaron Schnacky Independent Researcher, USA
Abstract
Existing side-channel analysis tools such as Valgrind, Cachegrind, and ChipWhisperer provide generalized instrumentation capable of observing timing, memory, and power behavior across arbitrary programs. However, these tools lack semantic awareness of the cryptographic algorithms they instrument, limiting their ability to contextualize observations meaningfully. This paper proposes the design and implementation of an algorithm-aware side-channel analysis framework specifically targeting CRYSTALS-Kyber, a NIST-standardized post-quantum key encapsulation mechanism. By embedding structural knowledge of Kyber's polynomial arithmetic, noise sampling, and encoding operations directly into the analysis harness, this framework can produce precise, actionable vulnerability reports mapped to specific operations and code paths — functionality that general-purpose tools cannot provide.
1. Introduction
The standardization of CRYSTALS-Kyber (now designated ML-KEM under FIPS 203) marks a pivotal shift in cryptographic infrastructure. As deployment accelerates across TLS, secure messaging, and government systems, the correctness of Kyber implementations becomes a matter of national security concern. Side-channel vulnerabilities — particularly timing and cache-based attacks — have historically undermined cryptographic implementations even when the underlying algorithm is mathematically sound.
Current tooling for detecting such vulnerabilities falls into two categories:
General-purpose dynamic analysis tools (Valgrind, Cachegrind, Perf) that observe execution without algorithmic context
Hardware-level measurement platforms (ChipWhisperer) that require physical access and specialized equipment
Neither category addresses the need for a software-based, algorithm-aware harness that can be run in CI/CD pipelines, evaluated by independent researchers, and tuned specifically for post-quantum primitive behavior. This gap motivates the development of a dedicated framework.
2. Background
2.1 CRYSTALS-Kyber Overview
Kyber is a lattice-based key encapsulation mechanism built on the Module Learning With Errors (MLWE) problem. Its core operations include:
Polynomial arithmetic over the ring $\mathbb{Z}_q[x]/(x^n + 1)$, where $q = 3329$ and $n = 256$
Noise sampling via a centered binomial distribution
Number Theoretic Transform (NTT) for efficient polynomial multiplication
Compression and encoding of polynomial coefficients
Constant-time behavior is required across all secret-dependent branches and memory accesses. Any deviation in timing, cache access pattern, or branch prediction correlated with secret data constitutes a potential side-channel leak.
2.2 Limitations of Existing Tools
Tool
Approach
Limitation
Valgrind/Memcheck
Memory access tracking
No cryptographic context
Cachegrind
Cache simulation
Cannot distinguish secret vs. public dependent access
ChipWhisperer
Power/EM measurement
Requires hardware; not CI-compatible
dudect
Statistical timing
Generic; no operation-level mapping
ctgrind
Constant-time checking
Taint analysis only; no Kyber-specific heuristics
The absence of algorithm-specific context means these tools either produce noisy, hard-to-interpret output or require substantial manual work to trace observations back to meaningful operations.
3. Proposed Framework
3.1 Design Philosophy
The proposed framework, tentatively named KyberScope, is built on three principles:
Semantic awareness — the tool understands Kyber's operational structure and knows which code regions must be constant-time
Statistical rigor — findings are supported by large sample differential analysis, not single observations
Actionable output — vulnerability reports map directly to operations, source locations, and exploitability assessments
3.2 Architecture
KyberScope is composed of four modules:
Module 1: Instrumented Kyber Runtime A Rust implementation of Kyber with fine-grained hooks inserted at each logical operation boundary — polynomial multiplication, NTT transforms, noise sampling, encoding/decoding. Each hook captures timing (via CPU performance counters using the rdtsc instruction), L1/L2 cache miss counts, and branch prediction outcomes.
Module 2: Input Corpus Generator Generates structured input sets designed to exercise secret-dependent code paths. Inputs are crafted to produce known secret values against which timing and memory observations are correlated. This includes:
Key generation with known seed values
Encapsulation with crafted ciphertexts targeting specific polynomial coefficients
Decapsulation inputs designed to probe noise sampling paths
Module 3: Differential Statistical Engine Runs each input class thousands of times, collecting measurement distributions. Applies:
Welch's t-test for detecting timing distribution divergence between secret classes
Pearson correlation between timing traces and Hamming weight of secret values
Leakage quantification using Mutual Information Analysis (MIA)
This is methodologically similar to Test Vector Leakage Assessment (TVLA) but scoped to Kyber's specific operational boundaries.
Module 4: Contextual Report Generator Maps statistically significant findings back to:
Named Kyber operations (e.g., "NTT butterfly at layer 3")
Source file and line number
An exploitability severity rating (Informational / Low / Medium / High / Critical)
Suggested remediation (e.g., "replace conditional branch with constant-time selection")
3.3 Implementation Stack
Language: Rust (primary), with optional C FFI bindings to test reference C implementations
Performance counters: perf_event_open syscall on Linux; cross-platform abstraction via the perf crate
Statistical computation: statrs crate for distributions and hypothesis testing
Report output: Structured JSON and optional human-readable Markdown
4. Research Objectives
The primary research questions this framework is designed to answer:
Do widely-used Kyber implementations exhibit measurable timing variation in their noise sampling routines?
Are NTT implementations consistent with constant-time requirements across all input classes?
Do compression and decoding routines introduce secret-dependent memory access patterns?
How do findings differ across implementations (reference C, Rust, assembly-optimized variants)?
5. Significance and Novelty
This framework is, to the author's knowledge, the first proposed tool to combine:
Kyber-specific operational semantic mapping
Statistical leakage quantification
Automated source-level vulnerability attribution
A Rust-native implementation suitable for modern secure development pipelines
General tools miss vulnerabilities because they don't know what to look for. KyberScope knows exactly what constant-time behavior looks like in Kyber — and knows exactly when it isn't there.
At a time when NIST PQC standards are entering production deployment, independent auditing infrastructure of this kind represents a meaningful contribution to the public cryptographic security posture of the United States and its allies.
6. Future Directions
Extension to CRYSTALS-Dilithium (ML-DSA under FIPS 204)
Hardware-in-the-loop mode integrating ChipWhisperer measurement alongside software instrumentation
Formal publication of discovered vulnerabilities through responsible disclosure
Potential contribution to the NIST Cryptographic Algorithm Validation Program (CAVP) testing infrastructure
References
Bos, J. et al. (2018). CRYSTALS-Kyber: A CCA-Secure Module-Lattice-Based KEM. IEEE EuroS&P.
NIST. (2024). FIPS 203: Module-Lattice-Based Key-Encapsulation Mechanism Standard.
Kocher, P. (1996). Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and Other Systems. CRYPTO.
Becker, G. et al. (2013). Test Vector Leakage Assessment (TVLA) Methodology in Practice. IACR.
Reparaz, O. et al. (2015). Dude, is my code constant time? IACR ePrint.
This document represents an original research proposal for independent security research. All implementation and vulnerability disclosure will adhere to responsible disclosure standards.