KyberSlash: division timings depending on secrets in Kyber software

For KyberSlash1, what is the problematic code?

Here's the critical line in the Kyber reference software:

t  = (((t << 1) + KYBER_Q/2)/KYBER_Q) & 1;

In context, KYBER_Q is a public integer 3329, but t is secret. This line doubles t, adds 1664, divides by 3329, and extracts the bottom bit of the result.

As another example, here's the critical line in another library:

t = (int) ((((((int) (a[8 * i + j])) << 1) + (KyberParams.paramsQ / 2)) / KyberParams.paramsQ) & 1);

What is the problem with this code?

In many environments, the time taken by division depends on the inputs to the division: in this case, on the secret t (or the secret a[8 * i + j] in the second example). This creates data flow from secrets to the time taken by Kyber.

Why are secret-dependent timings important?

Timing attacks often successfully work backwards from timings to secrets.

Is this particular dependency exploitable?

The first version of this FAQ (19 December 2023) said the following: "Maybe. Patch now; don't wait to see whether an exploit is demonstrated."

On 30 December 2023, Daniel J. Bernstein posted a demo exploiting KyberSlash1 to often recover Kyber's complete secret key from dec timings of the end-of-November-2023 Kyber reference code running under Raspbian (gcc 8.3.0) on a Raspberry Pi 2. This demo succeeded twice in three experiments.

If a Kyber key is used only once then everything is okay?

Not necessarily. This does block the first demo, but there are previous examples of side channels that were eventually shown to allow single-trace attacks, and the same might happen to KyberSlash. Patch now.

For KyberSlash2, what is the problematic code?

Here are the critical lines for KyberSlash2 in the Kyber reference software (not adjacent to each other):

t[j] = ((((uint16_t)u << 4) + KYBER_Q/2)/KYBER_Q) & 15;

t[j] = ((((uint32_t)u << 5) + KYBER_Q/2)/KYBER_Q) & 31;

t[k]  = ((((uint32_t)t[k] << 11) + KYBER_Q/2)/KYBER_Q) & 0x7ff;

t[k]  = ((((uint32_t)t[k] << 10) + KYBER_Q/2)/ KYBER_Q) & 0x3ff;

Similar lines appear in various other Kyber libraries.

What is the structural difference between KyberSlash1 and KyberSlash2?

The KyberSlash1 division is called from decryption in Kyber's underlying PKE (e.g., indcpa_dec in the Kyber reference software uses poly_tomsg). The KyberSlash2 divisions are called from encryption in Kyber's underlying PKE (e.g., indcpa_enc in the Kyber reference software uses poly_compress and polyvec_compress). Decapsulation in the Kyber KEM calls both decryption and encryption in the underlying PKE, so both of these use secret inputs.

Does a division in C, C++, Go, Rust, etc. always turn into a division instruction in machine language?

Not necessarily. There are some important combinations of CPUs, compilers, and compiler options that will convert divisions with public denominators into multiplications. For example, on Intel/AMD CPUs, gcc -Os (size optimization) will produce division instructions, but gcc -O3 (time optimization) will produce multiplication instructions. On RISC-V CPUs, gcc -Os and gcc -O3 will both produce division instructions. Matt Godbolt's Compiler Explorer is an easy way to see what different compilers will do.

There are also environments where the ABI does not guarantee the presence of division instructions, so a division is converted into a call to a "soft division" function. In the Raspberry Pi 2 demo exploiting KyberSlash to recover the Kyber secret key, gcc -Os ends up calling an internal divsi3 function; that function uses branches depending on the inputs.

Does a division instruction in machine language really vary in timing across the small integers used in Kyber?

Not necessarily, but experiments on various CPUs have shown variations. For example, on AMD Zen 2, division slows down by a cycle when the numerator reaches 8192. As another example, on the SiFive U74 RISC-V CPU, division speeds up by 62 cycles when the numerator reaches 2048, then slows down by a cycle when the numerator reaches 4096, then slows down by a cycle when the numerator reaches 8192.

For the KyberSlash1 code sequence, the numerator for attacker-chosen ciphertexts can be any even number between 1664 and 8320. The numerator for legitimate ciphertexts tends to be close to 1664, 4993, or 8320. For the KyberSlash2 code sequences, the numerators have a wider range, generally producing larger timing variations.

Can timing attacks really work with timing variations as small as one or two cycles?

Yes. For example, CacheBleed demonstrated exploitability of cache-bank conflicts. Each cache-bank conflict creates a single-cycle delay.

In the Raspberry Pi 2 demo exploiting KyberSlash, the timing variations are larger, above 20 cycles.

If the compiler turns divisions into multiplications then everything is okay?

Not necessarily. There are CPUs where multiplications take variable time, often in undocumented ways. This is a general issue for lattice-based cryptography, and for various other forms of cryptography, including elliptic-curve cryptography; see, e.g., the warning about the IBM PowerPC RS64 IV in the Curve25519 paper.

Are there tools that scan code for divisions?

Yes, although they are not necessarily comprehensive. For example, saferewrite automatically reports warning-div and warning-mul for any secret divisions and secret multiplications detected in machine code. There are more tools that check for secret branches and secret array indices.

Who discovered KyberSlash?

On 1 December 2023, the pq-crystals/kyber repository was patched to remove the KyberSlash1 code. The commit message stated the following: "Updated poly_tomsg to prevent a compiler from using DIV. Thanks to Goutam Tamvada, Karthikeyan Bhargavan, and Franziskus Kiefer @cryspen for pointing this out". This was not labeled or announced as a security issue.

Bernstein independently noticed divisions in Kyber code on 14 December 2023 as part of analyzing the reference code that various KEM design teams had submitted to the centralized SUPERCOP test framework. SUPERCOP takes code from anybody, is not a library, and does not make any security guarantees (for example, SUPERCOP benchmarks MD5, SIKE, and code submitted by NSA), but code in SUPERCOP often reflects code in libraries. Bernstein announced divisions in Kyber software on 15 December 2023 as a "possibly exploitable" security issue, saying that the divisions were taking secret inputs in "at least one case", namely the KyberSlash1 case.

The original Kyber software author asked "did you check that the code executes in variable time on any particular CPU for the input ranges that are fed into the DIV instruction (if a compiler issues a DIV)?"; Bernstein posted various examples of such CPUs.

On 30 December 2023, Prasanna Ravi and Matthias Kannwischer announced that further divisions in the Kyber reference code also take secret inputs. These are the KyberSlash2 divisions.

What does "Slash" mean in KyberSlash?

Mathematicians and programmers typically write division using a slash symbol.

Does KyberSlash have a logo?

No. The obvious choice of logo would be the name Kyber in a circle with a slash through it to represent division, but this could be misinterpreted.

Version: This is version 2024.01.08 of the "FAQ" web page.