Fooling around with encrypted reasoning blobs

Matthew Green, a cryptographer, embarked on a weekend hobby project that led him down a rabbit hole of encrypted reasoning blocks within LLM APIs. After encountering an intriguing error message while configuring an OpenClaw agent, he discovered that both OpenAI and Anthropic send hidden 'chain-of-thought' (CoT) data, often encrypted, to the client.

Encrypted Reasoning Discovery: Green found that LLM APIs, beyond standard prompts and responses, transmit encrypted 'thinking' or 'reasoning' fields as Base64-encoded JSON blobs, which are meant to be opaque and replayed back to the server.
Motivation for Transmission: These blobs facilitate stateless API conversations, allowing client applications to carry forward hidden model state that the provider can later decrypt and verify.
Replay Attacks Demonstrated: Despite the encryption, Green showed that unmodified reasoning blobs could be replayed across different sessions and even across different user accounts, which suggests that a single global key is in use. Crucially, these replayed blobs were sometimes 'semantically active': they influenced model behavior.
Side Channel Leakage: He demonstrated a side-channel attack in which secret information, even when the model was explicitly instructed not to reveal it, could be inferred from observable metadata such as the size of the encrypted blob, token counts, or wall-clock response times, because these vary with the complexity of the internal computation tied to the secret.
Limits and Challenges: Attempts to extract high-level system prompts were unsuccessful, partly because the models either hallucinated or had no such prompts in API mode. The process was also slow and required alternative LLMs for ethical reasons.
Recommendations for Providers: Green suggested that providers improve key management to prevent cross-session replays and consider implementing policy gates for model reasoning to mitigate side channel vulnerabilities.
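
To make the key-management point concrete, here is a minimal sketch of why a single global key permits cross-account replay while a key derived per account and session does not. Everything in it is an assumption for illustration: the names (`derive_key`, `seal`, `open_blob`), the account and session identifiers, and the use of an HMAC tag as a stand-in for real authenticated encryption. It does not describe how any provider actually protects its reasoning blobs.

```python
import hashlib
import hmac
import json
import os
from typing import Optional

# Hypothetical provider-side master key (assumption for this sketch).
MASTER_KEY = os.urandom(32)


def derive_key(master: bytes, account_id: str, session_id: str) -> bytes:
    """Derive a per-account, per-session key, using HMAC-SHA256 as a simple KDF."""
    # JSON-encode the context so ("a", "b|c") and ("a|b", "c") cannot collide.
    context = json.dumps([account_id, session_id]).encode()
    return hmac.new(master, context, hashlib.sha256).digest()


def seal(blob: bytes, key: bytes) -> bytes:
    """Attach an integrity tag to the blob (stand-in for authenticated encryption)."""
    tag = hmac.new(key, blob, hashlib.sha256).digest()
    return tag + blob


def open_blob(sealed: bytes, key: bytes) -> Optional[bytes]:
    """Return the blob if its tag verifies under `key`, otherwise None."""
    tag, blob = sealed[:32], sealed[32:]
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    return blob if hmac.compare_digest(tag, expected) else None


reasoning = b"hidden reasoning state"

# Case 1: one global key. The server has no way to tell whose blob this is,
# so a blob issued to one account verifies when replayed by any other.
sealed_global = seal(reasoning, MASTER_KEY)
global_replay_ok = open_blob(sealed_global, MASTER_KEY) is not None

# Case 2: keys bound to (account, session). The same blob only verifies
# in the context it was issued for.
alice_key = derive_key(MASTER_KEY, "alice", "session-1")
bob_key = derive_key(MASTER_KEY, "bob", "session-9")
sealed_alice = seal(reasoning, alice_key)
same_session_ok = open_blob(sealed_alice, alice_key) == reasoning
cross_account_ok = open_blob(sealed_alice, bob_key) is not None

print("global key, cross-account replay accepted:", global_replay_ok)
print("bound key, same session accepted:", same_session_ok)
print("bound key, cross-account replay accepted:", cross_account_ok)
```

In a real design the same binding would more likely be expressed through the associated-data input of an AEAD cipher than through a bare MAC; the sketch uses only the standard library so that it runs as-is. Note that this addresses replay only and does nothing about the size and timing side channels.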

While not presenting a catastrophic vulnerability, Green's work highlights subtle yet significant security considerations surrounding the internal workings and data handling of advanced LLM APIs. It serves as a call for providers to re-evaluate the implications of exposing encrypted model state, even if unreadable, to client applications.

The Lowdown