CVE-2026-31431: Copy Fail vs. rootless containers

A recent kernel privilege escalation, CVE-2026-31431 ('Copy Fail'), allowed an unprivileged user to gain root access. This article details how the exploit was executed and, more importantly, how rootless Podman containers kept the resulting root privileges from reaching the host system. Its walk through user namespaces and kernel tracing gives a real-world demonstration of why rootless containers are considered a security best practice.

First Seen: May 5, 4:00 AM

Last Seen: May 5, 11:00 AM

The Lowdown

This post documents an experiment testing how well rootless Podman containers contain a real-world kernel Local Privilege Escalation (LPE), CVE-2026-31431, also known as 'Copy Fail'. The author shows that an unprivileged user can gain root privileges inside the container, but that those privileges do not carry over to the host system.

The vulnerability exploits an AF_ALG socket and splice() mechanics to overwrite arbitrary data in the kernel's page cache, specifically targeting /usr/bin/su with a malicious ELF payload.
The custom shellcode, featuring 'ELF golfing' for minimal size, is designed to perform setuid(0) and execve("/bin/sh"), granting root access and a shell.
The exploit successfully corrupts /usr/bin/su in the page cache, leading to the execution of the malicious shellcode and privilege escalation to root inside the container.
However, rootless Podman leverages Linux User Namespaces, which map the container's UID 0 (root) to an unprivileged user ID on the host system (e.g., the podman user's UID 1000).
Using strace and bpftrace, the author verifies that setuid(0) succeeds within the container's user namespace but stays confined to it: the 'root' user inside the container gains no additional privileges on the host.
The uid_map confirms this isolation, showing the container's root is equivalent to the host's unprivileged user, thereby preventing any system-level compromise.
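
The uid_map lookup described above can be sketched in a few lines. This is a minimal illustration, not the author's tooling: the function names and the sample map are hypothetical, and the sample assumes a rootless container started by host UID 1000 with a subordinate UID range beginning at 100000. Each line of /proc/<pid>/uid_map has the form "ID-inside-namespace ID-outside-namespace length".

```python
from typing import List, Optional, Tuple

# One entry per uid_map line: (first UID inside, first UID outside, range length).
UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse the text of a /proc/<pid>/uid_map file into a list of ranges."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(f) for f in fields)
        entries.append((inside, outside, length))
    return entries


def to_host_uid(uid_map: UidMap, container_uid: int) -> Optional[int]:
    """Translate a UID inside the namespace to the UID seen by the parent namespace."""
    for inside, outside, length in uid_map:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None  # this UID has no mapping


# Hypothetical map (assumption): container root is host UID 1000,
# container UIDs 1..65536 come from a subordinate range at 100000.
EXAMPLE_MAP = """\
         0       1000          1
         1     100000      65536
"""

if __name__ == "__main__":
    uid_map = parse_uid_map(EXAMPLE_MAP)
    print("container UID 0 is host UID", to_host_uid(uid_map, 0))
```

On a live system the same parser could be fed the contents of /proc/self/uid_map read from inside the container. With a map like the one assumed here, UID 0 in the container resolves to the unprivileged host user, which is the property the experiment relies on.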

The experiment concludes that rootless containers, through the isolation provided by User Namespaces, contain the impact of kernel LPEs like 'Copy Fail' and keep the escalation off the host. The author sees this as especially relevant for shared environments such as GitLab runners.

The Gossip

Containment Conundrum

The main thread asks whether the exploit counts as a 'container escape' when it achieves root inside the container but is held back by user namespaces. Commenters confirm that user namespaces prevent escalation to the host, reinforcing the security benefits of rootless containers. The debate also touches on whether rootful Docker would be more vulnerable to a subsequent chained exploit.

Primitive Persistence Ponderings

Despite the successful containment, several commenters point out that the underlying kernel vulnerability still worked: the attacker could write to page cache contents that are supposed to be read-only. That raises concerns about other exploits involving shared images, copy-on-write mechanisms, and read-only bind mounts, and suggests the attack primitive remains dangerous in other abuse scenarios.
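
One way to reason about that concern: the kernel's page cache is kept per inode, so any two paths that resolve to the same inode (hard links, or a file exposed through a bind mount) see the same cached pages. The sketch below is only an illustration of that check. The function names are hypothetical, and a hard link stands in for a bind mount because creating one needs no privileges.

```python
import os
import tempfile


def same_inode(path_a: str, path_b: str) -> bool:
    """Return True if both paths resolve to the same (device, inode) pair."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def demo() -> tuple:
    """Compare a file with a hard link to it and with an unrelated file."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original")
        alias = os.path.join(tmp, "alias")
        other = os.path.join(tmp, "other")
        for path in (original, other):
            with open(path, "w") as f:
                f.write("data")
        os.link(original, alias)  # second name for the same inode
        return same_inode(original, alias), same_inode(original, other)


if __name__ == "__main__":
    print(demo())
```

This check only sees what stat() reports. Overlay filesystems can report their own device numbers for a file backed by a shared lower layer, so a negative result here does not prove that two containers have separate cached copies.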

Golfing and General Glee

A brief thread picks up the term 'ELF golfing', used here for stripping section headers from ELF binaries to reduce their size, a detail of the exploit that drew some curiosity. Other comments clarify that the post analyzes an existing CVE rather than disclosing a new one.
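
For readers curious what a stripped section header table looks like, the sketch below reads the relevant fields from a 64-bit little-endian ELF header. It is an illustration only and is not taken from the exploit: the function names and the synthetic header values are assumptions. Per the ELF specification, e_shoff is zero when a file has no section header table.

```python
import struct

# ELF64 file header: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def section_header_info(header: bytes) -> dict:
    """Report whether a 64-bit little-endian ELF header declares section headers."""
    if len(header) < ELF64_HEADER.size:
        raise ValueError("need at least 64 bytes of header")
    fields = ELF64_HEADER.unpack_from(header)
    ident = fields[0]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch handles only 64-bit little-endian ELF")
    e_shoff, e_shnum = fields[6], fields[12]
    return {
        "e_shoff": e_shoff,
        "e_shnum": e_shnum,
        "has_section_headers": e_shoff != 0,
    }


def make_header(e_shoff: int, e_shnum: int) -> bytes:
    """Build a synthetic header for demonstration (all values are made up)."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # 64-bit, little-endian
    e_shentsize = 64 if e_shnum else 0
    return ELF64_HEADER.pack(ident, 2, 0x3E, 1, 0x400078, 64, e_shoff,
                             0, 64, 56, 1, e_shentsize, e_shnum, 0)


if __name__ == "__main__":
    print(section_header_info(make_header(0, 0)))
```

To inspect a real binary, the first 64 bytes of the file can be passed to section_header_info. A golfed binary of the kind discussed in the thread would be expected to report has_section_headers as False.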