Linux Internals: How /proc/self/mem writes to unwritable memory (2021)

This deep dive explains how writes through Linux's /proc/self/mem can modify memory that is mapped as unwritable, a counter-intuitive behavior that tools such as Julia's JIT compiler rely on. It walks through how the kernel gets around hardware-enforced memory protections, which makes it a good fit for HN readers who enjoy low-level system details. The takeaway is that memory permissions are a property of virtual mappings, and the kernel, which controls those mappings, can create a writable one whenever it needs to.

Score: 12
Comments: 3
Highest Rank: #8
Time on Front Page: 11h
First Seen: Mar 9, 12:00 AM
Last Seen: Mar 9, 10:00 AM
[Chart: Rank Over Time]

The Lowdown

The /proc/*/mem pseudofile in Linux has "punch through" semantics: writes made through it succeed even when the target virtual memory is marked unwritable. This is intentional behavior rather than a bug, and projects such as the Julia JIT compiler and the rr debugger rely on it.
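The punch-through write can be reproduced from user space in a few lines. The following is a minimal Python sketch, not the article's own code: it assumes 64-bit Linux, and the function name `write_through_proc_mem` is a hypothetical helper introduced for illustration. Hardened kernels or sandboxes may refuse the write, in which case an `OSError` is raised.

```python
import ctypes
import mmap
import os

# Load libc symbols from the running process (Linux only).
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value  # (void *)-1 as an unsigned integer


def _raise_errno(what):
    err = ctypes.get_errno()
    raise OSError(err, f"{what}: {os.strerror(err)}")


def write_through_proc_mem(payload=b"patched"):
    """Write `payload` into a read-only page via /proc/self/mem.

    Returns the bytes read back from the page through the ordinary
    (read-only) user-space mapping. Hypothetical helper for illustration.
    """
    size = mmap.PAGESIZE
    addr = libc.mmap(None, size, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        _raise_errno("mmap")
    try:
        # Fill the page while it is still writable, then drop write access.
        ctypes.memmove(addr, b"original", 8)
        if libc.mprotect(addr, size, mmap.PROT_READ) != 0:
            _raise_errno("mprotect")
        # A direct store to `addr` would now fault (SIGSEGV).
        # Writing at the same offset in /proc/self/mem asks the kernel
        # to perform the store on our behalf instead.
        fd = os.open("/proc/self/mem", os.O_RDWR)
        try:
            os.pwrite(fd, payload, addr)
        finally:
            os.close(fd)
        return ctypes.string_at(addr, len(payload))
    finally:
        libc.munmap(addr, size)
```

On a kernel that permits forced writes, `write_through_proc_mem(b"patched")` returns the payload even though the page was mapped read-only at the time of the write. The file offset passed to `os.pwrite` is simply the virtual address of the target.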

  • The article demonstrates this behavior with a C++ example, successfully writing to a read-only memory page and patching a libc function (getchar) to trigger a SIGTRAP upon execution.
  • It investigates hardware memory controls on x86-64, specifically the CR0.WP (Write Protect) and CR4.SMAP (Supervisor Mode Access Prevention) bits, which govern the kernel's ability to access memory.
  • When CR0.WP is set (as it generally is), even supervisor-level code cannot write to read-only pages, yet the kernel's implementation of /proc/*/mem sidesteps this hardware constraint rather than disabling it.
  • The core mechanism involves mem_rw() calling access_remote_vm(), which performs three key steps:
    • get_user_pages_remote(): Translates the user-space virtual address to its physical frame, using the FOLL_FORCE flag to bypass write permission checks during page table walks.
    • kmap(): Maps the identified physical frame into the kernel's own virtual address space, crucially with writable permissions.
    • copy_to_user_page(): Executes the write as a simple memcpy to this newly mapped, writable kernel address.
  • This process reveals that the kernel doesn't adhere to the user-space memory permissions because it's under no obligation to access memory via the user-provided pointer. Instead, it leverages its complete control over the virtual memory subsystem to remap the physical frame into its own writable space.
  • Ultimately, the article concludes that memory permissions are tied to the virtual address used for access, not to the physical frame. Hardware constraints like CR0.WP are merely "superficial roadblocks": because the kernel controls the virtual memory mappings, it can sidestep them by reaching the same frame through a mapping of its own.
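The closing point, that permissions belong to the mapping rather than to the underlying frame, has a rough user-space analogue. The sketch below is an illustration under stated assumptions, not the kernel's mechanism: it maps one temporary file twice with `MAP_SHARED`, once read-only and once writable, so both views are backed by the same pages. The helper name `write_via_alias` is hypothetical. Note that Python itself rejects the write to the read-only view with a `TypeError`, whereas a native program would take a page fault.

```python
import mmap
import tempfile


def write_via_alias(payload=b"hello"):
    """Map the same backing pages twice and write through the writable alias.

    Returns (refused, seen): whether the read-only view rejected a write,
    and the bytes visible through the read-only view afterwards.
    """
    size = mmap.PAGESIZE
    with tempfile.TemporaryFile() as backing:
        backing.truncate(size)  # give the file one page of zeroes
        ro = mmap.mmap(backing.fileno(), size,
                       flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        rw = mmap.mmap(backing.fileno(), size,
                       flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            try:
                ro[:len(payload)] = payload  # rejected: view is read-only
                refused = False
            except TypeError:
                refused = True
            rw[:len(payload)] = payload      # same pages, writable view
            return refused, ro[:len(payload)]
        finally:
            ro.close()
            rw.close()
```

The read-only view observes the new bytes even though no write ever went through it, which mirrors the article's description of the kernel writing through its own writable mapping of the frame.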