Linux Internals: How /proc/self/mem writes to unwritable memory (2021)
This deep dive explains how writes through Linux's /proc/self/mem can modify memory that is mapped as unwritable, a counter-intuitive behavior relied on by tools like Julia's JIT compiler. It walks through how the kernel sidesteps hardware-enforced memory protections, which appeals directly to HN's appetite for low-level system intricacies. The article's key point is that memory permissions belong to virtual mappings, which the kernel controls and can set up differently for its own accesses.
Score: 12 | Comments: 3 | Highest Rank: #8 | On Front Page: 11h | First Seen: Mar 9, 12:00 AM | Last Seen: Mar 9, 10:00 AM
The Lowdown
The /proc/*/mem pseudofile in Linux has peculiar "punch through" semantics: writes through it succeed even when the target virtual memory is marked unwritable. This is not a bug but an intentional feature, used by projects such as the Julia JIT compiler and the rr debugger.
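To make the "punch through" behavior concrete, here is a minimal sketch of a write to a read-only page through /proc/self/mem. It is not the article's code: the helper name `write_through_proc_mem` is hypothetical, and the sketch assumes a Linux system where the process is allowed to open its own /proc/self/mem for writing (some sandboxes forbid this).

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not from the article): writes `msg` into a page that
// is mapped PROT_READ only, going through /proc/self/mem instead of a normal
// store. Returns what the page contains afterwards, or "" on failure.
std::string write_through_proc_mem(const std::string& msg) {
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    if (msg.size() >= static_cast<size_t>(page_size)) return "";

    // A private anonymous page with read permission only. A direct store
    // such as `static_cast<char*>(page)[0] = 'x'` would raise SIGSEGV.
    void* page = mmap(nullptr, page_size, PROT_READ,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return "";

    std::string result;
    const int fd = open("/proc/self/mem", O_RDWR);
    if (fd >= 0) {
        // The file offset in /proc/self/mem is the virtual address to access.
        const off_t addr =
            static_cast<off_t>(reinterpret_cast<std::uintptr_t>(page));
        const ssize_t n = pwrite(fd, msg.data(), msg.size(), addr);
        if (n == static_cast<ssize_t>(msg.size())) {
            // Reading through the ordinary pointer is permitted (PROT_READ).
            result.assign(static_cast<const char*>(page), msg.size());
        }
        close(fd);
    }
    munmap(page, page_size);
    return result;
}
```

Calling `write_through_proc_mem("hello")` should hand back the same string if the kernel accepted the write, even though the page was never writable from user space. An empty string means one of the system calls failed, for example because the environment blocks access to /proc/self/mem.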
- The article demonstrates this behavior with a C++ example, successfully writing to a read-only memory page and patching a `libc` function (`getchar`) to trigger a `SIGTRAP` upon execution.
- It investigates hardware memory controls on x86-64, specifically the `CR0.WP` (Write Protect) and `CR4.SMAP` (Supervisor Mode Access Prevention) bits, which govern the kernel's ability to access memory.
- While `CR0.WP` inhibits supervisor-level procedures from writing to read-only pages when set (which it generally is), the kernel's implementation of `/proc/*/mem` effectively sidesteps these hardware constraints.
- The core mechanism involves `mem_rw()` calling `access_remote_vm()`, which performs three key steps:
  - `get_user_pages_remote()`: Translates the user-space virtual address to its physical frame, using the `FOLL_FORCE` flag to bypass write permission checks during page table walks.
  - `kmap()`: Maps the identified physical frame into the kernel's own virtual address space, crucially with writable permissions.
  - `copy_to_user_page()`: Executes the write as a simple `memcpy` to this newly mapped, writable kernel address.
- This process reveals that the kernel doesn't adhere to the user-space memory permissions because it's under no obligation to access memory via the user-provided pointer. Instead, it leverages its complete control over the virtual memory subsystem to remap the physical frame into its own writable space.
- Ultimately, the article concludes that memory permissions are tied to the virtual address used for access, not the physical frame. Hardware constraints like `CR0.WP` are merely "superficial roadblocks" that the kernel, being in charge, can elegantly work around or entirely sidestep by controlling virtual memory mappings.
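The idea that permissions belong to the virtual address rather than the physical frame can be illustrated from user space. The sketch below is only an analogy for the kernel's remapping step, not kernel code and not taken from the article: it maps the same memory twice, once read-only and once writable, and writes through the writable alias. The helper name `write_via_alias` and the memfd name `alias_demo` are hypothetical, and `memfd_create` is assumed to be available (Linux with a reasonably recent glibc).

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper (not from the article): maps one page of memory at two
// virtual addresses, read-only and writable, writes via the writable alias,
// and returns what the read-only view then contains ("" on failure).
std::string write_via_alias(const std::string& msg) {
    const long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    if (msg.size() >= static_cast<size_t>(page_size)) return "";

    // An anonymous in-memory file provides memory that can be mapped twice.
    const int fd = memfd_create("alias_demo", 0);
    if (fd < 0) return "";
    if (ftruncate(fd, page_size) != 0) {
        close(fd);
        return "";
    }

    // Two shared mappings of the same file offset: same backing memory,
    // different virtual addresses, different permissions.
    void* ro = mmap(nullptr, page_size, PROT_READ, MAP_SHARED, fd, 0);
    void* rw = mmap(nullptr, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
                    fd, 0);
    close(fd);  // existing mappings keep the memory alive

    std::string result;
    if (ro != MAP_FAILED && rw != MAP_FAILED) {
        // The store goes through the writable address...
        std::memcpy(rw, msg.data(), msg.size());
        // ...and is visible through the read-only address.
        result.assign(static_cast<const char*>(ro), msg.size());
    }
    if (ro != MAP_FAILED) munmap(ro, page_size);
    if (rw != MAP_FAILED) munmap(rw, page_size);
    return result;
}
```

The read-only mapping never becomes writable here. The data changes because a second mapping with different permissions points at the same memory, which is loosely what the kernel achieves by mapping the frame into its own address space before the `memcpy`.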