Memory Safe Context Switching
This technical deep dive explores how Fil-C, a memory-safe C variant, tackles the notoriously tricky setjmp/longjmp and ucontext APIs. It details the subtle ways these C features can lead to memory corruption, showcasing Fil-C's innovative approaches to ensure safety, even integrating with its garbage collector. The article resonates with developers grappling with low-level C intricacies and the quest for safer systems programming.
The Lowdown
The article delves into how Fil-C achieves memory safety for C's context-switching primitives: setjmp/longjmp and the ucontext APIs. These functions, while powerful for exception handling and coroutines, are highly prone to misuse in traditional C (dubbed "Yolo-C"), often leading to stack corruption and hard-to-debug crashes. Fil-C aims to eliminate these vulnerabilities.
Making setjmp/longjmp Memory Safe
- The "Returns Twice" Enigma: The core issue with `setjmp` is its ability to return twice, complicating compiler optimizations regarding variable storage (registers vs. stack spill slots). Compilers typically apply special handling for `setjmp` to prevent incorrect reuse of memory. However, obfuscating `setjmp` calls can bypass these safeguards, leading to undefined behavior or crashes.
- Fil-C's Safeguards: Fil-C implements several mechanisms:
  - `jmp_buf` becomes an opaque pointer to a runtime-managed `zjmp_buf` object, preventing direct manipulation.
  - Compiler-level recognition of `setjmp` is enforced to ensure optimizations are correctly inhibited.
  - `zjmp_buf` objects are associated with stack frames, allowing validation that `longjmp` targets a valid, active frame. This prevents jumps to deallocated or overwritten stacks.
  - The runtime saves and restores internal state, including Garbage Collector (GC) roots, ensuring GC correctness across context switches. Acknowledged as a complex area, potential bugs here would manifest as incorrect GC root restoration.
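The "returns twice" behavior can be seen in a minimal sketch in ordinary ISO C. This is not Fil-C runtime code, and the names `env`, `jump_back`, and `returns_twice_demo` are illustrative assumptions rather than anything from the article.

```c
#include <setjmp.h>

static jmp_buf env;

/* Transfers control back to the setjmp call site; never returns normally. */
static void jump_back(void) {
    longjmp(env, 1);
}

/* Counts how many times control comes out of the single setjmp call. */
int returns_twice_demo(void) {
    /* volatile: a non-volatile local modified between setjmp and longjmp
       has an indeterminate value after the jump, because the compiler may
       have kept it in a register that longjmp restores to its old state. */
    volatile int passes = 0;

    if (setjmp(env) == 0) {
        passes++;      /* first return: direct call, setjmp yields 0 */
        jump_back();   /* frame of returns_twice_demo is still live */
    } else {
        passes++;      /* second return: reached via longjmp */
    }
    return passes;
}
```

Calling `returns_twice_demo()` yields 2, since the one `setjmp` call returns once directly and once via `longjmp`. The jump is well defined here only because the frame that called `setjmp` is still active. Jumping after that frame has returned is the case that, per the article, Fil-C rejects by validating the target frame.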
Making ucontext Memory Safe
- The `ucontext` APIs: `getcontext`, `setcontext`, `makecontext`, and `swapcontext` are used for more structured context switching, particularly for implementing fibers and coroutines. They allow creation of new execution contexts with their own stacks.
- Fil-C's Safety Laws for `ucontext`:
  - `ucontext_t` objects contain an opaque `zfiber_context` managed by the runtime, disallowing direct user access to internal state.
  - Fil-C ignores the user-provided stack pointer (`ss_sp`), allocating its own internal stack for `zfiber_context` based on `ss_size`, with added padding for stack overflow detection.
  - A strict state machine for `zfiber_context` objects prevents illegal transitions (e.g., switching to an uninitialized context or attempting `longjmp`-like behavior).
  - Thread affinity ensures a `zfiber_context` cannot be used across different threads, preventing cross-thread stack corruption.
- GC Integration Challenges: Integrating `ucontext` with Fil-C's on-the-fly garbage collector posed challenges, particularly concerning "grey fibers": fibers whose stacks might change state (runnable to running and back) during a GC marking phase. Fil-C's solution involves tracking these grey fibers and re-scanning their stacks as needed to prevent objects from being missed by the GC. This is also identified as a highly intricate area where bugs could lurk.
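For readers unfamiliar with the `ucontext` calls listed above, here is a minimal sketch of a single fiber in conventional C on a POSIX system with glibc-style `ucontext` support. The names `main_ctx`, `fiber_ctx`, `steps`, `fiber_body`, and `run_fiber_demo` are illustrative assumptions, and the code shows the standard API usage, not Fil-C internals.

```c
#include <stddef.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;
static int steps[8];   /* records the order in which code ran */
static int nsteps;

/* Body of the fiber: runs on its own stack, yields once, then finishes. */
static void fiber_body(void) {
    steps[nsteps++] = 2;
    swapcontext(&fiber_ctx, &main_ctx);   /* yield back to the caller */
    steps[nsteps++] = 4;
    /* Returning from here resumes the context named by uc_link. */
}

/* Creates one fiber, resumes it twice, and returns the number of steps. */
int run_fiber_demo(void) {
    static char stack[64 * 1024];         /* caller-provided fiber stack */

    nsteps = 0;
    if (getcontext(&fiber_ctx) != 0)
        return -1;
    fiber_ctx.uc_stack.ss_sp = stack;     /* the ss_sp field */
    fiber_ctx.uc_stack.ss_size = sizeof stack;
    fiber_ctx.uc_link = &main_ctx;        /* where to go when the fiber ends */
    makecontext(&fiber_ctx, fiber_body, 0);

    steps[nsteps++] = 1;
    swapcontext(&main_ctx, &fiber_ctx);   /* run fiber until it yields */
    steps[nsteps++] = 3;
    swapcontext(&main_ctx, &fiber_ctx);   /* resume fiber until it returns */
    steps[nsteps++] = 5;
    return nsteps;
}
```

After `run_fiber_demo()` returns, `steps` holds 1 through 5 in order, showing control alternating between the caller and the fiber. In conventional C the fiber really does execute on the `stack` buffer. According to the article, Fil-C would ignore that `ss_sp` pointer and use only `ss_size` to size a runtime-allocated stack.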
Conclusion: Fil-C successfully provides memory-safe implementations of longjmp/setjmp and ucontext, supporting even these "depraved features" of C. While setjmp/longjmp support is robust, ucontext support is newer and still undergoing testing.
The Gossip
Devious Depths of `setjmp`/`longjmp`
Commenters largely agreed with the article's assessment of `setjmp`/`longjmp` as extremely complex and prone to subtle errors, especially concerning compiler optimizations and stack management. While the article emphasized memory safety, some participants highlighted broader risks like resource leaks and CPU-specific state issues that these primitives introduce. The author clarified that Fil-C prioritizes direct memory corruption prevention, acknowledging other systemic risks. There was also a minor point of clarification regarding "ancestor" vs. "descendant" stack frames in `longjmp` validation.
`ucontext` Under Scrutiny: Performance and Potential
The discussion extended to the performance implications of `ucontext`, with some arguing that its signal mask switching makes it significantly heavier than custom, assembly-optimized fiber implementations (like those in Boost). The author acknowledged this overhead, noting it stems from relying on glibc's `ucontext` implementation, and indicated that future work might explore lighter, sigmask-free alternatives, prompting further conversation on custom signal handling strategies for cooperative multitasking.
Pointers' Predicament: Stack Portability Perils
One thread examined why C's raw pointers complicate stack relocation, a technique used in languages with continuations or advanced GC, compared to environments where pointers are relative or managed. Commenters explored how other languages (like Go) manage pointers to enable stack portability and even reminisced about older architectural features like 80286 segmentation that could have simplified relocation, contrasting these with the challenges posed by C's direct memory access.
AI's Footprint in Fil-C's Development
A brief but relevant sidebar emerged about the role of AI/LLMs in Fil-C's development. The author clarified that while various LLMs (Claude, Kimi, GLM) have been used for ancillary, easily verifiable tasks like writing tests or simple code snippets, the core and complex `longjmp`/`ucontext` implementations discussed in the article were developed without AI assistance.