Memory Safe Context Switching

This technical deep dive explores how Fil-C, a memory-safe C variant, handles the notoriously tricky setjmp/longjmp and ucontext APIs. It details the subtle ways these C features can corrupt memory and describes the mechanisms Fil-C uses to keep them safe, including how they integrate with its garbage collector. The article will interest developers who work with low-level C and want safer systems programming.

Score: 114
Comments: 23
Highest Rank: #4
Time on Front Page: 17h
First Seen: Jun 30, 1:00 AM
Last Seen: Jun 30, 5:00 PM
Rank Over Time: [chart omitted]

The Lowdown

The article delves into how Fil-C achieves memory safety for C's context-switching primitives: setjmp/longjmp and the ucontext APIs. These functions, while powerful for exception handling and coroutines, are highly prone to misuse in traditional C (dubbed "Yolo-C"), often leading to stack corruption and hard-to-debug crashes. Fil-C aims to eliminate these vulnerabilities.

Making setjmp/longjmp Memory Safe

  • The "Returns Twice" Enigma: setjmp can return twice: once when it is called directly, and again each time longjmp jumps back to it. This constrains how the compiler may place variables in registers and reuse stack spill slots. Compilers therefore give calls to setjmp special handling so that memory is not reused incorrectly. A call that is hidden from the compiler (for example, one made through a function pointer) bypasses these safeguards and can lead to undefined behavior or crashes.
  • Fil-C's Safeguards: Fil-C implements several mechanisms:
    • jmp_buf becomes an opaque pointer to a runtime-managed zjmp_buf object, preventing direct manipulation.
    • Compiler-level recognition of setjmp is enforced to ensure optimizations are correctly inhibited.
    • zjmp_buf objects are associated with stack frames, allowing validation that longjmp targets a valid, active frame. This prevents jumps to deallocated or overwritten stacks.
    • The runtime saves and restores internal state, including Garbage Collector (GC) roots, so that the GC stays correct across context switches. The article acknowledges this as a complex area; any bugs here would show up as incorrectly restored GC roots.
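
The "returns twice" behavior can be seen in ordinary C. The following is a minimal sketch and not code from the article; the names `env`, `fail`, and `count_attempts` are made up for illustration. It shows why a local that changes between setjmp and longjmp is declared `volatile`: without it, the compiler may keep the value in a register and the second return could observe a stale copy.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;  /* saved context; in ordinary C this is a plain array type */

/* Jumps back to the setjmp call site and never returns to its caller. */
static void fail(int code) {
    longjmp(env, code);
}

/* Retries until `limit` attempts have been made, then returns the count. */
int count_attempts(int limit) {
    assert(limit > 0);

    /* volatile: modified between setjmp and longjmp, so it must live in
       memory for its value to be well-defined after the second return. */
    volatile int attempts = 0;

    if (setjmp(env) != 0) {
        /* We arrive here on every return caused by longjmp. */
        if (attempts >= limit)
            return attempts;
    }

    attempts++;
    fail(1);     /* control re-emerges from the setjmp call above */
    return -1;   /* not reached */
}
```

Calling `count_attempts(3)` passes through the setjmp call four times (one direct return and three returns via longjmp) and yields 3. The jump is legal here because the frame that called setjmp is still active when longjmp runs, which is the property Fil-C is described as checking at run time.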

Making ucontext Memory Safe

  • The ucontext APIs: getcontext, setcontext, makecontext, and swapcontext provide more structured context switching and are typically used to implement fibers and coroutines. They let a program create new execution contexts, each with its own stack.
  • Fil-C's Safety Laws for ucontext:
    • ucontext_t objects contain an opaque zfiber_context managed by the runtime, disallowing direct user access to internal state.
    • Fil-C ignores the user-provided stack pointer (ss_sp), allocating its own internal stack for zfiber_context based on ss_size, with added padding for stack overflow detection.
    • A strict state machine for zfiber_context objects prevents illegal transitions (e.g., switching to an uninitialized context or attempting longjmp-like behavior).
    • Thread affinity ensures a zfiber_context cannot be used across different threads, preventing cross-thread stack corruption.
  • GC Integration Challenges: Integrating ucontext with Fil-C's on-the-fly garbage collector was difficult, particularly because of "grey fibers": fibers whose stacks might change state (runnable to running and back) during a GC marking phase. Fil-C tracks these grey fibers and re-scans their stacks as needed so that the GC does not miss any objects. The article identifies this, too, as a highly intricate area where bugs could lurk.
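
For readers who have not used these calls, here is a minimal sketch of a fiber built on the standard ucontext API. It is not taken from the article and assumes Linux with glibc; `run_fiber_demo`, `fiber_body`, and the 64 KiB stack size are illustrative choices. Note that this version supplies its own stack through `ss_sp`, which is exactly the field the summary says Fil-C ignores in favor of a runtime-allocated stack.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;
static char fiber_stack[64 * 1024];  /* caller-provided stack (plain C only) */
static int order;                    /* records the sequence of steps as digits */

static void fiber_body(void) {
    order = order * 10 + 2;
    swapcontext(&fiber_ctx, &main_ctx);   /* yield back to the main context */
    order = order * 10 + 4;
    /* Returning here resumes fiber_ctx.uc_link, i.e. main_ctx. */
}

/* Runs one fiber that yields once; returns the recorded step order. */
int run_fiber_demo(void) {
    order = 0;

    int rc = getcontext(&fiber_ctx);      /* initialize before makecontext */
    assert(rc == 0);
    (void)rc;

    fiber_ctx.uc_stack.ss_sp = fiber_stack;
    fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
    fiber_ctx.uc_link = &main_ctx;        /* where to go when fiber_body returns */
    makecontext(&fiber_ctx, fiber_body, 0);

    order = order * 10 + 1;
    swapcontext(&main_ctx, &fiber_ctx);   /* start the fiber */
    order = order * 10 + 3;
    swapcontext(&main_ctx, &fiber_ctx);   /* resume it after its yield */
    return order;
}
```

`run_fiber_demo()` returns 1234: control alternates between the main context and the fiber, and the fiber's return hands control back through `uc_link`. Every field set by hand above is something a program could also set incorrectly, which is the motivation for keeping the real state in an opaque, runtime-managed object.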

Conclusion: Fil-C provides memory-safe implementations of setjmp/longjmp and ucontext, supporting even these "depraved features" of C. While setjmp/longjmp support is robust, ucontext support is newer and still undergoing testing.

The Gossip

Devious Depths of `setjmp`/`longjmp`

Commenters largely agreed with the article's assessment of `setjmp`/`longjmp` as extremely complex and prone to subtle errors, especially concerning compiler optimizations and stack management. While the article emphasized memory safety, some participants highlighted broader risks like resource leaks and CPU-specific state issues that these primitives introduce. The author clarified that Fil-C prioritizes direct memory corruption prevention, acknowledging other systemic risks. There was also a minor point of clarification regarding "ancestor" vs. "descendant" stack frames in `longjmp` validation.
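
The resource-leak concern is easy to reproduce and is independent of memory safety. The sketch below is illustrative only; `tracked_malloc`, `tracked_free`, `parse`, and `leaked_after_parse` are hypothetical names, and the allocation counter exists just to make the leak observable.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf on_error;
static int live_allocs;   /* number of blocks allocated but not yet freed */

static void *tracked_malloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL)
        live_allocs++;
    return p;
}

static void tracked_free(void *p) {
    if (p != NULL) {
        free(p);
        live_allocs--;
    }
}

/* Allocates a scratch buffer; on failure it jumps out without freeing it. */
static void parse(int should_fail) {
    char *buf = tracked_malloc(64);
    if (buf == NULL)
        return;
    if (should_fail)
        longjmp(on_error, 1);   /* skips the cleanup below */
    tracked_free(buf);
}

/* Returns how many blocks are still live after one call to parse. */
int leaked_after_parse(int should_fail) {
    live_allocs = 0;
    if (setjmp(on_error) == 0)
        parse(should_fail);
    return live_allocs;
}
```

`leaked_after_parse(0)` reports no live blocks, while `leaked_after_parse(1)` reports one: the jump itself is valid and touches no freed memory, yet the buffer is never released. This is the kind of systemic risk commenters raised and that, per the summary, falls outside what Fil-C sets out to prevent.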

`ucontext` Under Scrutiny: Performance and Potential

The discussion extended to the performance implications of `ucontext`. Some argued that its signal mask switching makes it significantly heavier than custom, assembly-optimized fiber implementations (like those in Boost). The author acknowledged this overhead, noting that it stems from relying on glibc's `ucontext` implementation, and indicated that future work might explore lighter, sigmask-free alternatives. This prompted further conversation on custom signal handling strategies for cooperative multitasking.

Pointers' Predicament: Stack Portability Perils

Another thread took a deep dive into why C's raw pointers complicate stack relocation, a technique used in languages with continuations or advanced GC, compared to environments where pointers are relative or managed. Commenters explored how other languages (like Go) manage pointers to enable stack portability, and even reminisced about older architectural features like 80286 segmentation that could have simplified relocation. They contrasted these with the challenges posed by C's direct memory access.
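
The relocation problem can be shown with a toy model, offered here as a sketch rather than anything from the thread. `struct frame` stands in for a stack frame that holds a pointer into itself, and both function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a stack frame containing a pointer to one of its own slots. */
struct frame {
    int value;
    int *self;
};

/* Returns 1 if a byte-wise copy leaves the pointer aimed at the OLD frame. */
int pointer_is_stale_after_copy(void) {
    struct frame old_stack, new_stack;
    old_stack.value = 7;
    old_stack.self = &old_stack.value;

    memcpy(&new_stack, &old_stack, sizeof old_stack);

    return new_stack.self == &old_stack.value &&
           new_stack.self != &new_stack.value;
}

/* Returns 1 if rebasing the pointer by its offset makes the copy consistent. */
int pointer_is_valid_after_fixup(void) {
    struct frame old_stack, new_stack;
    old_stack.value = 7;
    old_stack.self = &old_stack.value;

    memcpy(&new_stack, &old_stack, sizeof old_stack);

    /* The fix-up a relocating runtime would need to apply to every pointer. */
    ptrdiff_t offset = (char *)old_stack.self - (char *)&old_stack;
    new_stack.self = (int *)((char *)&new_stack + offset);

    assert(*new_stack.self == 7);
    return new_stack.self == &new_stack.value;
}
```

Both functions return 1. The fix-up works here only because the code knows that `self` is a pointer into the frame; a runtime that moves real C stacks would need that knowledge for every word on the stack, which plain C does not expose.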

AI's Footprint in Fil-C's Development

A brief but relevant sidebar emerged about the role of AI/LLMs in Fil-C's development. The author clarified that while various LLMs (Claude, Kimi, GLM) have been used for ancillary, easily verifiable tasks like writing tests or simple code snippets, the core and complex `longjmp`/`ucontext` implementations discussed in the article were developed without AI assistance.