Hunting a 34 year old pointer bug in EtherSlip (DOS Networking)

A retro-computing detective story: a developer hunts down a 34-year-old pointer bug in EtherSLIP, an obscure DOS networking driver. The culprit is an x86 assembly routine that clobbers the ES segment register while simulating an ARP response, so a MAC address is copied into a memory region that the compiler's runtime monitors. This deep dive into DOS internals and segmented memory suits Hacker News's taste for intricate low-level debugging and legacy tech.

Score: 5
Comments: 0
Highest Rank: #15
Time on Front Page: 7h
First Seen: Apr 22, 3:00 AM
Last Seen: Apr 22, 9:00 AM
Rank Over Time: 19, 22, 15, 20, 24, 25, 29

The Lowdown

The author recounts debugging a 34-year-old pointer bug found while revisiting an old DOS networking setup built on EtherSLIP, a packet driver that emulates Ethernet over a SLIP (Serial Line Internet Protocol) link. While troubleshooting a slow connection, the author saw an unexpected "NULL assignment detected" error when Telnet exited, which pointed to memory corruption rather than a simple NULL pointer dereference. The error prompted a meticulous investigation into the inner workings of 16-bit DOS networking and x86 assembly.

  • Initial Detection & Compiler Clue: The error, which showed up in EtherSLIP sessions with packet loss and retries, was raised by a runtime check in Open Watcom 1.9. The check monitors a reserved 32-byte region (__nullarea) at the start of the data segment; any write there usually means the program stored data through a NULL pointer.
  • Debugging Strategy Evolution: Initial attempts with if-checks for NULL pointers in the application code proved fruitless. The author then implemented a custom runtime check mirroring the compiler's __nullarea monitoring, allowing real-time detection of the corruption.
  • Pinpointing the Corruption: The custom check identified that the memory corruption occurred specifically after the int86x call to the packet driver's Packet_send_pkt function. The corrupted bytes consistently matched a MAC address, leading the investigation into EtherSLIP's ARP (Address Resolution Protocol) handling.
  • x86 & ARP Context: The story provides a concise overview of 16-bit x86 segmented memory architecture (segment:offset addressing) and the ARP protocol, both crucial for understanding the bug's context within the EtherSLIP driver.
  • The Root Cause - Segment Register Mix-up: The bug was located in EtherSLIP's assembly routine for simulating an ARP response. A sequence of instructions intended to copy a source MAC address inadvertently clobbered the ES (Extra Segment) register, replacing it with the DS (Data Segment) of the calling application.
  • Consequence of the Bug: This incorrect ES value, combined with a calculated DI offset, caused a rep movsb instruction to write the MAC address not into the intended ARP response buffer, but directly into the __nullarea of the application's data segment, triggering the "NULL assignment detected" error.
  • Why it Persisted: The bug remained hidden for decades due to several factors: it was masked if DS and ES happened to be the same (e.g., in small memory models), applications typically don't inspect Ethernet headers for ARP responses, the corruption often landed in non-critical memory, and not all compilers had such specific runtime checks.

This detailed forensic analysis not only fixed a 34-year-old bug but also serves as a potent reminder of the complexities of low-level programming, the perils of segmented memory in x86, and the enduring value of rigorous debugging and compiler diagnostics, even in the realm of antiquated systems. The story champions the adage: "Take every warning seriously, and get to the root cause."
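
The mechanism in the bullets above can be modeled in a few lines. The sketch below is a toy Python simulation of real-mode memory, not EtherSLIP's or Open Watcom's actual code. The segment values, the DI offset, the MAC address, and the guard fill byte are all hypothetical, chosen only to show how the same offset lands in a different place when ES holds the wrong segment, and how a guard-region check in the style of the __nullarea test would notice.

```python
"""Toy model of 16-bit real-mode addressing. Every constant here is an assumption."""

NULL_AREA_SIZE = 32        # guarded region size, as described in the write-up
GUARD_BYTE = 0x01          # hypothetical fill pattern for this sketch
APP_DS = 0x2000            # hypothetical data segment of the calling application
DRIVER_SEG = 0x3000        # hypothetical segment holding the ARP response buffer
DI = 6                     # hypothetical destination offset for the copy
MAC = bytes.fromhex("02005e001234")  # made-up MAC address


def linear(segment: int, offset: int) -> int:
    """Real-mode translation: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def movsb_copy(mem: bytearray, es: int, di: int, data: bytes) -> None:
    """Model of `rep movsb`: store bytes at ES:DI, with DI wrapping at 64 KiB."""
    for i, value in enumerate(data):
        mem[linear(es, (di + i) & 0xFFFF)] = value


def null_area_intact(mem: bytearray, ds: int) -> bool:
    """Guard check: the first 32 bytes of the data segment must be untouched."""
    start = linear(ds, 0)
    return all(b == GUARD_BYTE for b in mem[start:start + NULL_AREA_SIZE])


def run_copy(es: int) -> bool:
    """Copy the MAC to ES:DI, then report whether the app's guard area survived."""
    mem = bytearray(1 << 20)  # 1 MiB real-mode address space
    start = linear(APP_DS, 0)
    mem[start:start + NULL_AREA_SIZE] = bytes([GUARD_BYTE]) * NULL_AREA_SIZE
    movsb_copy(mem, es, DI, MAC)
    return null_area_intact(mem, APP_DS)


if __name__ == "__main__":
    print(run_copy(DRIVER_SEG))  # intended ES: guard area stays intact
    print(run_copy(APP_DS))      # clobbered ES: MAC lands in the guard area
```

Calling a check like null_area_intact before and after each suspect call is the same idea as the author's custom runtime check: it narrows the corruption down to a single call instead of reporting it only at program exit.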