XOR'ing a register with itself is the idiom for zeroing it out. Why not sub?
Raymond Chen dissects the subtle battle between xor eax, eax and sub eax, eax for zeroing registers, explaining why the former became the prevailing idiom. This deep dive into x86 assembly quirks, compiler history, and micro-architectural optimizations is precisely the kind of low-level esoterica that Hacker News users relish. It showcases how minute technical differences and historical happenstance shape fundamental programming practices.
The Lowdown
Raymond Chen, in his "The Old New Thing" blog, explores the enduring question of why xor eax, eax is the widely accepted idiom for zeroing a register on the x86 architecture, rather than the seemingly equally viable sub eax, eax. He builds on Matt Godbolt's explanation that xor eax, eax is more compact than mov eax, 0 (two bytes versus five), but digs deeper into why sub didn't win out.
- Both xor eax, eax and sub eax, eax achieve the same result (zeroing a register) and encode to the same number of bytes, offering a size advantage over mov eax, 0.
- A key technical distinction lies in their effect on CPU flags: sub eax, eax clears the Auxiliary Flag (AF), while xor eax, eax leaves it undefined.
- Chen speculates that xor's dominance was due to "swarming": a slight initial lead, perhaps because it was perceived as more "clever," led early compilers to adopt it, which in turn influenced other developers.
- Modern Intel CPUs specifically detect both xor r, r and sub r, r instructions, optimizing them to break dependency chains and effectively execute in zero cycles.
- Despite this, xor cemented its win due to fears that other CPU manufacturers might only have optimized xor and not sub.
- Itanium processors are an exception; they have a dedicated zero register, and xor doesn't correctly handle their NaT (Not a Thing) bits.
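
The size and flag points above can be checked with a small sketch. It is written in Python rather than assembly so it runs anywhere; the helper names (`encode_xor`, `encode_sub`, `encode_mov_imm32`) and the `FLAGS_AFTER` table are hypothetical, made up for illustration, and cover only the 32-bit register-to-register and register-immediate forms discussed here. The opcodes and flag behavior follow the standard x86 encodings, not anything specific to Chen's article.

```python
# Register numbers as they appear in the ModRM byte (32-bit registers).
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
       "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm_reg_reg(dst: str, src: str) -> int:
    """ModRM byte for a register-to-register form: mod=11, reg=src, rm=dst."""
    return 0xC0 | (REG[src] << 3) | REG[dst]


def encode_xor(dst: str, src: str) -> bytes:
    """xor r/m32, r32: opcode 0x31 followed by a ModRM byte."""
    return bytes([0x31, modrm_reg_reg(dst, src)])


def encode_sub(dst: str, src: str) -> bytes:
    """sub r/m32, r32: opcode 0x29 followed by a ModRM byte."""
    return bytes([0x29, modrm_reg_reg(dst, src)])


def encode_mov_imm32(dst: str, imm: int) -> bytes:
    """mov r32, imm32: opcode 0xB8 + register number, then a little-endian imm32."""
    return bytes([0xB8 + REG[dst]]) + (imm & 0xFFFFFFFF).to_bytes(4, "little")


# Flag state after zeroing a register with each idiom.
# None marks a flag the architecture leaves undefined.
FLAGS_AFTER = {
    "xor": {"CF": 0, "OF": 0, "ZF": 1, "SF": 0, "PF": 1, "AF": None},
    "sub": {"CF": 0, "OF": 0, "ZF": 1, "SF": 0, "PF": 1, "AF": 0},
}

if __name__ == "__main__":
    for text, code in [
        ("xor eax, eax", encode_xor("eax", "eax")),
        ("sub eax, eax", encode_sub("eax", "eax")),
        ("mov eax, 0", encode_mov_imm32("eax", 0)),
    ]:
        print(f"{text:<13} {code.hex(' ')}  ({len(code)} bytes)")
```

Both two-byte idioms differ only in the opcode byte, which is why size alone cannot explain why xor won; the AF entry is the only difference in the flag table.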
Ultimately, the article illustrates how small technical nuances, combined with historical momentum and developer perception, can enshrine specific idioms in assembly programming, even when technically similar or marginally superior alternatives exist. It's a fascinating look at the interplay between hardware design, compiler choices, and programmer culture.