XML Is a Cheap DSL
The IRS's new open-source Tax Withholding Estimator unexpectedly leverages XML as a domain-specific language (DSL) to model the complex U.S. tax code. The author presents a detailed, contrarian argument for XML's declarative strengths, inherent auditability, and robust, universal tooling, asserting it's a "cheap DSL" superior to JSON or imperative code for such intricate, rule-based systems. This technical deep dive challenges modern development norms, highlighting XML's often-overlooked utility in critical public sector software.
The Lowdown
The author, an engineering lead for the IRS's new open-source Tax Withholding Estimator (TWE), delves into what might seem like the driest aspect of its development: its use of XML. Challenging widespread perceptions of XML as clunky or obsolete, the piece argues that XML holds a vital place in modern software development, particularly as a cross-platform declarative specification for complex domains like the US Tax Code.
- TWE's core logic is built upon a "Fact Graph" engine, powered by two XML configurations, one of which is the "Fact Dictionary"—a representation of the US Tax Code itself.
- Through concrete examples, the author demonstrates how XML declaratively defines tax calculations (e.g., totalOwed, totalRefundableCredits) and inputs (totalEstimatedTaxesPaid), acknowledging its verbosity but emphasizing its clarity and straightforward nesting.
- A direct comparison is drawn between XML's declarative nature and imperative JavaScript code, illustrating how the latter leads to issues with execution order, conflates implementation details with business logic, and lacks inherent introspection.
- The declarative, graph-based representation of the tax code in XML offers crucial benefits like automatic auditability and introspection, allowing the program to explain how a calculation was reached—a feature vital for a system as complex as tax estimation.
- The article contends that XML is fundamentally better suited than JSON for building DSLs: its named, nestable elements naturally accommodate arbitrary nested expressions and custom data types, whereas JSON's untagged objects and arrays need an extra convention (such as a type key) to say what each node means.
- Additional, often-forgotten benefits of XML, such as native support for comments and sane whitespace handling, are noted as enhancing human readability and editability.
- While acknowledging that alternatives like s-expressions, Prolog, and KDL offer potentially "nicer" syntax, the author stresses that XML's greatest advantage is its universal tooling ecosystem, providing a parser and a vast array of utilities (like XPath) "for free."
- This is powerfully demonstrated with examples of simple bash one-liners using xpath and fzf to quickly search and debug the Fact Dictionary, highlighting how XML's tooling empowers developers to build efficient custom workflows with minimal effort.
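
To make the declarative idea concrete, here is a minimal Python sketch of a fact graph driven by an XML dictionary. It is an illustration under stated assumptions, not the TWE implementation. The element names (FactDictionary, Fact, Writable, Derived, Add, Subtract, Dependency), the extra facts totalTax and totalPayments, the formulas, and the dollar amounts are all hypothetical. Only totalOwed, totalRefundableCredits, and totalEstimatedTaxesPaid are names taken from the article. The sketch resolves a fact by walking its dependencies, records how each value was reached, and uses the standard library's XML path support to ask which facts depend on another.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini "fact dictionary"; element names and formulas are
# illustrative assumptions, not the real TWE schema or actual tax rules.
FACTS_XML = """
<FactDictionary>
  <!-- XML allows comments right next to the rules they describe -->
  <Fact path="/totalTax"><Writable/></Fact>
  <Fact path="/totalEstimatedTaxesPaid"><Writable/></Fact>
  <Fact path="/totalRefundableCredits"><Writable/></Fact>
  <Fact path="/totalPayments">
    <Derived>
      <Add>
        <Dependency path="/totalEstimatedTaxesPaid"/>
        <Dependency path="/totalRefundableCredits"/>
      </Add>
    </Derived>
  </Fact>
  <Fact path="/totalOwed">
    <Derived>
      <Subtract>
        <Dependency path="/totalTax"/>
        <Dependency path="/totalPayments"/>
      </Subtract>
    </Derived>
  </Fact>
</FactDictionary>
"""


def load_facts(xml_text):
    """Index every <Fact> element by its path attribute."""
    root = ET.fromstring(xml_text)
    return {fact.get("path"): fact for fact in root.iter("Fact")}


def evaluate(path, facts, inputs, trace):
    """Resolve a fact on demand; order comes from the graph, not the file."""
    fact = facts[path]
    if fact.find("Writable") is not None:
        value = inputs[path]
        trace.append(f"{path} = {value} (input)")
        return value
    expr = fact.find("Derived")[0]  # the single operator element
    values = [evaluate(dep.get("path"), facts, inputs, trace) for dep in expr]
    if expr.tag == "Add":
        value = sum(values)
    elif expr.tag == "Subtract":
        value = values[0] - sum(values[1:])
    else:
        raise ValueError(f"unsupported operator: {expr.tag}")
    operands = ", ".join(dep.get("path") for dep in expr)
    trace.append(f"{path} = {value} ({expr.tag} of {operands})")
    return value


def dependents_of(path, facts):
    """List facts whose definition references the given path."""
    query = f".//Dependency[@path='{path}']"
    return sorted(p for p, fact in facts.items() if fact.find(query) is not None)


facts = load_facts(FACTS_XML)
inputs = {  # hypothetical user-entered amounts
    "/totalTax": 5000,
    "/totalEstimatedTaxesPaid": 1200,
    "/totalRefundableCredits": 300,
}
trace = []
total_owed = evaluate("/totalOwed", facts, inputs, trace)
print(total_owed)
print("\n".join(trace))
print(dependents_of("/totalPayments", facts))
```

With these made-up inputs the script prints 3500, followed by a five-line trace that ends at /totalOwed, and then the list of facts that reference /totalPayments. The trace falls out of the data structure rather than being bolted on. Because each rule is a node with named dependencies, the evaluator can report its steps without the rule author writing any logging, which is the kind of auditability the article attributes to the declarative approach.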
In conclusion, the story asserts that while JSON remains suitable for many data exchange scenarios, XML shines as an exceptionally "cheap" and effective choice when building a domain-specific language. Its inherent declarative qualities, combined with a mature and universal tooling ecosystem, provide significant cost-efficiency and power, making it a pragmatic choice for critical, complex systems like the IRS Tax Withholding Estimator.