Noroboto: Lying Fonts and Mitigation in Rust
This post reveals "lying fonts" – malicious TrueType fonts that deceive AI and LLMs by displaying one thing to humans while presenting entirely different Unicode characters to automated systems. Crucially, in legal documents, these can subtly alter meanings (e.g., changing "Maryland" to "Delaware") by exploiting how AI processes embedded fonts. Hacker News found this blend of technical exploit and real-world AI vulnerability, particularly in the legal tech sphere, both fascinating and concerning.
The Lowdown
The article introduces "Noroboto," a novel "lexploit" that leverages specially crafted fonts to mislead AI systems and LLMs about the content of a document. It highlights the vulnerability of complex modern tech stacks, especially in legal tech, where documents pass through numerous open-source and proprietary tools.
- The Exploit Mechanism: Noroboto fonts map Private Use Area (PUA) Unicode code points to custom glyphs that visually match standard characters. Humans see normal text, but copying the text, or feeding it to a naive system that reads the underlying code points, yields gibberish.
- Proof of Concept (PoC): A Python-based PoC (noroboto.py) was created, initially using a simple substitution cipher. Early versions were easily cracked by ChatGPT 5.5, which performed cryptanalysis or read font metadata.
- Advanced Obfuscation: The PoC was improved with polyalphabetic ciphers, 4-to-1 mappings, and perturbed font outlines, making it harder for LLMs to deobfuscate, though frontier models can still resort to OCR.
- Partial Obfuscation & Replacement: More effective attacks involve partially obfuscating key terms (e.g., hiding "successors and assigns" in an NDA) or outright replacing text (e.g., displaying "Maryland" but representing "Delaware" in Unicode). These methods often fool LLMs, which are "lazy" and prefer to trust facially valid Unicode over expensive rendering and OCR.
- Rust Mitigation: The author proposes a mitigation in Rust by rendering a font atlas of standard ASCII characters and performing OCR on it. The character_accuracy function uses Levenshtein distance to compare the OCR result with the expected ASCII string, flagging any discrepancies.
- Testing: The mitigation successfully identifies issues in a noroboto variant while passing a standard Noto font, though the article notes the replacement attack requires at least one OCR failure to be detected.
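The comparison step of the Rust mitigation can be sketched roughly as follows. This is not the article's code: it is a minimal sketch that assumes character_accuracy takes the expected ASCII string and the OCR output and returns a score between 0 and 1. The levenshtein helper, the parameter names, and the choice to normalize by the longer string are all assumptions made for illustration. Rendering the font atlas and running OCR are out of scope here.

```rust
/// Levenshtein edit distance between two strings, counted in Unicode scalar values.
/// Hypothetical helper; the article's own implementation may differ.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // prev[j] is the distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, &ca) in a.iter().enumerate() {
        let mut curr = Vec::with_capacity(b.len() + 1);
        curr.push(i + 1); // turning i + 1 chars of `a` into the empty string
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Assumed signature: 1.0 means the OCR output matches the expected string exactly;
/// lower values mean more edits were needed, normalized by the longer string.
fn character_accuracy(expected: &str, recognized: &str) -> f64 {
    let len = expected.chars().count().max(recognized.chars().count());
    if len == 0 {
        return 1.0; // two empty strings are trivially identical
    }
    1.0 - levenshtein(expected, recognized) as f64 / len as f64
}
```

Under these assumptions, a font would be flagged whenever character_accuracy(expected_ascii, ocr_output) falls below 1.0, or below some tolerance chosen to absorb ordinary OCR noise. A call such as character_accuracy("Maryland", "Delaware") scores well below 1.0, since the strings differ in several positions.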
This "lexploit" underscores a significant vulnerability for AI systems processing documents, especially in high-stakes fields like law. It exposes how reliance on underlying document specifications and a lack of robust verification can lead LLMs to misinterpret critical information.
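One inexpensive check that needs no rendering or OCR is to scan the extracted text for private use code points, since the full-obfuscation variant depends on them. The sketch below is an illustration under stated assumptions, not something taken from the article: the function names is_private_use and private_use_ratio are hypothetical, and any threshold applied to the ratio would be a policy choice.

```rust
/// Returns true if `c` falls in one of Unicode's three private use ranges:
/// the BMP Private Use Area and the two supplementary private use planes.
fn is_private_use(c: char) -> bool {
    matches!(
        c,
        '\u{E000}'..='\u{F8FF}'
            | '\u{F0000}'..='\u{FFFFD}'
            | '\u{100000}'..='\u{10FFFD}'
    )
}

/// Fraction of non-whitespace characters that are private use code points.
/// Returns 0.0 for text with no non-whitespace characters.
fn private_use_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut private = 0usize;
    for c in text.chars().filter(|c| !c.is_whitespace()) {
        total += 1;
        if is_private_use(c) {
            private += 1;
        }
    }
    if total == 0 {
        0.0
    } else {
        private as f64 / total as f64
    }
}
```

A pipeline could refuse to trust extracted text whose ratio is above zero and fall back to rendering plus OCR. This only addresses the obfuscation variants: the replacement attack uses ordinary, facially valid Unicode, so it would pass this check.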
The Gossip
Legality and Lexploit Limitations
Commenters questioned the practical threat model of these "lying fonts" in real-world legal contexts, asking if a court or human reviewer would be easily fooled or if such practices would be sanctionable. The author clarified that while the full obfuscation might be detected (e.g., by inability to Ctrl+F), more subtle 'replacement' attacks are highly effective against AI. They emphasized that the goal isn't necessarily a massive 'gotcha' but to highlight a class of vulnerabilities emerging with "AI native" firms and automated legal pipelines, where the "laziness" of LLMs could lead to significant legal issues.
Mitigation Musings
Discussion touched on the proposed Rust-based mitigation, with one commenter suggesting that a check which verifies individual ASCII letters might be fooled by ligatures. The author did not address the ligature point directly, but noted that the "replacement" attack is considerably harder to mitigate because it is not simply an obfuscation but a deliberate semantic change at the Unicode level, one that often fooled all tested platforms.
AI's Algorithmic Ailments
A core theme revolved around the vulnerability of LLMs and AI agents to these seemingly "lame" exploits. The author explained that while the PoC was "vibe coded in a day or two," it successfully demonstrates that current frontier models, despite their sophistication, are often "blind" to these issues. This is attributed to their reliance on underlying document processing pipelines and a tendency to be "lazy," preferring to trust readily available Unicode strings over the computationally expensive process of rendering and OCRing a document for verification.