Is legal the same as legitimate? AI reimplementation and the erosion of copyleft
A widely used Python library, chardet, was re-implemented with AI and its license switched from LGPL to MIT, sparking a fiery debate in the open-source world. The story critiques prominent figures who defend the move, arguing they confuse legal permissibility with social legitimacy and miss the true spirit of copyleft. Hacker News is buzzing over the ethical implications of AI's disruptive potential for software freedom and the future of intellectual property.
The Lowdown
The story centers on a contentious event: Dan Blanchard, maintainer of the chardet Python library, used by millions, leveraged Anthropic's Claude to re-implement it from scratch. The new version, claiming less than 1.3% code similarity to the original and a 48x speed increase, was re-licensed from LGPL to MIT, igniting a significant debate within the open-source community.
Here are the key points of the story:
- Dan Blanchard, chardet's maintainer, used AI (Claude) to re-implement the LGPL-licensed library, resulting in a significantly faster version and a license change to MIT.
- The original author, Mark Pilgrim, objected, arguing the reimplementation, even with AI, wasn't a true 'clean-room' effort given the maintainer's intimate knowledge of the original codebase.
- Prominent open-source figures, Armin Ronacher (Flask creator) and Salvatore Sanfilippo (Redis creator), defended Blanchard's actions, citing legal permissibility and historical precedents like GNU's reimplementation of UNIX.
- The author critiques Ronacher and Sanfilippo, asserting they conflate legal compliance with social legitimacy, noting that GNU's actions expanded the commons, whereas chardet's re-licensing restricts it.
- Ronacher's argument that permissive licenses foster better 'sharing' is challenged, with the author stating copyleft ensures reciprocal contributions to the commons, preventing privatized gains from communal work.
- The author highlights a perceived hypocrisy in the permissive licensing camp, citing Vercel's outrage when their MIT-licensed Next.js was re-implemented by Cloudflare, despite advocating for such practices with GPL software.
- The piece suggests that AI's ability to easily circumvent copyleft does not negate its necessity but rather calls for its evolution, proposing concepts such as 'training copyleft' or 'specification copyleft'.
Ultimately, the story argues that community norms and values, foundational to copyleft, should not be ignored or deemed irrelevant simply because new technology provides a legal workaround. The central question remains a social one: does benefiting from the open-source commons entail an obligation to contribute back?
The Gossip
Legitimacy Labyrinth
Commenters vigorously debated the article's core premise: whether something legally permissible is automatically socially or ethically legitimate. Many sided with the author, emphasizing the breach of a 'social compact' inherent in copyleft and questioning the 'clean-room' claim given the maintainer's long-standing involvement with the original code. Others pushed back, suggesting the author was imposing a moral framework on what should be a purely legal discussion, arguing that legal analysis shouldn't be conflated with personal ethical judgments. The discussion also touched on the 'intent' behind such reimplementations.
AI's Copyright Cataclysm
A dominant theme was the profound impact of AI, particularly LLMs, on copyright law and the future landscape of open-source licensing. Some commenters predicted that AI would fundamentally erode copyright's efficacy, potentially rendering copyleft obsolete and leading to a future where software either loses value or becomes predominantly closed-source. Others saw AI as a potential tool for FOSS, enabling new reimplementations to expand the commons, while simultaneously expressing concern that the capital-intensive nature of advanced LLMs could centralize power in large corporations. The question of whether LLM output can be considered public domain and the potential for new licensing models or increased reliance on software patents were also discussed.
Copyleft's Conundrum
The discussion delved into the fundamental nature and purpose of copyleft and copyright. Commenters debated whether copyleft truly restricts sharing or if it, in fact, enforces a necessary reciprocity, ensuring that improvements flow back into the commons. The specific nuances of GPL versus AGPL regarding networked services were clarified. Some argued that copyright itself is 'nonsense' and a system that disproportionately benefits large corporations, while others brought in Richard Stallman's perspective on copyleft's reliance on strong copyright to achieve its goals. The difficulty of defining 'derivative work' and 'creative expression' in the age of AI was also a point of contention.