
FSF Threatens Anthropic over Infringed Copyright: Share Your LLMs Freel

The Free Software Foundation, prompted by a copyright settlement with Anthropic over LLM training data, issued a strong statement advocating for 'freedom' over financial compensation. While not a direct threat to sue, its stance ignites debate over the viability of free software principles in the proprietary AI industry. The story captivates HN by pitting core free software ideology against the rapidly evolving legal and business models of large language models.

Score: 38
Comments: 12
Highest Rank: #3
Time on Front Page: 12h
First Seen: Mar 20, 6:00 AM
Last Seen: Mar 20, 8:00 PM

The Lowdown

The Free Software Foundation (FSF) recently received notice of a class-action settlement in the lawsuit Bartz v. Anthropic, which alleged that Anthropic infringed copyright by using works from Library Genesis and Pirate Library Mirror datasets to train its large language models (LLMs).

  • The district court initially ruled that using books for LLM training constituted fair use, but the legality of downloading the works for this purpose remained an open question, leading to the settlement.
  • Among the works found in Anthropic's training data was "Free as in Freedom," a book copyrighted by the FSF and published under the GNU Free Documentation License (GNU FDL).
  • While the GNU FDL permits use for any purpose without payment, it includes copyleft and attribution clauses, which distinguishes it from a public-domain dedication.
  • The FSF, a non-profit with limited resources, stated that if it were involved in such a lawsuit and found its license violated, it would not seek monetary damages. Instead, it would demand "user freedom" as compensation, urging Anthropic and other LLM developers to release complete training inputs, models, configurations, and source code for their LLMs to users.

This event served as a platform for the FSF to reassert its core mission: to champion computing freedom by insisting that AI technologies built upon freely licensed works should themselves be made freely available, challenging the proprietary nature of current LLM development.

The Gossip

Headline Hyperbole

Many commenters immediately pointed out that the Hacker News submission's title, "FSF Threatens Anthropic over Infringed Copyright: Share Your LLMs Freel," was misleading. They clarified that the FSF's blog post was not a direct threat of legal action, but rather a declarative statement of their ideal settlement terms *if* they were to be involved in such a suit, given their ideological commitment to free software and the specific licensing of the affected work.

Licensing Labyrinth & LLM Legality

The discussion delved into the nuances of the FSF's licenses (such as the GNU FDL) and how they apply to LLM training. Commenters debated whether training on works under these licenses, which typically include attribution and copyleft clauses, constitutes a violation, especially given the court's 'fair use' ruling for training itself. There was also skepticism about whether a documentation license on a book (the GNU FDL) could translate into obligations for an LLM's output in the same way the GPL might for software.

Free Software's Financial Folly?

A recurring theme questioned the practicality and realism of the FSF's demands. Critics argued that requiring companies like Anthropic to release their entire models and training data "in freedom" would destroy their business models, making continued development financially impossible. Others countered that it's not the FSF's responsibility to ensure a proprietary company's profitability, especially if that model relies on practices that could be deemed copyright infringement.