Meta Wins AI Training Lawsuit, Judge Rejects Broad Copyright Shield
Meta wins lawsuit, yet judge warns future AI copyright battles loom over data use and market harm.
June 26, 2025

A U.S. federal court has delivered a significant, albeit nuanced, victory to Meta in a closely watched copyright lawsuit, dismissing claims from authors who alleged the company unlawfully used their books to train its Llama family of artificial intelligence models. While the ruling allows Meta to fend off this particular challenge, the presiding judge issued a stark warning that the decision does not provide a legal shield for the broader practice of using copyrighted materials for AI training. U.S. District Judge Vince Chhabria in San Francisco made it clear that the outcome was a result of flawed legal arguments by the plaintiffs and should not be interpreted as a blanket approval of Meta's methods, leaving the door open for future cases to unfold differently.[1][2] The decision highlights the intricate and evolving legal landscape surrounding generative AI and intellectual property.
The class-action lawsuit, led by authors including Sarah Silverman, Richard Kadrey, and Ta-Nehisi Coates, accused Meta of multiple forms of copyright infringement.[3][4] The core of their argument was that Meta had copied their books without authorization, allegedly sourcing them from pirated datasets such as "The Pile" and "Books3," to serve as training data for its Llama large language models.[3] The authors advanced several legal theories: direct copyright infringement through unauthorized copying for training, vicarious copyright infringement, and the claim that the Llama models themselves constituted infringing derivative works of their books.[5][3] They also alleged violations of the Digital Millennium Copyright Act (DMCA), unfair competition, and unjust enrichment, seeking damages and injunctive relief.[5][3]
In his ruling, Judge Chhabria dismissed most of the authors' claims, focusing on the specific legal arguments presented. He characterized the theory that the Llama models are themselves infringing derivative works as "nonsensical," stating that the models are not a "recasting or adaptation" of the plaintiffs' books.[5] To prove vicarious copyright infringement, the judge explained, the plaintiffs needed to show that Llama's outputs were substantially similar to their copyrighted works, which they failed to do.[5][6] Without a plausible allegation of an infringing output that incorporates protected expression from the books, the court found, the claim could not stand.[5][6] Similarly, the DMCA claims failed because the authors did not sufficiently allege that Llama generated and distributed copies of their books with copyright management information removed.[5] The state-law claims were dismissed as preempted by the federal Copyright Act.[5] However, one key claim survived: the allegation of direct copyright infringement based on the initial act of copying the books for training, which Meta had not moved to dismiss.[5]
Despite the victory for Meta, Judge Chhabria's decision was laced with cautionary language that could have significant implications for the AI industry. He explicitly stated, "This ruling does not stand for the proposition that Meta's use of copyrighted materials to train its language models is lawful. It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one."[1][2] The judge expressed sympathy for the argument that generative AI could undermine the market for original creative works.[1] He noted the potential for AI to "dramatically undermine the incentive for human beings to create things the old-fashioned way" by flooding the market with content.[1] He suggested a more potent argument for future plaintiffs might focus on how tech companies are building multi-billion or trillion-dollar tools that enable the creation of a vast stream of competing works, significantly harming the market for the original books used in training.[7][8] This indicates that while the current case was dismissed on technical grounds, a different legal strategy focusing on market harm could prove successful.
The ruling in the Meta case arrived in the same week as a related but distinct decision in a lawsuit against AI company Anthropic.[9] In that case, another San Francisco federal judge, William Alsup, found that training an AI model on copyrighted books could be considered "fair use" because it is "quintessentially transformative."[9][10][11] However, Judge Alsup also ruled that Anthropic must face trial over its use of pirated books, drawing a clear line between the act of training and the legality of how the training data was acquired.[9][12] Together, these two rulings illustrate the nuanced and case-by-case approach courts are taking. They suggest that while the transformative nature of AI training might offer some defense under fair use, the source of the training data and demonstrable evidence of market harm will be critical factors in future litigation. The legal battles are far from over, and these early decisions are setting the stage for a protracted and complex debate over the intersection of copyright law and artificial intelligence.[13]