Publishers sue Meta over Llama training

- Elsevier, Cengage, Hachette, Macmillan, McGraw Hill, and novelist Scott Turow sued Meta and Mark Zuckerberg on May 5 in Manhattan.
- The complaint says Meta copied millions of books and articles through pirate libraries and torrenting, then used them to build Llama.
- The case lands after mixed 2025 AI copyright rulings, so it could shape what “fair use” means for model training.

Books are the object here, and the fight is really about whether AI companies can treat copyrighted libraries as raw material. That question has been hanging over generative AI for years, but the rules are still blurry. On May 5, five big publishers (Elsevier, Cengage, Hachette, Macmillan, and McGraw Hill) plus novelist Scott Turow filed a proposed class action in federal court in Manhattan against Meta and Mark Zuckerberg. They say Meta copied millions of books and journal articles without permission to train Llama. (money.usnews.com)

### Who is suing whom?

The plaintiffs are not a random author coalition. They include some of the biggest names in trade, educational, and academic publishing, plus Turow, who is both a bestselling novelist and a lawyer. The defendants are Meta Platfo(money.usnews.com)lding one of the most widely distributed open-weight AI model families. (money.usnews.com)

### What do they say Meta actually did?

The core allegation is simple: Meta wanted a giant text corpus for Llama and took it instead of licensing it. The complaint says Meta obtained unauthorized copies through torrenting, web scraping, and pirate lib(money.usnews.com)ation from works along the way. That is important because the case is not only about abstract “learning from data”; it is also about how the data was acquired and copied. (news.bloomberglaw.com)

### Why name Zuckerberg personally?

The publishers are trying to make this look deliberate, not accidental. Their filing says Zuckerberg personally authorized or encouraged the conduct behind the copying. That raises the stakes for Meta because it pushes the case beyond a stan(news.bloomberglaw.com)his as a business choice, not an engineering side effect. (apnews.com)

### What are they asking the court for?

Money, first. The suit seeks damages and asks to represent a broader class of copyright owners whose registered books, journal articles, or other published works were allegedly used the same way. But the bigger ask is practical: limits on(apnews.com)more than one company’s legal bill. It would go straight at how frontier models are built and shipped. (money.usnews.com)

### What is Meta’s defense likely to be?

Meta has already signaled the basic argument: fair use. The company said training AI on copyrighted material can qualify as fair use, and that it will fight the case aggressively. That defense says the model i(money.usnews.com)ot settled where that argument stops working, especially when the source material may have been pirated. (money.usnews.com)

### Why does this land now?

Because 2025 did not settle the issue; it complicated it. One California judge ruled for Meta in the Kadrey case on fair-use grounds for training, while still leaving room for future plaintiffs to argue market harm more co(money.usnews.com)r map: training can win in court sometimes, but piracy facts and market-substitution theories still have real bite. (authorsalliance.org)

### Why are publishers so focused on “market harm”?

Because copyright cases often turn on whether the new use substitutes for the original market. The complaint says Llama can generate summaries, mimic styles, and in some cases spit back close or verbatim p(authorsalliance.org)ll-purpose stand-in for textbooks, journal explainers, or genre fiction, the publishers’ strongest argument is not just “you copied us,” but “you built a machine that eats the market for what we sell.” (cbsnews.com)

### Bottom line

This is not just another AI lawsuit. It is a cleaner test of a question the industry still has not answered: can a model maker claim fair use after vacuuming up copyrighted books at huge scale, especially if the pipeline ran through pirate libraries? However this case ends, it will help decide whether AI training looks more like research or more like mass infringement. (news.bloomberglaw.com)
