Publishers and Turow sue Meta

- Elsevier, Cengage, Hachette, Macmillan, McGraw Hill, and novelist Scott Turow sued Meta and Mark Zuckerberg on May 5 over Llama training.
- The complaint says Meta copied millions of books and journal articles from pirate sources including LibGen and Anna’s Archive, then built an “infinite substitution machine.”
- This opens a new front in the AI copyright fight, now with major publishers, not just individual authors, pressing the licensing question.

Books are the new front line in the AI copyright wars. On May 5, five major publishers and novelist Scott Turow sued Meta and Mark Zuckerberg in federal court in Manhattan, saying Meta copied millions of copyrighted works to train Llama. The core fight is simple to describe but huge in consequence: can an AI company ingest protected books at industrial scale without paying, or does that blow a hole through copyright law? This case matters because it brings in the companies that actually own and license a lot of the material, not just individual writers.

### Who sued Meta?

The plaintiffs are Elsevier, Cengage, Hachette Book Group, Macmillan, McGraw Hill, and Scott Turow. That mix matters. It pulls together trade publishing, textbooks, and academic journals in one case, which makes the complaint broader than a typical novelist-versus-tech-company dispute. The suit is framed as a proposed class action on behalf of similarly situated copyright owners.

### What are they accusing Meta of doing?

Basically, they say Meta built Llama using unauthorized copies of books, scholarly articles, and other text works. The complaint says Meta downloaded material from pirate sources like LibGen and Anna’s Archive, scraped huge amounts of internet text, and removed copyright-management information from works it used. The plaintiffs call it one of the largest copyright infringements in history.

### Why is Zuckerberg named personally?

That is one of the sharpest parts of the case. The complaint does not just name Meta as the company that benefited. It says Zuckerberg personally authorized and encouraged the conduct, including decisions around using pirated datasets instead of licensing content. That raises the pressure because it tries to attach responsibility directly to the CEO, not just to a giant corporate machine.

### What does “infinite substitution machine” mean?

It is the publishers’ way of describing the market harm. Their argument is not only that Meta copied books on the way in. It is that Llama can then generate summaries, responses, and sometimes passages that compete with the originals on the way out. Think of it like feeding a library into a machine whose output starts to look like a substitute for the protected work and for the licensing market around it.

### How is this different from earlier author lawsuits?

Earlier cases against AI companies often came from authors, artists, or news organizations one group at a time. This one is notable because major publishers are now moving as plaintiffs in their own right. That means the people who negotiate rights, sell institutional access, and run large licensing businesses are directly asking the court to draw a line around AI training.

### What is Meta likely to argue?

Meta has defended AI training in other cases as lawful and transformative, with companies in this space generally arguing that models learn patterns rather than storing books as books. The catch is that this complaint leans hard on alleged piracy and deliberate avoidance of licensing. That can make the case feel less like an abstract fair-use debate and more like a facts-and-conduct fight over how the training corpus was assembled.

### Why does this matter beyond Meta?

Because this is the question hanging over the whole generative AI business: whether training data is a free raw material or a paid input. If publishers win meaningful damages or force licensing, AI development gets more expensive and more formalized. If Meta wins, that strengthens the industry’s argument that large-scale training on copyrighted text can happen without permission.

### Bottom line

This suit is not just about one company and one model. It is a test of whether copyright still has teeth when the copier is an AI system and the copying happens at machine scale.

Shared from Scout - Be the smartest in the room.