US Publishing Grapples with AI Training Rights
The U.S. publishing industry is facing new tensions over the use of copyrighted books for training large language models. Authors are reportedly pushing back against contract clauses that permit their work to be used for AI training without explicit consent or compensation. In response, some publishers are revising contract terms to better protect their intellectual property from unlicensed scraping.
- The Authors Guild, along with 17 authors, filed a class-action lawsuit against OpenAI and Microsoft for copyright infringement, specifically focusing on the use of fiction to train AI models. A similar lawsuit has been filed by a group of nonfiction writers.
- In a notable development, publishers Cengage Group and Hachette Book Group filed a motion to intervene in a class-action lawsuit against Google, arguing that they have distinct ownership and licensing interests that are not adequately represented by the author-led class.
- In response to authors' concerns, the Authors Guild has drafted and recommended new model contract clauses that would prohibit publishers from using an author's work for AI training without their express permission. These clauses also address AI-generated translations, audiobook narration, and cover art.
- Some publisher agreements now include clauses that explicitly restrict the use of licensed content for any AI training, development, or enrichment of AI tools accessible to third parties.
- The U.S. Copyright Office has reiterated that works generated solely by AI are not eligible for copyright protection because they lack human authorship, a key requirement for copyright.
- Recent research has shown that some large language models can reproduce copyrighted books with high accuracy, which could become key evidence in ongoing copyright lawsuits.
- While some publishers are moving to license their content to AI developers, there is no industry-wide consensus on the best approach to compensation and credit for the use of their material.
- In the absence of clear legislation, the legal battles are largely centered on the interpretation of the "fair use" doctrine, with AI companies arguing their training processes are transformative and thus permissible.