AI‑content backlash grows

Publishers and creators are pushing back against a surge of AI‑generated submissions and the scraping of online content, raising questions about provenance and copyright. Recent reporting says a canceled book and publisher restrictions on archive access have highlighted the reputational and legal risks, and some voices are calling for creators to be compensated when their work is used. (CBC News, Times Now, San Francisco Examiner)

A fight over artificial intelligence is now hitting books, news archives, and publisher revenue at the same time. (locusmag.com) On March 24, 2026, Hachette Book Group canceled the planned United States release of *Shy Girl* by Mia Ballard and stopped the United Kingdom edition after what it called a review for signs of generative artificial intelligence use. Ballard told The New York Times, as quoted by Locus, that she did not use artificial intelligence to write the novel and said an editor she hired had used it on the self-published version. (locusmag.com)

The archive fight widened in January 2026, when Nieman Lab reported that The Guardian had limited the Internet Archive’s access to article pages and Wayback Machine interfaces after seeing frequent Internet Archive crawling in its logs. The Financial Times also blocks bots that scrape paywalled content, including the Internet Archive, according to Nieman Lab. (niemanlab.org) The Internet Archive says it preserves the web; publishers say those preserved copies can become a backdoor for artificial intelligence companies seeking structured text. Nieman Lab quoted Guardian business affairs and licensing head Robert Hahn saying the Archive’s application programming interface was “an obvious place” for machines to extract intellectual property. (niemanlab.org)

The legal backdrop is still unsettled. The United States Copyright Office said its artificial intelligence initiative began in 2023, with Part 2 of its report released on January 29, 2025 on whether outputs made with generative artificial intelligence can be copyrighted, and a pre-publication Part 3 released on May 9, 2025 on training. (copyright.gov)

Publishers are also arguing that the economics no longer work if artificial intelligence systems answer questions without sending readers back. Times Now, citing TollBit, said news sites and blogs get 96 percent less referral traffic from artificial intelligence search engines than from traditional Google search, and cited Raptive estimates of about $2 billion in annual publisher revenue loss. (timesnownews.com)

That has pushed more companies toward licensing and payment demands instead of open-ended scraping. Times Now said media groups in Europe, the United States, and India are pressing for permission and compensation when copyrighted work is used to train large language models, the systems that learn patterns from vast text collections and generate answers from those patterns. (timesnownews.com)

Critics of the publisher response say the clampdown is catching archives along with commercial artificial intelligence firms. The Electronic Frontier Foundation wrote on March 16, 2026 that blocking the Internet Archive will not stop artificial intelligence training on its own and will leave gaps in a record used by journalists, researchers, and courts. (eff.org)

The split is now plain: publishers want control, licenses, and payment; archivists want preservation; authors want proof that a human actually made the work. The next tests are likely to come from contracts, court rulings, and the submission inboxes that publishers are now treating far more cautiously. (copyright.gov)
