Internet Archive alarms

A WIRED post circulating on X has highlighted mounting threats to the Internet Archive’s web‑archiving work, raising concerns for preservation of digital books, zines, fanfiction and other online writing. The conversation has become a focal point for people tracking how web‑born literature and serialized online stories are being saved or lost (x.com).

The Internet Archive’s web memory is getting harder to build: major publishers are blocking its crawlers while the nonprofit is still absorbing lawsuits and cyberattacks. (wired.com) The Internet Archive says it now holds more than 1 trillion archived web pages through the Wayback Machine, the service journalists, researchers, and courts use to check what a site looked like on a given day. (archive.org, eff.org)

In January 2026, Nieman Journalism Lab reported that publishers including The Guardian and the Financial Times had limited Internet Archive access as they tried to stop artificial intelligence companies from reusing archived material. The Guardian said it was excluding article pages from some Internet Archive tools while leaving homepages and topic pages visible. (niemanlab.org) By April 14, 2026, reporting cited an Originality AI analysis saying 23 major news sites were blocking `ia_archiverbot`, the crawler commonly used for the Wayback Machine, and that Reddit was blocking it too. USA Today said it was blocking scraping bots broadly and was “not specifically seeking to block the Internet Archive.” (forbes.com, 9to5mac.com)

A web archive works like a time-stamped photocopier for pages that can be edited, paywalled, or deleted after publication. When its crawlers are blocked, later readers can lose the easiest public record of what was posted, changed, or removed. (eff.org)

That reaches beyond news articles. The same preservation logic matters for web-born writing that often lives on fragile platforms: personal essays, zines, forum posts, serialized fiction, and fan communities whose work can disappear with a policy change, a shutdown, or an account deletion. Archive of Our Own, one of the largest fanfiction repositories, says it hosts more than 17 million works across more than 77,000 fandoms. (archiveofourown.org, transformativeworks.org)

The Archive is also under legal pressure outside web crawling. On September 4, 2024, the United States Court of Appeals for the Second Circuit ruled against the Internet Archive in the Hachette book case, rejecting its fair-use defense for controlled digital lending of scanned books. (law.justia.com, copyright.gov) In April 2025, the Internet Archive said a separate lawsuit from major record labels over preserved 78 rpm recordings sought $700 million and called the case “an existential threat” to the institution and “everything we preserve — including the Wayback Machine.” (blog.archive.org)

The nonprofit is still recovering from the attacks of October 2024. Internet Archive posts said a distributed denial-of-service attack, the exposure of patron email addresses and encrypted passwords, and a site defacement forced services offline; by October 21, 2024, the Wayback Machine had resumed and archive.org had returned in read-only form. (blog.archive.org, blog.archive.org)

The fight now is over whether the web keeps a public memory that can be checked later. The more sites that close their doors to archiving, the more of the internet’s first draft survives only at the discretion of the people who published it. (wired.com, eff.org)
