AI bots strip‑mine the web

Cloudflare data show AI bots are scraping large swaths of the web and returning little referral traffic, with Anthropic named among heavy users. (businessinsider.com) Publishers and sites say that pattern is straining referral-based publishing economics because content is consumed without sending visits back. (businessinsider.com)

AI bots are crawling huge volumes of web pages while sending back few visitors, and Cloudflare’s data put Anthropic’s Claude among the heaviest users. (blog.cloudflare.com) Cloudflare said on August 29, 2025, that training-related activity made up nearly 80 percent of AI bot traffic, up from 72 percent a year earlier. In the same dataset, ClaudeBot accounted for about 10 percent of AI crawling traffic in July 2025, up from 6 percent earlier in the year. (blog.cloudflare.com)

Cloudflare’s key measure is the “crawl-to-refer” ratio: how many pages a bot fetches for each visitor it sends back. Cloudflare said Anthropic’s ratio was about 38,000 to 1 in July 2025, down from roughly 286,000 to 1 in January. OpenAI’s GPTBot had also more than doubled its share of AI crawling traffic, to 11.7 percent. (blog.cloudflare.com)

That imbalance lands on a web economy built around links, pageviews, ads, and subscriptions. Cloudflare said search engines long indexed pages and returned users to publishers, but AI systems increasingly answer inside the product instead of sending readers onward. (cloudflare.com)

Cloudflare tied the shift to falling publisher traffic. In its news-publisher dataset across the Americas, Europe, and Asia, Google referrals declined from February 2025, and March 2025 referrals were about 9 percent below January levels as AI Overviews expanded. (blog.cloudflare.com)

The fight is now moving from analytics to access controls. On July 1, 2025, Cloudflare said it would block AI crawlers by default unless site owners granted permission, and it began testing a “pay per crawl” system so publishers could charge bots for access. (cloudflare.com)

Publishers backed that move. Columbia Journalism Review reported that the Associated Press, Time, The Atlantic, and Reddit were among the organizations that signed on with Cloudflare as news outlets looked for ways to slow scraping and recover leverage over licensing. (cjr.org)

Anthropic says site owners can block its bots in robots.txt and that its crawlers respect anti-circumvention tools such as CAPTCHAs. Its support page also says publishers can set a crawl delay and must list each subdomain they want excluded. (support.claude.com)

That leaves the web with a simple dispute over who gets paid when an answer engine uses someone else’s work. Cloudflare is publishing the ratios, publishers are tightening permissions, and AI companies are being pushed to prove that crawling brings something back. (blog.cloudflare.com)
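For readers who want to see the arithmetic behind the crawl-to-refer ratio, the short Python sketch below divides pages fetched by visitors referred. The function names and the raw counts are illustrative assumptions, chosen only to reproduce the two ratios quoted above; Cloudflare's underlying request and referral counts are not given in the article.

```python
def crawl_to_refer_ratio(pages_crawled: int, referrals: int) -> float:
    """Pages a bot fetched for each visitor it sent back."""
    if referrals <= 0:
        raise ValueError("referrals must be positive to form a ratio")
    return pages_crawled / referrals


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Hypothetical counts that yield the ratios reported for Anthropic.
january = crawl_to_refer_ratio(286_000, 1)
july = crawl_to_refer_ratio(38_000, 1)

print(f"January 2025: {january:,.0f} to 1")
print(f"July 2025:    {july:,.0f} to 1")
print(f"Change:       {percent_change(january, july):.1f}%")
```

On those figures the ratio fell by roughly 87 percent between January and July, yet a bot at 38,000 to 1 still fetches tens of thousands of pages for every reader it returns.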
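The robots.txt controls described above can be checked locally before a site publishes them. The sketch below uses Python's standard urllib.robotparser module. The rules, the example.com hostnames, and the helper names are all hypothetical, and the ClaudeBot user-agent token is taken from the bot name used in this article, so site owners should confirm the exact token and supported directives against Anthropic's support page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep ClaudeBot out of /archive/ and ask it to pause
# between requests; every other crawler is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Crawl-delay: 10
Disallow: /archive/

User-agent: *
Allow: /
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def robots_urls(hosts: list[str]) -> list[str]:
    """robots.txt is read per host, so each subdomain needs its own file."""
    return [f"https://{host}/robots.txt" for host in hosts]


rules = load_rules(ROBOTS_TXT)
print(rules.can_fetch("ClaudeBot", "https://example.com/archive/story.html"))
print(rules.can_fetch("ClaudeBot", "https://example.com/news/story.html"))
print(rules.crawl_delay("ClaudeBot"))
print(robots_urls(["example.com", "blog.example.com"]))
```

A check like this only confirms what the file asks for. Whether a given crawler honors the request is up to the crawler, which is why publishers are also turning to network-level blocking.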
