Google warns web pages poison agents

- Google Threat Intelligence Group and Google DeepMind said on April 23 that attackers are already planting indirect prompt injections on public web pages. - Google said it scanned Common Crawl’s monthly snapshots of 2 billion to 3 billion English-language pages to measure real-world prompt-injection abuse. - The finding shifts prompt injection from theory to observed web abuse in deployed agent workflows. (security.googleblog.com)

AI agents are software that read webpages, emails, and documents, then decide what to do next. Google says that same reading ability lets hostile pages slip in hidden instructions. (security.googleblog.com) Google Threat Intelligence Group and Google DeepMind said on April 23 that they found indirect prompt injection patterns on the public web after a broad sweep for known abuse. The team used Common Crawl, a monthly archive of roughly 2 billion to 3 billion English-language pages. (security.googleblog.com) Indirect prompt injection is different from a user directly trying to jailbreak a chatbot. The attack hides instructions inside content an agent consumes, so the model may follow the attacker’s text instead of the user’s request. (security.googleblog.com) Google’s researchers said the public web is an easy place for attackers to seed these instructions because agents increasingly browse sites to complete tasks. Common Crawl does not include most login-walled social media, so Google said its findings cover mainly static sites such as blogs, forums, and comments. (security.googleblog.com) Earlier this month, Google DeepMind researchers also mapped the broader problem in a paper describing six categories of “AI Agent Traps.” Those categories include content injection, semantic manipulation, cognitive state, behavioral control, systemic, and human-in-the-loop traps. (securityweek.com) SecurityWeek reported that the paper describes hidden instructions in HTML comments or metadata, JavaScript-delivered traps, and language designed to exploit an agent’s reasoning. The paper’s framing is that the environment, not the model weights, becomes the attack surface. (securityweek.com) Google has been pushing layered defenses rather than a single fix. Its Model Armor service screens prompts and responses for prompt injection, jailbreak attempts, sensitive data, and malicious content before and after a model call. (docs.cloud.google.com) Google’s cloud guidance also tells customers to run agents in constrained environments and use network boundaries such as Virtual Private Cloud Service Controls where possible. In its Workspace security guidance, Google said it is also adding model resistance, sanitization, user confirmation, and suspicious-link defenses around Gemini features. (docs.cloud.google.com) (security.googleblog.com 1) (security.googleblog.com 2) The immediate point of Google’s April 23 post is narrower than a generic warning: the company says this abuse is already showing up on real pages, not just in lab demos. As more companies let agents browse, summarize, click, and retrieve data, that turns ordinary web content into a live security boundary. (security.googleblog.com)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.