Carlini demos Claude exploit

Nicholas Carlini demoed Claude finding zero‑day vulnerabilities in Ghost and the Linux kernel, an illustration of how large language models can be weaponized for vulnerability discovery and adversarial ML research. The demo underscores the security implications for any company building or deploying LLM tooling. (x.com)

Nicholas Carlini delivered a talk titled "Black‑hat LLMs" at the [un]prompted AI‑security conference; the full presentation is available on YouTube and is listed in the event agenda. (youtube.com) In the live demo, Carlini showed a scaffolded Claude setup that reportedly located a blind SQL‑injection vulnerability in the Ghost CMS and extracted an administrator API key in about 90 minutes. (ai-primer.com) Anthropic published a technical note on Feb. 5, 2026 describing Claude Opus 4.6 and asserting that the model had identified hundreds of previously unknown high‑severity vulnerabilities during internal evaluations. (red.anthropic.com) To stress‑test the model’s software development capabilities, Carlini’s team also disclosed an “agent teams” experiment in which 16 parallel Claude agents produced a roughly 100,000‑line Rust codebase (a C compiler) over ~2,000 Claude Code sessions, at an estimated API cost of ~$20,000. (anthropic.com) Anthropic then ran a two‑week collaboration with Mozilla in which Claude Opus 4.6 reported 22 Firefox vulnerabilities (14 of them rated high severity) that were subsequently patched and assigned CVEs. (anthropic.com) Independent coverage and writeups note that the public record for Carlini’s zero‑day demonstrations currently consists mainly of the conference talk, Anthropic blog posts, and partner disclosures, rather than a reproducible exploit package or a full technical paper. (ai-primer.com)
