OpenAI Previews New Code Security Tool

OpenAI is rolling out a research preview of Codex Security (formerly Aardvark) to ChatGPT Enterprise, Business, and Pro users. The specialized model is designed for code security scanning and promises improved performance with fewer false positives.

The tool began life inside OpenAI under the name "Aardvark," where it was used to secure the company's own code before entering private beta. During internal testing, it surfaced critical vulnerabilities, including a server-side request forgery (SSRF) and a cross-tenant authentication flaw, which were patched within hours.

Unlike traditional Static Application Security Testing (SAST) tools, which often rely on predefined patterns, Codex Security builds a custom, editable threat model for each project. It analyzes the repository to understand the system's structure, what it trusts, and its most exposed surfaces, providing context that helps it reason more like a human security expert.

A major focus during the beta period was reducing alert fatigue. OpenAI reports that this context-aware approach has cut the rate of false positives by more than 50% and reduced overall "noise" by 84% compared to initial versions. It also cut the rate of findings with over-reported severity by more than 90%.

To validate its findings, the agent tests potential vulnerabilities in a sandboxed environment to confirm they can be exploited. This process can generate a working proof of concept, giving security teams stronger evidence and a clearer path to remediation before the tool proposes a patch.

OpenAI has also been using the tool to audit the open-source software supply chain. The company has already used it to find and report 14 vulnerabilities in widely used projects that were severe enough to be registered in the CVE database.

The tool enters a competitive landscape of AI-powered security platforms such as Snyk and Cycode, which are also moving beyond legacy scanning. The goal of these next-generation tools is to shift security "left," integrating directly into developer workflows to catch vulnerabilities before they are deployed. For startups, where a single data breach can cost between $120,000 and $1.24 million, this level of automation is critical. With studies showing that 25-30% of AI-assisted code contains security flaws, tools that can accurately identify real threats without slowing down a small engineering team are becoming essential.
