Microsoft's new AI shocks OpenAI
- Microsoft published details on May 12 about MDASH, a multi-model AI security system; a May 15 YouTube video later cast it as a shock to OpenAI.
- Microsoft said MDASH scored 88.45% on CyberGym, ahead of Anthropic's Mythos Preview at 83.1%, while GeekWire reported the results were self-reported.
- Microsoft said MDASH is in limited private preview, with signup details on its May 12 Security Blog post.

Microsoft published a May 12 blog post describing MDASH, a new AI security system that it said outperformed Anthropic’s Mythos Preview and OpenAI’s GPT-5.5 on a public cybersecurity benchmark. A May 15 YouTube video titled “Microsoft’s New AI Beats Mythos And Shocks OpenAI” amplified that claim, but the underlying evidence available publicly comes from Microsoft’s own disclosure and follow-on reporting rather than an independently verified benchmark release.

Microsoft said MDASH found 16 new vulnerabilities in the Windows networking and authentication stack, including four critical remote-code-execution flaws, ahead of that month’s Patch Tuesday. GeekWire reported May 13 that the CyberGym scores cited by Microsoft, Anthropic and OpenAI were self-reported and had not been independently verified.

### What exactly did Microsoft announce?

Microsoft said on May 12 that MDASH stands for “multi-model agentic scanning harness” and was built by its Autonomous Code Security team. The company said the system orchestrates more than 100 specialized AI agents across frontier and distilled models to discover, debate and prove exploitable software bugs end to end. (youtube.com)

Taesoo Kim, a Microsoft vice president for Agentic Security, wrote in the company’s blog post that MDASH helped researchers find 16 new Windows vulnerabilities. Microsoft said those included flaws in the Windows kernel TCP/IP stack and the IKEv2 service. (microsoft.com)

### What is the “beats Mythos” claim based on?

Microsoft said MDASH scored 88.45% on the CyberGym benchmark, which it described as covering 1,507 real-world vulnerabilities. The company said that result put MDASH at the top of the leaderboard and roughly five points ahead of the next entry. (microsoft.com)

GeekWire reported on May 13 that Anthropic’s Mythos Preview scored 83.1% and OpenAI’s GPT-5.5 scored 81.8% on the same benchmark.
GeekWire also reported that CyberGym was developed by UC Berkeley researchers and draws from 188 open-source software projects. (microsoft.com)

### Why are people linking this to OpenAI?

The May 15 YouTube video framed MDASH as evidence that Microsoft is becoming a more independent AI competitor rather than only OpenAI’s partner. The video’s searchable preview text says Microsoft “revealed MDASH” and that it beat Anthropic’s Mythos Preview and OpenAI’s GPT-5.5 on CyberGym.

Microsoft’s broader AI posture has also been under scrutiny this year. (geekwire.com) The company’s official blog listed an April 27 post titled “The next phase of the Microsoft-OpenAI partnership,” showing that the relationship remains active even as Microsoft expands its own internal AI and agent products.

### How much of this has been independently verified?

GeekWire reported that the CyberGym leaderboard scores were self-reported by the companies and that no independent party had verified them. (youtube.com)

That caveat matters because the strongest public claim in circulation, that Microsoft’s new AI “beats Mythos,” rests on benchmark figures that, as of May 16, 2026, were disclosed by the companies involved rather than by an outside auditor. (blogs.microsoft.com)

Microsoft’s own post presented additional internal test results, including 21 of 21 planted vulnerabilities found with zero false positives on a private test driver, 96% recall against five years of confirmed Microsoft Security Response Center cases in clfs.sys, and 100% in tcpip.sys. Those figures were published by Microsoft, not by an independent evaluator. (geekwire.com)

### Where does MDASH fit inside Microsoft’s security business?

Microsoft said MDASH is being used by its security engineering teams and is being tested with a small set of customers in a limited private preview. The company tied the system to its wider push into AI-driven security products, including Security Copilot and Agent 365. (microsoft.com)

Microsoft’s Security Blog said readers can sign up to join the MDASH private preview. Microsoft’s April 30 security update post also pointed readers to Microsoft Build on June 2-3, 2026, in San Francisco for more announcements from Microsoft Security experts. (microsoft.com)