Microsoft's MDASH finds 16 Windows flaws
- Microsoft on May 12 unveiled MDASH, an AI vulnerability-hunting system, and said it helped find 16 previously unknown Windows flaws patched this week.
- Microsoft said MDASH used more than 100 specialized agents and found four critical remote-code-execution bugs in Windows networking and authentication components.
- A limited private preview is underway with customers, while AISI’s GPT-5.5 evaluation, dated April 30, remains posted.

Microsoft said on May 12 that a new internal AI system called MDASH helped its researchers find 16 previously unknown Windows vulnerabilities that were fixed in this month’s Patch Tuesday release. The company said the flaws were in Windows networking and authentication components and included four critical remote-code-execution bugs. Microsoft described MDASH as a multi-model, agent-based system that coordinates more than 100 specialized agents to scan code, test hypotheses and validate possible exploits.

The announcement lands in a week when major AI developers and government evaluators have published fresh evidence that frontier models are improving at vulnerability discovery and exploitation tasks. Anthropic said in April that its Claude Mythos Preview could identify and exploit zero-day vulnerabilities across major operating systems and browsers. The U.K. AI Security Institute said on April 30 that OpenAI’s GPT-5.5 reached a similar level on its cyber evaluations and may be the strongest model it has tested on expert-level tasks. (microsoft.com)

### What exactly did Microsoft say MDASH found inside Windows?

Microsoft said MDASH helped uncover 16 new vulnerabilities across the Windows networking and authentication stack. The company said four of those were critical remote-code-execution flaws, including bugs in the Windows kernel TCP/IP stack and the IKEv2 service. (red.anthropic.com)

May 2026 Patch Tuesday shipped fixes for those issues as part of Microsoft’s monthly security update cycle. Third-party security summaries said Microsoft released fixes for roughly 118 to 120 CVEs this month, though counts vary by methodology, and noted that the MDASH-discovered issues were included in that batch. (microsoft.com)

### How does MDASH work differently from a single chatbot model?

Microsoft said MDASH is a “multi-model agentic scanning harness,” not a single model.
The company said it assigns different agents to roles such as generating attack ideas, auditing code paths, challenging earlier findings and validating exploitability, with the system drawing on both frontier and distilled models. (tenable.com)

More than 100 specialized agents are used in the system, according to Microsoft and outside reports describing the launch. PCMag reported that Microsoft is using the system internally and has started previewing it with a small set of enterprise customers.

### Did Microsoft show benchmark results against Anthropic and OpenAI?

Microsoft said MDASH topped a leading cybersecurity benchmark called CyberGym. (microsoft.com) Outside reports citing Microsoft said the system scored 88.45% and outperformed single-model systems from Anthropic and OpenAI on that test.

The company’s claim comes as benchmark comparisons are moving quickly across labs. (pcmag.com) Anthropic’s April 7 write-up on Mythos said the model showed a “substantial leap” in cyber capability during the company’s internal testing, while the AISI said GPT-5.5 achieved a 71.4% average pass rate on its expert tasks versus 68.6% for Mythos Preview. (geekwire.com)

### What did the U.K. government’s evaluator say about GPT-5.5?

The AI Security Institute said on April 30 that GPT-5.5 was the second model to complete one of its multi-step corporate network attack simulations end-to-end. AISI said the result suggested Anthropic’s earlier Mythos showing was not unique to one developer but part of a broader rise in model capability. (red.anthropic.com)

AISI also said GPT-5.5 is one of the strongest models it has tested on cyber tasks. Its evaluation covered 95 narrow tasks across four difficulty tiers, including reverse engineering, web exploitation, cryptography and vulnerability research against realistic targets.

### What are companies doing with these systems now?
Anthropic said in April that it had launched Project Glasswing to use Mythos Preview to help secure critical software and to prepare industry defenses. (aisi.gov.uk) Microsoft said MDASH is already being used by its own security engineering teams and is entering a limited private preview with customers.

Microsoft’s MDASH announcement is dated May 12, and AISI’s GPT-5.5 evaluation is dated April 30. (aisi.gov.uk) Microsoft’s next public checkpoint is likely to come through future Patch Tuesday disclosures or customer updates tied to the private preview, while AISI’s current benchmark details remain posted in its published evaluation. (microsoft.com) (red.anthropic.com)