Microsoft Unveils Maia 200 AI Chip

Microsoft is making a major play for AI hardware independence, unveiling its new Maia 200 AI chip. The move is designed to reduce Azure's reliance on Nvidia and AMD, giving Microsoft more control over its AI supply chain and pricing. Early analysis suggests the chip is optimized for LLM inference, focusing on power efficiency and cost-per-token — a clear sign that hyperscalers are building custom silicon for their most strategic workloads.

The Maia 200 is fabricated on TSMC's 3nm process, a significant leap from the 5nm process used for its predecessor, the Maia 100. The new chip packs over 140 billion transistors and is paired with 216GB of HBM3e memory. It delivers over 10 petaFLOPS of 4-bit precision (FP4) performance, a spec squarely aimed at accelerating inference for next-gen models like OpenAI's GPT-5.2.

The chip is part of a years-long strategy: Microsoft's internal AI chip development, codenamed "Project Athena," started as far back as 2019. The first-generation Maia 100 was the initial step, featuring 105 billion transistors and designed for workloads like GPT-3.5-Turbo to free up GPU capacity.

Microsoft's approach extends beyond the silicon to a full-stack, vertically integrated system. The Maia chips are deployed in custom server boards and racks with a bespoke Ethernet-based fabric and closed-loop liquid cooling, all designed to optimize performance-per-dollar and total cost of ownership for Microsoft's own massive AI services.

The competitive implications are clear: Maia 200 is explicitly positioned against other hyperscaler custom silicon, with Microsoft claiming 3x the FP4 performance of Amazon's third-gen Trainium and FP8 performance exceeding Google's seventh-gen TPU. This is a direct challenge in the race to control the exploding operational costs of generative AI inference.

Alongside its own chip development, Microsoft is diversifying its supply chain. While TSMC produces the Maia family, Microsoft has also inked a deal for Intel to manufacture a future chip design using Intel's 18A process technology. This dual-foundry strategy mitigates risk and provides leverage in future negotiations.

The software and go-to-market strategy is built around deep integration with Microsoft's own services. The Maia 200 will be used by the Microsoft Superintelligence team and will power Microsoft Foundry and Microsoft 365 Copilot. A Maia SDK with PyTorch integration and a Triton compiler is being previewed to optimize models for the new hardware.
