Microsoft Unveils Maia 200 AI Chip
Microsoft has announced its latest custom AI accelerator, the Maia 200. The move signals the company's continued investment in proprietary silicon to reduce reliance on third-party vendors and optimize for its own large-scale AI training and inference workloads. This development places Microsoft in direct competition with Google's TPU and Amazon's Trainium/Inferentia chips.
- The Maia 200 is an inference-focused accelerator built on TSMC's 3-nanometer process, featuring over 140 billion transistors. It incorporates 216GB of HBM3e memory with 7 TB/s of bandwidth and delivers over 5 petaFLOPS of FP8 performance within a 750W TDP. - Microsoft claims the Maia 200 offers three times the performance of Amazon's third-generation Trainium chip in 4-bit precision (FP4) and outperforms Google's seventh-generation TPU in 8-bit precision (FP8) computations. This positions it competitively for running the largest AI models, with the company stating it provides 30% better performance-per-dollar than its predecessor, the Maia 100. - This chip is part of a full-stack system design that includes custom server boards, racks, and a specialized liquid-cooling system to manage the thermal output. It is designed to work in conjunction with Microsoft's own Azure Cobalt CPU, a 128-core processor based on Arm's Neoverse N2 design, to optimize the entire infrastructure stack for AI workloads. - For networking, Maia 200 systems utilize a two-tier scale-up network built on standard Ethernet, capable of connecting clusters of up to 6,144 accelerators. This approach avoids proprietary fabrics like InfiniBand and leverages Microsoft's role in the Ultra Ethernet Consortium. - Microsoft's silicon strategy is heavily influenced by its partnership with OpenAI, which is co-developing custom chips with Broadcom. Through a revised agreement, Microsoft gains intellectual property rights to OpenAI's chip designs through 2032, allowing it to adapt and extend these innovations for its own hardware roadmap. - The Maia 200 will be used to power Microsoft's internal services, including Microsoft 365 Copilot and the Azure Foundry, as well as for OpenAI's GPT-5.2 models. Microsoft's Superintelligence team will also use the chip for synthetic data generation and reinforcement learning to improve future models.