Google, AWS and Meta build ASICs

- On May 18, 2026, Google, Amazon Web Services and Meta were already shipping or detailing custom AI chips for inference and training workloads.
- Google said Ironwood, introduced April 23, 2025, was its first TPU designed specifically for inference and could scale to 9,216 chips.
- Meta said March 11 it would deploy four new MTIA generations within two years, with MTIA 400, 450 and 500 next.

Google, Amazon Web Services and Meta have moved past talking about custom AI silicon as a cost-control tool and are describing it as production infrastructure for inference, training and model serving. The companies have each published product updates over the past 18 months showing in-house chips tied to named software stacks, cloud instances and deployment plans. Those disclosures give substance to industry discussion on Monday that hyperscalers are building application-specific integrated circuits, or ASICs, for AI workloads at scale.

Google said on April 23, 2025 that Ironwood was its seventh-generation Tensor Processing Unit and the first TPU designed specifically for inference. AWS says its Inferentia chips are built for large-scale inference and its Trainium family now spans three generations for training and inference. Meta said on March 11, 2026 that its Meta Training and Inference Accelerator, or MTIA, sits at the center of its AI infrastructure strategy and that it is developing four new chip generations within two years. (blog.google)

### Which companies have actually named chips and deployment plans?

Google has named both Trillium and Ironwood in official product posts. Trillium, its sixth-generation TPU, became generally available on December 11, 2024, and Google said it had used Trillium to train Gemini 2.0. Ironwood followed at Google Cloud Next 2025 as a chip built specifically for inference workloads. (blog.google)

AWS has named Inferentia, Inferentia2, Trainium1, Trainium2 and Trainium3 across its product pages. The company says Inf2 instances are optimized to deploy large language models and diffusion models at scale, while Trn2 and Trn3 systems are aimed at generative AI training and deployment.

Meta has named MTIA as its in-house family and has attached a roadmap to it.
(cloud.google.com) Meta said MTIA 300 is already in production for ranking and recommendations training, while MTIA 400, 450 and 500 are intended to support generative AI inference production in the near term and into 2027. (aws.amazon.com)

### What shows this is about inference, not only training?

Google described Ironwood as the first TPU built specifically for inference and said it was designed for “thinking” and inferential AI models at scale. The company said Ironwood can scale to 9,216 liquid-cooled chips and is part of its AI Hypercomputer architecture. (about.fb.com)

AWS says Inferentia2-based Inf2 instances are optimized for increasingly complex models, including large language models, and support distributed inference. On its Trainium page, AWS also says Trainium3 is built for “agentic, reasoning, and video generation applications,” language that places custom silicon in live model serving as well as model building. (blog.google)

Meta said it already deploys hundreds of thousands of MTIA chips for inference workloads across organic content and ads in its apps. The company said later MTIA generations will be used primarily for generative AI inference production.

### How are these companies tying chips to usable systems?

Google paired Trillium and Ironwood with its AI Hypercomputer stack, XLA compiler work and support for JAX, PyTorch and TensorFlow. (aws.amazon.com) Google said Trillium can scale across more than 100,000 chips on its Jupiter network fabric and cited gains in training performance, inference throughput and energy efficiency. (about.fb.com)

AWS tied its chips to EC2 instance families and the Neuron software development kit. The company says Neuron integrates with PyTorch and TensorFlow, and that developers can train and deploy models on Trainium and Inferentia without rebuilding existing workflows from scratch.

Meta said its newer MTIA chips are modular enough to drop into existing rack infrastructure, which it said shortens time to production. (cloud.google.com) In April, Meta also said Broadcom would work with it on chip design, packaging and networking for multiple generations of MTIA.

### Where does OpenAI fit in?

OpenAI was part of Monday’s market discussion, but the company has not published an equivalent official product roadmap for in-house ASICs in the sources reviewed here. (aws.amazon.com) Reuters-backed reporting carried by other outlets in October 2025 said OpenAI and Broadcom were working on custom AI accelerators for deployment starting in 2026, but that sits outside the company disclosures available from Google, AWS and Meta. (about.fb.com)

April 14, 2026 is the clearest recent milestone in the official record: Meta said its Broadcom partnership begins with a commitment exceeding 1 gigawatt and is the first phase of a multi-gigawatt rollout. Meta also said MTIA 400, 450 and 500 are slated for generative AI inference production into 2027, while Google and AWS continue to market Ironwood, Trillium, Inferentia and Trainium through their cloud platforms. (engadget.com) (about.fb.com)
