Nvidia customers build their own chips

- Nvidia’s biggest cloud customers are increasingly pairing Nvidia systems with in-house chips, as Amazon, Google, Microsoft and Meta expand custom AI silicon programs.
- Nvidia said in its fiscal 2026 annual report that one direct customer accounted for 22% of revenue and another 14%. (marketscreener.com)
- Google offers TPU7x on Google Cloud, while AWS, Microsoft and Meta continue rolling out Trainium, Inferentia, Maia and MTIA programs. (docs.cloud.google.com)

Nvidia’s largest customers are no longer buying only Nvidia chips. Amazon Web Services, Google, Microsoft and Meta have all built internal AI accelerators for parts of their own workloads, especially inference, the stage where trained models generate answers, rank content or serve applications in production. Official product pages from those companies now describe families of purpose-built chips for training or inference, while Nvidia’s fiscal 2026 annual report shows how concentrated its customer base has become. (marketscreener.com)

The shift does not mean hyperscalers have stopped buying Nvidia hardware. (docs.cloud.google.com) Nvidia’s latest annual report said Blackwell products made up the majority of its data center revenue in fiscal 2026, and the company still supplies the general-purpose GPU systems used broadly across training and inference. But the same cloud groups that helped drive Nvidia’s rise are also investing in chips they control more directly, with software, networking and data-center design tuned around specific workloads. (aws.amazon.com)

### Why are Nvidia customers designing their own chips now?

Inference is where custom silicon has the clearest opening. AWS says Inferentia is designed to deliver “high performance at the lowest cost” in Amazon EC2 for deep learning and generative AI inference, while Microsoft describes Maia 200 as an inference accelerator aimed at changing the economics of large-scale AI. Google says its latest TPU7x, or Ironwood, is designed for large-scale AI training and inference.

Power, memory and utilization are part of that calculation. AWS says Trainium3 was built for “token economics” in agentic, reasoning and video-generation applications, and says Trn3 systems deliver more than 4x better energy efficiency than Trn2 UltraServers.
(marketscreener.com) Google’s TPU7x page lists 192 GiB of HBM per chip and 7.38 TB/s of HBM bandwidth, figures the company ties to large-scale dense and mixture-of-experts models and decode-heavy inference.

### Which companies have the most visible in-house chip programs?

Amazon has two separate lines. AWS says Trainium is its family for training and inference at scale, while Inferentia is its lower-cost inference line for EC2 customers deploying production models. (aws.amazon.com) AWS also says Trn2 instances offer 30% to 40% better price performance than GPU-based EC2 P5e and P5en instances.

Google has the longest-running custom AI chip effort among the hyperscalers. Google Cloud says TPUs are custom-designed accelerators built for AI workloads, and its TPU7x documentation calls Ironwood the company’s seventh-generation TPU. (aws.amazon.com) The TPU7x pod configuration reaches 9,216 chips, according to the same documentation.

Microsoft and Meta are further along than one-off experiments. Microsoft introduced Maia 100 as its first chip for large language model training and inferencing in the Microsoft Cloud, and it now describes Maia 200 as its next major AI infrastructure milestone for inference. Meta said in April 2024 that it was introducing the next generation of MTIA, its Meta Training and Inference Accelerator, and later said MTIA was deployed at scale in its data centers, primarily for ads workloads. (aws.amazon.com) (cloud.google.com)

### Does this threaten Nvidia’s business right away?

Nvidia still dominates the spending pool that matters most today. The company said in its fiscal 2026 annual report that one direct customer represented 22% of total revenue and another represented 14%, both primarily in compute and networking. The filing also said Blackwell architecture products represented the majority of data center revenue.

Those numbers show two things at once.
Nvidia remains central to AI infrastructure purchases, but some of its biggest buyers are large enough to justify internal silicon programs alongside Nvidia deployments. (news.microsoft.com) Company product pages from AWS, Google, Microsoft and Meta show those programs are now part of mainstream cloud and platform road maps, not research side projects.

### Where does Nvidia still keep the advantage?

Nvidia still sells the broadest stack. Its annual report describes the company as a provider of data center compute and networking platforms, and hyperscalers continue to buy Nvidia GPUs for large training clusters and general-purpose AI capacity. (marketscreener.com) Custom chips, by contrast, are usually designed around narrower internal needs, then exposed through a company’s own cloud or applications.

Software remains part of that gap. AWS ties Trainium and Inferentia to its Neuron SDK, Google ties TPU7x access to Google Kubernetes Engine, and Microsoft describes Maia as part of an end-to-end Azure infrastructure approach. (aws.amazon.com) Those details show why the competition is not only chip against chip, but stack against stack.

### What should readers watch next?

Google says TPU7x access goes through account teams and GKE deployments, while AWS is pushing Trn3 UltraServers and Inferentia-based instances through EC2. Microsoft is publishing Maia 200 architecture materials, and Meta has said MTIA is already serving production ads workloads in its data centers. (stocklight.com)

The next hard evidence will come from spending disclosures and cloud product rollouts. Nvidia’s next quarterly filings will show whether customer concentration changes from the 22% and 14% levels disclosed for fiscal 2026. (docs.cloud.google.com) (aws.amazon.com)
