Nvidia inference share could drop to 20‑30%

- SemiAnalysis published a report forecasting Nvidia's AI inference market share dropping from 90%+ today to 20-30% by 2028 as custom ASICs gain traction in specialized workloads.
- Key drivers include robotics firms like Figure AI and automotive giants like Tesla adopting domain-specific chips that outperform Nvidia GPUs on cost and efficiency for inference tasks.
- This shift threatens Nvidia's $100B+ inference revenue stream, boosting custom silicon players like Broadcom while forcing Nvidia to pivot toward software and high-end training dominance.

Nvidia dominates AI chips today, but a new report says its grip on the fastest-growing part of that market, inference, could slip dramatically. Inference means running trained AI models to generate outputs, like ChatGPT answers or robot decisions. It's exploding as AI moves from labs to real-world apps. The catch: specialized chips called ASICs are eating Nvidia's lunch in key areas, potentially slashing its share to 20-30% by 2028.

### What's inference, and why does Nvidia own it now?

Inference is the "using" phase of AI: feeding data into a model and getting predictions back. Training builds the model; inference deploys it millions of times. Nvidia GPUs excel here because they're flexible, and CUDA software locks in developers. Today, Nvidia accounts for 90%+ of inference workloads, powering data centers for OpenAI, Google, and more. But flexibility comes at a cost: GPUs burn power and cash on repetitive inference runs.

### What are ASICs, and why are they invading?

ASICs are custom chips built for one job, like inference on a specific model. They offer no flexibility, but they're cheaper, smaller, and far more efficient. Think of GPUs as Swiss Army knives; ASICs are scalpels. Companies design them for exact workloads, cutting power use by 5-10x and costs by 3-5x. Once a model is frozen, why pay Nvidia premiums? The rise of "model surgery" (tweaking fixed models) makes ASICs viable even for updates.

### Where's this hitting Nvidia first?

Robotics and automotive lead the charge. Figure AI's humanoid robots use custom inference silicon for real-time decisions, because Nvidia GPUs would drain batteries too fast. Tesla trains models on Dojo but runs inference in its cars on its own HW4/HW5 chips, which are optimized for vision tasks. Edge devices like phones and drones follow, where power trumps everything. SemiAnalysis predicts that by 2028 these sectors alone will flip 60-70% of inference away from Nvidia.

### Why can't Nvidia just compete on price?

They could, but margins would tank. Nvidia's moat is high-end GPUs like Blackwell, with gross margins of 80%+. ASICs let customers like Amazon (Trainium) or xAI (the maker of Grok) build cheaper alternatives. Broadcom already wins big designing ASICs for hyperscalers. Nvidia pushes back with NVLink networking and software, but inference clusters don't need those the way training clusters do. As much as 80% of future inference revenue could vanish if ASICs scale.

### How big is the inference pot, anyway?

Massive: $100B+ annually by 2028, dwarfing training's $50B. Inference tokens served could hit 1e28 per year, per Epoch AI. Nvidia is banking on this for growth after the training boom. But if its share drops to 20-30%, the revenue mix flips: training stays dominant and inference becomes secondary. Stock implications? Analysts like Gene Munster see Nvidia adapting via software subscriptions while custom silicon vendors soar.

### What's Nvidia doing to fight back?

Blackwell GPUs target inference efficiency, with 4x better perf/watt. CEO Jensen Huang is betting on full-stack control: chips, networking, CUDA. Partnerships with ASIC makers help too. But the report warns that domain-specific wins erode this advantage. Nvidia's response? Double down on training supremacy and on agentic AI that needs GPU flexibility. Still, robotics proves ASICs work at scale.

### Who wins if Nvidia slips?

Custom silicon surges: Broadcom, Marvell, and startups like Groq. Hyperscalers go in-house (Google TPUs already at 20% inference share). Edge players like Qualcomm thrive in phones. Nvidia survives on premium training, but valuation multiples compress. The shift raises custom vendors' profile, potentially adding $50B to their markets.

Bottom line: Nvidia's inference empire faces real erosion, not tomorrow, but by 2028. ASICs conquer where efficiency rules: robots, cars, edge. Nvidia pivots to software and training, but investors are watching the revenue mix closely. If SemiAnalysis nails it, today's $3T giant becomes a duo-market leader: still huge, just less monopolistic. Track Figure, Tesla, and Blackwell ramps for clues.
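
For a rough sense of the arithmetic behind the headline numbers, here is a minimal Python sketch. It treats the figures quoted above ($100B+ inference market by 2028, 90%+ share today, 20-30% share by 2028) as point estimates, which is a simplifying assumption since the article gives them as floors and ranges. The function names are illustrative and do not come from the SemiAnalysis report.

```python
# Back-of-envelope sketch using only figures quoted in this article.
# Assumption: "$100B+" and "90%+" are treated as exact values, not floors.
MARKET_2028_USD_B = 100.0        # projected annual inference market, $B
SHARE_TODAY = 0.90               # Nvidia's inference share today
SHARE_2028_RANGE = (0.20, 0.30)  # forecast share range for 2028


def implied_revenue(market_usd_b: float, share: float) -> float:
    """Revenue in $B implied by a given share of a given market."""
    return market_usd_b * share


def relative_loss(share_before: float, share_after: float) -> float:
    """Fraction of revenue lost when share falls and market size is fixed."""
    return 1.0 - share_after / share_before


# Counterfactual baseline: today's share applied to the 2028 market.
baseline = implied_revenue(MARKET_2028_USD_B, SHARE_TODAY)

for share in SHARE_2028_RANGE:
    revenue = implied_revenue(MARKET_2028_USD_B, share)
    loss = relative_loss(SHARE_TODAY, share)
    print(
        f"share {share:.0%}: ~${revenue:.0f}B vs ${baseline:.0f}B "
        f"at today's share ({loss:.0%} lower)"
    )
```

Under these assumptions, the forecast range implies inference revenue roughly 67% to 78% below what today's share would capture in the same market, which is in the neighborhood of the 80% figure cited earlier.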