Nvidia's AI Revenue Hits 91%; New Chip Planned

Nvidia reported that 91% of its $215.9 billion revenue now comes from AI. The company is also reportedly developing a new processor aimed at further accelerating AI workloads for top customers like OpenAI, underscoring its strategy to maintain its performance lead.

Nvidia's Data Center division is the powerhouse, pulling in $62.3 billion in the last quarter alone, a 75% increase year-over-year. This surge is fueled by hyperscalers and enterprises racing to build out their AI infrastructure. The company's gross margins are hovering around an impressive 75%, showcasing significant pricing power.

The build-versus-buy decision for AI compute is intensifying as hyperscalers develop their own custom silicon to optimize costs and performance for their specific workloads. Google's TPUs, Amazon's Trainium and Inferentia chips, and Microsoft's Maia accelerators are all designed to reduce reliance on third-party vendors for their massive AI deployments. For instance, Microsoft's new Maia 200 is built on a 3nm process and is aimed at improving the token generation economics for services like Azure OpenAI.

This move toward custom ASICs isn't just about cost savings; it's about vertical integration to tailor hardware for specific AI models and services. Amazon is using its Trainium chips to train Anthropic's Claude models, demonstrating that custom silicon can handle frontier AI workloads. Similarly, Microsoft is leveraging its Maia 200 for internal workloads like Copilot and for synthetic data generation to improve its own models.

Despite the rise of in-house chips, Nvidia's CUDA software ecosystem provides a powerful moat, creating high switching costs for developers. While hyperscaler silicon is optimized for internal, high-volume inference tasks, Nvidia GPUs are often still the default for cutting-edge model training and workloads that require multi-cloud flexibility.

The competitive landscape extends beyond hyperscalers, with startups making significant inroads. Cerebras is pushing the boundaries with its wafer-scale engine, the WSE-3, which boasts 4 trillion transistors and 900,000 AI cores on a single piece of silicon. Groq is gaining traction with its Language Processing Unit (LPU), an ASIC specifically designed for high-speed, low-latency inference, delivering significant performance gains over GPUs for language models. AMD is also a formidable competitor with its Instinct MI300 series, aiming to capture a larger share of the data center market. Intel is competing with its Gaudi 3 AI accelerators, which are being offered on IBM Cloud and are designed for both training and inference workloads. This increasing competition is leading to a segmentation of the AI chip market, with different architectures being optimized for specific tasks like training versus inference.

OpenAI is pursuing its own custom silicon strategy through a major partnership with Broadcom. The collaboration aims to embed AI model knowledge directly into the hardware, potentially reducing inference costs by 30-40% for large-scale deployments. This move signals a deeper integration of model development and hardware design to optimize performance and efficiency.

The market for AI chips is projected to grow significantly, with some estimates suggesting it could reach nearly $565 billion by 2032. While GPUs are expected to maintain a large market share, the demand for custom ASICs is growing rapidly, particularly for inference workloads. The Asia-Pacific region is anticipated to be the fastest-growing market for AI chipsets.

