NVIDIA Readies New Chip to Accelerate AI

NVIDIA is reportedly launching a new chip specifically designed to speed up AI workloads, escalating the hardware arms race. The move puts more competitive pressure on Apple's on-device AI acceleration with its Neural Engine and SoCs. As the market for high-throughput ML and edge computing grows, expect an intensified AI feature battle at the silicon level.

NVIDIA's forthcoming chip, expected to be unveiled at its GTC developer conference in March, is reportedly an AI accelerator focused specifically on inference. The new hardware is said to incorporate technology from the recently acquired startup Groq, with OpenAI planning to be a large-scale user. The chip architecture deviates significantly from traditional GPUs by using a Language Processing Unit (LPU) design. This LPU leverages high-speed SRAM (static random access memory) embedded directly on the silicon, which can be up to 100 times faster than the High Bandwidth Memory (HBM) used in most data center GPUs.

This move signals a strategic focus on the AI inference market, which is becoming a key battleground as AI models are deployed at scale. While NVIDIA's GPUs dominate the computationally intensive training phase, the inference market, where models generate answers, is seeing rising demand for more cost-effective and efficient solutions.

The strategy contrasts with Apple's focus on integrated, on-device AI. Apple's latest M-series SoCs utilize a multi-core Neural Engine, a dedicated coprocessor for machine learning tasks. The Neural Engine in the M4 chip, for example, is capable of 38 trillion operations per second (TOPS), optimized for efficiency and privacy by processing data locally. A core advantage for Apple is its unified memory architecture, which allows the CPU, GPU, and Neural Engine to access the same memory pool without performance-killing data transfers. This design minimizes latency, a critical factor for the real-time, responsive AI features integrated into iOS and macOS via frameworks like Core ML.

The broader AI hardware market is projected to grow exponentially, reaching an estimated value of over $690 billion by 2033. This growth is driven by two parallel trends: the expansion of massive, cloud-based AI infrastructure and the increasing demand for powerful, on-device processing in consumer electronics.

NVIDIA's roadmap continues its aggressive pace beyond inference-specific chips, with the "Rubin" platform slated as the successor to its "Blackwell" architecture. The Rubin platform, first shown at CES 2026, promises another significant leap in performance for the large-scale AI systems that power cloud-based services.
