Nvidia pushes 'cost per token'

Nvidia’s shares rose on April 15 as investors bet on demand for its Blackwell AI systems, and the company is urging the industry to use “cost per token” as the main efficiency measure for AI data centers. Nvidia argues cost per token better captures hardware performance, software optimization and real-world utilization, and industry outlets note that the framing is becoming a focal point in how customers judge AI infrastructure. Analysts still warn of export restrictions and valuation risks even as spending on agentic AI and supercomputers fuels investor enthusiasm. (ibtimes.com.au)

Nvidia is trying to change how buyers judge artificial intelligence hardware, arguing that the key number is now the cost to generate tokens, not the cost to rent a graphics processor. (blogs.nvidia.com)

A token is a small unit of text a model reads or writes, and Nvidia said on April 15 that “cost per token” is the total-cost-of-ownership measure that best reflects real output from an artificial intelligence data center. The company said the metric captures hardware speed, software tuning, ecosystem support and how fully machines are used in practice. (blogs.nvidia.com)

That pitch landed as Nvidia shares rose on Wednesday, April 15. International Business Times Australia reported the stock at about $199.76 shortly after 11:25 a.m. Eastern Daylight Time, up $3.30, or 1.68 percent, and about 19 percent higher for April. (ibtimes.com.au)

The market move tracked investor bets that Blackwell systems are moving from launch into volume deployment. International Business Times Australia said enthusiasm has centered on Blackwell production ramps, agentic artificial intelligence spending and expectations for more supercomputer demand through 2026. (ibtimes.com.au)

Nvidia’s argument is that older yardsticks measure inputs, not output. Data Center Knowledge said the company is explicitly contrasting cost per token with cost per graphics processor hour and floating-point operations per dollar, which it describes as theoretical or incomplete measures for inference-heavy workloads. (datacenterknowledge.com)

That framing fits a shift in the business of artificial intelligence. Data Center Knowledge said customers are increasingly buying systems to serve inference, the step where trained models answer prompts, and that pushes operators to care less about peak chip specs than about how cheaply a full stack can produce usable responses. (datacenterknowledge.com)

Nvidia has been building this case for weeks with Blackwell performance claims. In a February 12 company post, Nvidia said Baseten, DeepInfra, Fireworks AI and Together AI had cut cost per token by as much as 10 times on Blackwell compared with Hopper by combining newer chips with optimized inference software. (blogs.nvidia.com)

The company’s message also serves a strategic purpose: it shifts the comparison away from sticker price and toward system-level economics, where Nvidia can bundle chips, networking, software and utilization into one story. Nvidia said enterprises often focus on the numerator (graphics processor cost per hour) while missing the denominator, the number of tokens a system actually produces. (blogs.nvidia.com)

Not everyone is treating the rally as risk-free. International Business Times Australia said analysts still see export restrictions and valuation as live concerns even as demand for agentic artificial intelligence infrastructure supports the stock. (ibtimes.com.au)

So the fight is no longer just over who has the fastest chip. Nvidia is pushing customers to ask a narrower question: how much money it takes to get a model to produce the next million tokens. (blogs.nvidia.com)
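
The numerator-and-denominator point can be made concrete with a little arithmetic. The Python sketch below is illustrative only: the function name, the hourly prices, the throughput figures and the utilization factor are all hypothetical values chosen for the example, not figures from Nvidia or any of the cited outlets, and it treats cost per token as simply hourly cost divided by tokens produced per hour.

```python
def cost_per_million_tokens(gpu_cost_per_hour: float,
                            tokens_per_second: float,
                            utilization: float = 1.0) -> float:
    """Dollars needed to produce one million tokens.

    gpu_cost_per_hour: the numerator, what the system costs to run for an hour.
    tokens_per_second: peak throughput of the full stack (hardware plus software).
    utilization: fraction of each hour the system is actually serving tokens.
    """
    if gpu_cost_per_hour < 0:
        raise ValueError("gpu_cost_per_hour must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")

    # The denominator: tokens the system really delivers in an hour.
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


# Hypothetical systems: the second costs twice as much per hour
# but produces ten times as many tokens per second.
older = cost_per_million_tokens(gpu_cost_per_hour=2.0, tokens_per_second=1_000)
newer = cost_per_million_tokens(gpu_cost_per_hour=4.0, tokens_per_second=10_000)

print(f"older system: ${older:.4f} per million tokens")
print(f"newer system: ${newer:.4f} per million tokens")
```

Under these made-up inputs the system with the higher hourly price comes out cheaper per token, which is the comparison Nvidia wants buyers to make. The same function also shows why utilization matters: halving it doubles the cost per token without changing the hourly price.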
