Blackwell GPU rental spike

Hourly rental prices for Nvidia’s Blackwell GPUs have jumped roughly 48% in two months, signalling a squeeze in usable rented compute rather than just hardware production constraints. Analysts tied the rise to demand for agentic AI workloads, which run longer and keep rental pools occupied. (alltoc.com) (intellectia.ai)

Renting one Nvidia Blackwell graphics processing unit now costs about $4.08 an hour, up 48% from $2.75 two months ago. (techmeme.com) The jump was tracked by the Ornn Compute Price Index, a benchmark that measures negotiated graphics processing unit rental prices across cloud and on-premise markets. Ornn said on April 2 that the index had been added to the Bloomberg Terminal. (ornn.com) (morningstar.com)

Blackwell is Nvidia’s newest data-center chip family, built for training and running large artificial intelligence models. Nvidia says its GB200 NVL72 system links 72 Blackwell graphics processing units so they can act like one large machine. (nvidia.com 1) (nvidia.com 2)

Nvidia and its cloud partners have been pitching Blackwell for “reasoning” and agentic artificial intelligence systems, which generate more tokens and stay on hardware longer than simpler chatbot requests. CoreWeave began general availability of GB200 NVL72-based instances on February 4, 2025, with Nvidia saying the setup was aimed at “AI reasoning models and agents.” (blogs.nvidia.com 1) (blogs.nvidia.com 2)

The price rise points to a squeeze in usable rented compute, not only a shortage of chips leaving factories. When customers keep graphics processing units occupied for longer stretches, fewer machines cycle back into the rental pool. (techmeme.com) (the-decoder.com)

The same strain is showing up in service limits and outages. The Wall Street Journal, as summarized by The Decoder on April 13, reported Anthropic’s application programming interface availability at 98.95% versus a 99.99% industry target, and said OpenAI had shut down Sora to redirect compute to coding and enterprise products. (the-decoder.com) (techcrunch.com)

Usage has also climbed fast. The Wall Street Journal, again via secondary reports published April 13, said OpenAI’s application programming interface traffic rose from 6 billion tokens a minute in October to 15 billion by March. (the-decoder.com) (aidailypost.com)

Nvidia says Blackwell is already “in full production,” and its DGX B200 server promises 15 times the inference performance of the prior generation in some workloads. Even with that added supply, the rental market is getting tighter as customers use more compute per task. (nvidia.com 1) (nvidia.com 2)

That leaves the clearest signal in the hourly rate itself: more Blackwell hardware is reaching the market, but rented capacity is still getting harder to buy at last quarter’s price. (techmeme.com) (ornn.com)
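
For readers who want to check the figures quoted above, the short sketch below reruns the arithmetic on the reported numbers only. The helper names `pct_change` and `downtime_hours_per_year` are illustrative, and the 365-day year used to turn availability into hours of downtime is an assumption, not something the cited reports specify.

```python
# Sanity check of the figures quoted in the article.
# Helper names and the 365-day year are assumptions made for illustration.

def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


def downtime_hours_per_year(availability_pct: float,
                            hours_per_year: float = 365 * 24) -> float:
    """Hours of unavailability implied by an availability percentage."""
    return (1 - availability_pct / 100) * hours_per_year


# Hourly Blackwell rental price: $2.75 two months ago, about $4.08 now.
price_rise = pct_change(2.75, 4.08)

# OpenAI API traffic: 6 billion tokens a minute in October, 15 billion by March.
token_growth = 15 / 6

# Availability: 98.95% reported versus a 99.99% industry target.
reported_downtime = downtime_hours_per_year(98.95)
target_downtime = downtime_hours_per_year(99.99)

print(f"Hourly price change: {price_rise:.1f}%")
print(f"Token traffic multiple: {token_growth:.1f}x")
print(f"Implied downtime at 98.95%: {reported_downtime:.1f} hours a year")
print(f"Implied downtime at 99.99%: {target_downtime:.1f} hours a year")
```

On these inputs the price change comes to about 48.4%, in line with the reported 48%, and token traffic grew 2.5 times. The gap between 98.95% and 99.99% availability works out to roughly 92 hours of downtime a year against less than one hour.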
