Blackwell GPU rents spike
Hourly rental rates for Nvidia’s Blackwell GPUs have jumped sharply, pushing spot prices to $4.08 per hour, a 48% rise from $2.75 two months earlier. The surge reflects growing demand for agentic AI workloads and is already raising the operating cost of inference-heavy products. (intellectia.ai)
Renting Nvidia’s newest Blackwell chips now costs materially more than it did in February, with spot rates rising fast enough to show up in product budgets. (techmeme.com) The Ornn Compute Price Index put Blackwell spot pricing at $4.08 an hour on April 13, up 48% from $2.75 two months earlier, according to a Wall Street Journal report aggregated by Techmeme. (techmeme.com)

That jump is tied to demand for “agentic” artificial intelligence systems, which do more back-and-forth work per user request than a simple chatbot and keep graphics processing units busy for longer stretches. Nvidia says Blackwell was built for this inference work, the stage when a trained model generates answers for users. (techmeme.com) (blogs.nvidia.com)

A graphics processing unit, or GPU, is the specialized chip that powers most modern artificial intelligence. Companies rent those chips by the hour from cloud providers instead of buying entire server fleets up front, so a higher hourly rate flows quickly into the cost of serving each query. (blogs.nvidia.com) (morningstar.com)

Blackwell’s appeal is that it can lower the cost of each answer even when the chip itself is expensive to rent. Nvidia said in March 2024 that Blackwell could cut operating cost and energy use for large language model inference by up to 25 times versus its predecessor in some workloads. (nvidianews.nvidia.com) Nvidia published production data on February 12 showing Baseten, DeepInfra, Fireworks AI, and Together AI cutting cost per token by 4 times to 10 times on Blackwell systems running open-source models. Those examples covered healthcare, gaming, customer service, and agentic chat products. (blogs.nvidia.com)

The squeeze is that lower cost per token does not guarantee lower total spending when usage is exploding. The Wall Street Journal report said some artificial intelligence companies are already rationing offerings and products as compute gets harder to secure at acceptable prices. (techmeme.com)

Pricing also varies widely by provider and contract type, which helps explain why an index matters. Public cloud listings compiled in March and April showed Blackwell B200 rates ranging from about $5.98 to $6.08 an hour for single-GPU on-demand rentals at some providers, while other trackers showed lower spot prices and higher cluster pricing. (deploybase.ai) (getdeploying.com)

Ornn said on April 2 that it had added its compute price index to the Bloomberg Terminal, a sign that graphics processing unit capacity is being tracked more like a traded input than a back-office expense. (morningstar.com)

If Blackwell rents stay near $4 an hour or keep climbing, the next test is whether artificial intelligence companies can make each user interaction valuable enough to absorb the bill. (techmeme.com)
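
The arithmetic behind the squeeze can be sketched in a few lines of Python. This is an illustration only, not a model used by any of the companies or indexes cited. The $2.75 and $4.08 spot prices and the 4x to 10x cost-per-token cuts come from the reporting above; the usage growth multiples and the function names `percent_change` and `spend_ratio` are hypothetical, chosen to show how total spending can rise even as each token gets cheaper.

```python
def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def spend_ratio(cost_cut: float, usage_growth: float) -> float:
    """Return new total spend divided by old total spend.

    Total spend = cost per token * tokens served, so if cost per token
    is divided by cost_cut while tokens served are multiplied by
    usage_growth, spend changes by usage_growth / cost_cut.
    """
    return usage_growth / cost_cut


# Reported figures: Blackwell spot price per GPU-hour.
earlier_rate = 2.75  # two months before April 13
april_rate = 4.08    # April 13
rise = percent_change(earlier_rate, april_rate)
print(f"Spot price change: {rise:.0f}%")

# The 4x and 10x cost-per-token cuts are the range Nvidia reported.
# HYPOTHETICAL: the usage growth multiples below are illustrative only.
for cost_cut in (4, 10):
    for usage_growth in (2, 6, 20):
        ratio = spend_ratio(cost_cut, usage_growth)
        print(f"cost/token cut {cost_cut}x, usage up {usage_growth}x: "
              f"total spend x{ratio:.2f}")
```

Under these assumptions, any case where usage grows faster than cost per token falls yields a ratio above 1, meaning a larger total bill despite the cheaper tokens.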