H100 rentals jump to $2.82/hr
- SemiAnalysis’ March 2026 index put Nvidia H100 spot rentals at $2.82 per GPU-hour, while Runpod listed H100 PCIe on-demand capacity at $2.39.
- SemiAnalysis showed H100 spot pricing near $2.79 in July 2025, then $2.83 in February 2026, with on-demand listings marked sold out in February.
- Cheap inference alternatives are expanding as startups shift to open models and brokered capacity markets. (techcrunch.com)
Nvidia H100 rentals are expensive again, but the market is split: SemiAnalysis pegged March 2026 spot pricing at $2.82 per GPU-hour, while Runpod advertised H100 PCIe from $2.39. (semianalysis.com) (runpod.io)

SemiAnalysis’ pricing index showed H100 spot rates falling from $6.57 in the second half of 2023 to $2.79 in July 2025, then climbing back to $2.83 in February 2026 and $2.82 in March 2026. (semianalysis.com) That same SemiAnalysis table marked H100 on-demand capacity as sold out in February and March 2026, while one-month contract pricing widened to roughly $2.00 to $2.70 per GPU-hour in March. (semianalysis.com)

An H100 is Nvidia’s Hopper-generation data center graphics processor, the workhorse many companies use to train models and serve responses after a user sends a prompt. Runpod’s current menu lists H100 PCIe at $2.39 an hour, H100 SXM at $2.99, and H100 NVL at $3.07. (runpod.io) Those differences matter because “H100” is not one single product in the rental market. PCIe, SXM, and NVL versions have different memory layouts, power envelopes, and packaging, so posted prices vary by configuration and provider. (runpod.io 1) (runpod.io 2)

The cheapest visible H100 listings are lower on marketplace platforms. Vast.ai showed H100 PCIe at $1.53 an hour on April 27, 2026, underscoring how brokered capacity can undercut managed clouds. (vast.ai)

Startups are feeling that spread most acutely in inference, the stage where a model answers live requests and every extra cent hits gross margin. Parasail chief executive Mike Henry told TechCrunch his company now handles 500 billion tokens a day across rented and owned capacity. (techcrunch.com) TechCrunch also reported that Elicit chief executive Andreas Stuhlmüller has shifted more work to open models because sending hundreds of thousands of requests to external application programming interfaces had become too costly. (techcrunch.com)

This squeeze is not new. TechCrunch reported in October 2024 that Andreessen Horowitz built its Oxygen cluster after founders said they were being deprioritized for H100 access by larger cloud customers. (techcrunch.com)

The upshot in April 2026 is a market with no single H100 price, but a clear pattern: scarce top-tier capacity still commands a premium, and startups are increasingly mixing brokers, neoclouds, and smaller open-model stacks to keep inference bills down. (semianalysis.com) (techcrunch.com)
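
For readers who want to size the spread themselves, the arithmetic behind the quoted prices can be sketched in a few lines of Python. This is a rough illustration, not a pricing model: the hourly figures are the ones cited above, while the function names, the 730-hour month (one GPU running nonstop), and the choice of the March 2026 spot index as the baseline are assumptions made for the example. Spot, on-demand, and marketplace listings are also not strictly like-for-like products.

```python
# Hourly H100 prices quoted in the article, in USD per GPU-hour.
PRICES = {
    "SemiAnalysis spot index (Mar 2026)": 2.82,
    "Runpod H100 PCIe": 2.39,
    "Runpod H100 SXM": 2.99,
    "Runpod H100 NVL": 3.07,
    "Vast.ai H100 PCIe (Apr 27, 2026)": 1.53,
}

# Assumption for illustration: 24 * 365 / 12 hours, i.e. one GPU running all month.
HOURS_PER_MONTH = 730


def pct_vs(reference: float, price: float) -> float:
    """Percent by which `price` sits above (+) or below (-) `reference`."""
    return (price - reference) / reference * 100.0


def monthly_cost(price_per_hour: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost of one GPU for `hours` hours at a flat hourly rate."""
    return price_per_hour * hours


def summarize(prices: dict, reference_key: str) -> list:
    """Return (name, price, percent vs reference, monthly cost) rows, cheapest first."""
    reference = prices[reference_key]
    return [
        (name, price, pct_vs(reference, price), monthly_cost(price))
        for name, price in sorted(prices.items(), key=lambda item: item[1])
    ]


if __name__ == "__main__":
    baseline = "SemiAnalysis spot index (Mar 2026)"
    for name, price, delta, monthly in summarize(PRICES, baseline):
        print(f"{name:<36} ${price:.2f}/hr  {delta:+6.1f}% vs spot  ${monthly:,.0f}/mo")
```

On these figures, the Vast.ai listing sits roughly 46% below the March spot index, and the gap between the cheapest and most expensive listing is about $1.54 per GPU-hour before any volume or contract discounts.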