AI infrastructure is getting pricier
- Memory and AI-chip demand are outstripping supply, pushing infrastructure costs higher for AI workloads.
- SK Hynix posted record profit and said AI-chip demand exceeds capacity, while Nvidia backed Vast Data and expanded Google Cloud ties.
- Higher infrastructure costs tighten vendor economics, so buyers and SEs will ask hard questions about inference cost and deployment choices (reuters.com) (cnbc.com) (artificialintelligence-news.com)
Running an artificial intelligence model takes chips, memory and networking, and all three are getting harder to buy in volume. SK Hynix said on April 23 that demand for AI memory will exceed manufacturing capacity, a sign that supply is still tight even after a year of data-center buildouts. (reuters.com) (news.skhynix.com)

SK Hynix, one of Nvidia’s key memory suppliers, reported first-quarter revenue of 17.64 trillion won and operating profit of 7.44 trillion won, up 42% and 406% from a year earlier. The company said high-bandwidth memory, the fast memory stacked next to AI processors, drove the results as cloud customers kept spending on AI servers. (reuters.com) (cnbc.com)

High-bandwidth memory works like a short, wide on-ramp feeding data into a graphics processor, and AI chips need a lot of it to train models and answer prompts quickly. SK Hynix said customer demand is already above available supply capacity, while CNBC reported analysts see memory shortages lasting for years because new capacity is slow and expensive to add. (news.skhynix.com) (cnbc.com)

The spending is spreading beyond chips into the rest of the stack. Vast Data, which sells software for storing and moving the huge datasets that feed AI systems, said on April 22 that it raised $1 billion at a $30 billion valuation, with Nvidia among the backers. (cnbc.com)

Cloud providers are answering with bigger systems and promises of lower running costs. Google Cloud and Nvidia said this week they are expanding their partnership around new A5X instances based on Nvidia’s Vera Rubin platform, while Nvidia said those systems are aimed at lower inference cost per token and higher throughput per watt than the prior generation. (cloud.google.com) (blogs.nvidia.com)

Google is also pushing its own custom chips for the same reason.
CNBC reported on April 22 that Google introduced new Tensor Processing Units for both training and inference, with the company arguing that lower-cost inference matters as customers shift from building models to running them at scale. (cnbc.com)

That leaves buyers comparing not just model quality, but the cost of each answer. When memory prices stay high and top-end graphics processors remain constrained, vendors have less room to subsidize usage, and customers have more reason to ask whether a workload belongs on Nvidia chips, custom chips, or smaller models. (reuters.com) (cnbc.com)

For now, the market signal is simple: the companies selling the plumbing are printing cash, and the companies buying it are still racing to lock in supply. Until memory and accelerator capacity catch up, the price of building and running AI systems is likely to stay under upward pressure. (reuters.com) (cnbc.com)