Compute concentration with Google
A report finds more than 60% of global AI compute capacity sits with hyperscalers and highlights Google as the largest owner of AI compute, raising questions about pricing power and access. That concentration increases the strategic value of provider abstraction layers and platform‑level hedges. For organisations planning long‑term GenAI platforms, it means architecture should anticipate volatility in availability and cost. (networkworld.com)
A new count of the world’s artificial intelligence hardware says more than 60% of global capacity now sits inside a handful of hyperscalers, and Google is the single biggest owner of that compute. The estimate cited by Network World puts Google at roughly a quarter of global capacity, ahead of rivals that leaned harder on Nvidia graphics processors. (networkworld.com)
That is a strange result if you followed the last two years of headlines, because most of the noise was about Nvidia’s H100 chips selling out. Google got to the front by deploying millions of its own tensor processing units, custom chips it has been designing for years, instead of buying most of its fleet from Nvidia. (networkworld.com) (cloud.google.com)
A hyperscaler is just a cloud company with data centers so large they work like industrial utilities. When three or four of those firms control most of the machines needed to train and run large language models, access to artificial intelligence starts to look less like buying servers and more like renting factory time from a few landlords. (networkworld.com) (oecd.org)
Google’s advantage did not come from one chip alone. Its Cloud TPU v5p system is sold as part of an “artificial intelligence hypercomputer” stack that combines chips, networking, software, and scheduling so customers can train and serve models on giant clusters instead of stitching the parts together themselves. (cloud.google.com 1) (cloud.google.com 2)
Those clusters are enormous by normal computing standards. Google says one TPU v5p Pod contains 8,960 chips linked with high-speed interconnects, which is the kind of scale needed for frontier model training and for serving millions of prompts after the model goes live. (docs.cloud.google.com)
Amazon is trying the same playbook from a different angle. Its Trainium2 systems are custom Amazon chips rather than Nvidia chips, and Amazon says its Trn2 instances deliver 30% to 40% better price performance than certain graphics-processor-based Elastic Compute Cloud offerings. (aws.amazon.com 1) (aws.amazon.com 2)
That changes the cloud market in a very specific way. If the biggest providers own the chips, the networking, the software layer, and the spare capacity, then they can compete on price one quarter and on availability the next, and customers cannot assume the cheapest place to run a model today will still be the cheapest next year. (networkworld.com) (semianalysis.com)
It also explains why “provider abstraction” suddenly matters. That phrase means building your applications so the model, the vector database, and the orchestration layer can move between Google Cloud, Amazon Web Services, Microsoft Azure, or an outside application programming interface without rewriting the whole product each time. (networkworld.com)
The reason companies bother with that extra engineering is simple: capacity shocks are real. SemiAnalysis tied artificial-intelligence data-center demand to about 90 terawatt-hours of electricity by 2026, and when electricity, networking gear, and advanced chips all tighten at once, the bottleneck is not your software team but whether your provider can actually give you machines. (semianalysis.com)
So the Google story is not only that one company owns a huge share of the world’s artificial intelligence compute. It is that the winners are starting to look like the firms that can build a full stack from silicon to software, which leaves everyone else planning around a market where cost and access can swing with the decisions of a few very large clouds. (networkworld.com) (cloud.google.com)
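For readers who want to see what “provider abstraction” can look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions, not any vendor’s API: the `Provider` and `ProviderRouter` names, the prices, and the stub backends are all hypothetical. The idea is only that the application calls one interface, and the router tries whichever backend currently has capacity, cheapest first, falling back when one fails.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Provider:
    """One interchangeable model backend. All fields are hypothetical."""
    name: str
    cost_per_1k_tokens: float       # illustrative price, not a real quote
    available: bool                 # whether capacity can be obtained right now
    complete: Callable[[str], str]  # sends a prompt, returns generated text


class ProviderRouter:
    """Sends each request to the cheapest provider that has capacity."""

    def __init__(self, providers: list[Provider]) -> None:
        self.providers = providers

    def _candidates(self) -> list[Provider]:
        # Available providers only, ordered from cheapest to most expensive.
        usable = [p for p in self.providers if p.available]
        return sorted(usable, key=lambda p: p.cost_per_1k_tokens)

    def pick(self) -> Optional[Provider]:
        # First choice before any fallback, or None if nothing is available.
        candidates = self._candidates()
        return candidates[0] if candidates else None

    def complete(self, prompt: str) -> str:
        # Walk the candidates in price order; skip any that raise an error.
        for provider in self._candidates():
            try:
                return provider.complete(prompt)
            except RuntimeError:
                continue  # assumed to signal a capacity error or outage
        raise RuntimeError("no provider could serve the request")


def make_stub(name: str) -> Callable[[str], str]:
    # Stand-in for a real client call; simply echoes the prompt.
    return lambda prompt: f"[{name}] {prompt}"


def failing_backend(prompt: str) -> str:
    # Simulates a provider that is listed as available but has no capacity.
    raise RuntimeError("capacity unavailable")


router = ProviderRouter([
    Provider("cloud-a", 0.50, True, failing_backend),
    Provider("cloud-b", 0.80, True, make_stub("cloud-b")),
    Provider("cloud-c", 0.30, False, make_stub("cloud-c")),
])

print(router.complete("hello"))  # prints "[cloud-b] hello"
```

In this toy run the cheapest backend is marked unavailable and the next cheapest fails, so the request lands on the third. A real system would also need to handle differences in model behavior, data residency, and authentication across providers, which this sketch leaves out.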