Broadcom–Google deal flags silicon shift
Coverage of a new five‑year Broadcom–Google agreement is being read as evidence that hyperscalers are building bespoke AI infrastructure and trying to diversify away from a single GPU vendor. Analysts quoted in the reporting argue cloud providers will keep investing in custom silicon and networking because inference economics favour specialized stacks at scale. For serving teams, that suggests hardware‑agnostic portability could become a competitive advantage. (aol.com)
Broadcom disclosed on April 6 that Google signed a long-term deal for Broadcom to develop future Tensor Processing Units, which are Google’s in-house artificial intelligence chips, and to supply networking parts for Google’s next artificial intelligence racks through 2031. That is not a one-off chip order; it is a reservation on the plumbing of Google’s next five years of data centers. (sec.gov)
A Tensor Processing Unit is a custom chip built for neural networks, the math engines behind tools like Gemini. A general-purpose graphics processing unit is more like a Swiss Army knife, while a Tensor Processing Unit is more like a factory machine built to stamp out one part all day. (cloud.google.com)
Google has been building these chips for years, and its latest public push is aimed hard at inference, which is the stage where a trained model answers real user requests. Google says its Ironwood Tensor Processing Unit is built for high-throughput, low-latency inference, which is the expensive part once millions of people start using a model every day. (cloud.google.com)
Inference is where the economics change. Google’s Cloud TPU documents say serving jobs prioritize latency service-level objectives, and its earlier TPU v5e launch said the platform was designed for lower-cost inference at scale, which is why cloud companies keep chasing chips tuned to their own workloads. (docs.cloud.google.com) (cloud.google.com)
The Broadcom filing also tied in Anthropic, which will get access starting in 2027 to about 3.5 gigawatts of next-generation Tensor Processing Unit compute through Google’s system. A gigawatt is power-plant language, and using it here tells you the industry is no longer talking about chip samples or pilot clusters. (sec.gov)
Broadcom is not just doing compute chips in this arrangement. The same filing says it will supply networking and other components for Google’s next-generation racks, which means the value is shifting from a single processor to the full box of chips, links, switches, and memory paths that move model traffic around the building. (sec.gov)
That networking piece matters because artificial intelligence systems break when the chips wait on each other. Broadcom has spent the last two years pushing Ethernet-based artificial intelligence fabrics, including Jericho3-AI for clusters up to 32,000 graphics processing units and newer 800-gigabit network cards aimed at dense artificial intelligence racks. (broadcom.com 1) (broadcom.com 2)
Wall Street read the Google contract as proof that hyperscalers still want custom silicon instead of buying everything from Nvidia forever. Reuters reported the deal runs through 2031, and CNBC said Broadcom shares jumped more than 6% as analysts argued it gave Broadcom clearer demand visibility with its biggest customer. (reuters.com) (cnbc.com)
This does not mean Nvidia disappears. It means the largest cloud companies are building a two-track system: buying general-purpose graphics processing units where they need flexibility, while pouring billions into custom chips where the workload is repetitive enough to justify a dedicated machine. (cloud.google.com) (reuters.com)
For software teams, the practical takeaway is ugly and simple. If your model serving stack only runs well on one vendor’s hardware, you are tying your costs, capacity, and launch schedule to one supplier at the exact moment Google, Broadcom, Anthropic, and others are building a more fragmented map of chips and networks. (docs.cloud.google.com) (sec.gov)
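One way to picture the portability point for a serving team is a thin layer that hides the hardware choice from the request path. The sketch below is illustrative only and is not taken from any of the cited sources: the `Backend` type, the `pick_backend` and `serve` helpers, the backend names, and the availability flags are all hypothetical, and the "model" is a stand-in function that doubles its inputs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Backend:
    """One hardware target the serving stack could run on (hypothetical)."""
    name: str                                       # e.g. "tpu", "gpu", "cpu"
    is_available: Callable[[], bool]                # capacity or driver check
    run: Callable[[Sequence[float]], List[float]]   # stand-in for inference


def pick_backend(backends: Sequence[Backend], preferred: Sequence[str]) -> Backend:
    """Return the first available backend, walking the preference order."""
    by_name = {b.name: b for b in backends}
    for name in preferred:
        backend = by_name.get(name)
        if backend is not None and backend.is_available():
            return backend
    raise RuntimeError("no configured backend is available")


def _double(xs: Sequence[float]) -> List[float]:
    """Placeholder model: the same logic runs on every backend."""
    return [2.0 * x for x in xs]


# Assumed fleet state for the example: accelerators are out of capacity,
# so requests fall through to the CPU backend.
BACKENDS = [
    Backend("tpu", lambda: False, _double),
    Backend("gpu", lambda: False, _double),
    Backend("cpu", lambda: True, _double),
]


def serve(
    request: Sequence[float],
    preferred: Sequence[str] = ("tpu", "gpu", "cpu"),
) -> Tuple[str, List[float]]:
    """Route one request to whichever preferred backend is available."""
    backend = pick_backend(BACKENDS, preferred)
    return backend.name, backend.run(request)
```

The design choice here is that the preference order is data rather than code, so a shift in supplier, price, or capacity becomes a configuration change. A stack with no such seam would need a rewrite for the same shift.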