Google launches TPU 8 chips

Published by The Daily Scout

What happened

  • Google introduced new TPU 8 chips this week aimed at both training and inference workloads.
  • The announcement included TPU 8t for training and TPU 8i for inference specifically.
  • Hyperscalers building faster in‑house silicon reinforces demand for IP and faster custom‑design paths. (techcrunch.com)

Why it matters

Google split its newest artificial intelligence chip line in two on April 22, introducing one processor for training models and another for running them in production. (blog.google)

The chips are called TPU 8t and TPU 8i, and Google announced them at its Cloud Next 2026 conference. Google said both will be generally available later this year through Google Cloud. (cloud.google.com)

Training is the stage where a model learns from huge data sets; inference is the stage after that, when the model answers prompts or takes actions for users. Google said newer AI systems need different hardware for those two jobs, especially as “agents” handle longer, multi-step tasks. (cnbc.com)

Google said TPU 8t is built for large training runs and can scale to 9,600 chips in one superpod using the company’s 3D torus network. The company said TPU 8i is tuned for low-latency inference and reinforcement learning, with Arm-based Axion central processors attached across the system to keep data moving. (cloud.google.com)

The company’s pitch is speed and cost. Google told customers TPU 8t can deliver up to 3x faster model training, while TPU 8i offers about 80% better performance per dollar than the prior Ironwood generation at low-latency targets. (techcrunch.com)

Google also changed the memory mix for inference. CNBC reported TPU 8i carries 384 megabytes of static random-access memory, triple the amount in Ironwood, a design choice aimed at serving models faster. (cnbc.com)

This is Google’s first flagship TPU generation split into separate training and inference chips. The company has used its own AI processors internally since 2015 and began renting TPUs to cloud customers in 2018. (cnbc.com)

Google is not replacing Nvidia across its cloud. TechCrunch reported Google still plans to offer Nvidia’s Vera Rubin systems later this year and is also working with Nvidia on networking software called Falcon. (techcrunch.com)

Amazon Web Services has already split its custom AI silicon between Inferentia for inference and Trainium for training, and Microsoft announced a second-generation AI chip in January. Google’s move puts it more squarely in the same playbook as other cloud providers building in-house chips for specific workloads. (cnbc.com)

The immediate test is whether cloud customers buy enough of those specialized systems to shift spending away from general-purpose graphics processors. For now, Google is expanding its own silicon while still selling Nvidia’s. (techcrunch.com)

Key numbers

  • April 22: Google announced TPU 8t and TPU 8i at its Cloud Next 2026 conference. (blog.google)
  • 9,600 chips: the size Google says a single TPU 8t superpod can scale to. (cloud.google.com)
  • Up to 3x: how much faster Google says TPU 8t can train models. (techcrunch.com)
  • About 80%: the performance-per-dollar gain Google claims for TPU 8i over the prior Ironwood generation at low-latency targets. (techcrunch.com)
  • 384 megabytes: the static random-access memory on TPU 8i, triple the amount in Ironwood, according to CNBC. (cnbc.com)

What happens next

  • Google said both chips will be generally available later this year through Google Cloud. (cloud.google.com)
  • Google still plans to offer Nvidia’s Vera Rubin systems later this year, TechCrunch reported. (techcrunch.com)
  • The immediate test is whether cloud customers buy enough of the specialized chips to shift spending away from general-purpose graphics processors. (techcrunch.com)
