Google Unveils 8th-Gen TPUs at Cloud Next: AI Training 3x Faster at 80% Better Performance Per Dollar
Google announced its eighth-generation tensor processing units (TPUs) at Cloud Next 2026 on April 22, the most significant chip upgrade in the company's AI infrastructure history. The new TPU family splits into two tiers, a high-performance cluster chip and a cost-optimized inference chip, each built for a different stage of the AI workload.

The headline numbers are striking: up to 3x faster AI model training than the previous generation, 80% better performance per dollar, and the ability to coordinate more than one million TPUs in a single cluster for the largest training runs in the world.

For businesses running AI at scale on Google Cloud, this isn't just a speed story; it's a cost story. If training a custom model previously took 10 days and cost $50,000, the new TPUs bring that to roughly 3 days and $10,000. Those savings compound quickly when you're running dozens of fine-tuning jobs or continuous model-evaluation pipelines.

Google also confirmed it will continue supporting NVIDIA GPUs on its platform for customers with existing workloads. The new TPUs don't replace NVIDIA in the market; they give Google Cloud customers a cheaper, faster alternative for workloads that can be ported.

The broader context: Google processes more than 16 billion AI tokens per minute via direct API use by cloud customers, up from 10 billion last quarter. The new chips are what makes that growth sustainable without runaway infrastructure costs.

Availability begins in Q2 2026 for select Google Cloud customers, with general availability expected in Q3.
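The savings example above can be sketched as back-of-the-envelope arithmetic. This is a minimal illustration, not official pricing: it takes the quoted 3x speedup at face value and assumes, as the article's $50,000-to-$10,000 example implies, that the 80% figure translates to an 80% lower cost for the same job. The function name and default parameters are illustrative.

```python
def new_tpu_estimate(days: float, cost_usd: float,
                     speedup: float = 3.0,
                     cost_reduction: float = 0.8) -> tuple[float, float]:
    """Estimate (days, cost) for the same training job on the new TPUs.

    Assumptions (from the announcement, taken at face value):
      speedup        -- 3x faster training
      cost_reduction -- 80% lower cost for an equivalent job
    """
    return days / speedup, cost_usd * (1 - cost_reduction)


# The article's example: a 10-day, $50,000 training run.
days, cost = new_tpu_estimate(10, 50_000)
print(f"~{days:.1f} days, ~${cost:,.0f}")
```

Under these assumptions the 10-day, $50,000 job comes out to roughly 3.3 days and $10,000, matching the article's figures.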