Training costs top $100M per model
- Stanford’s AI Index and recent analyst papers say training frontier models such as OpenAI’s GPT-4 and Google’s Gemini Ultra now runs up nine-figure compute bills.
- Stanford’s work lists GPT-4 at roughly $78M in compute and Gemini Ultra near $191M; development, testing, and safety work push totals higher.
- That scale is driving hyperscalers into multibillion-dollar capex cycles and prompting comparisons of AI data centers to nineteenth-century railroads.
Lede

AI model training is getting absurdly expensive, and not in a one-off way. Frontier language and multimodal models now routinely require tens to low hundreds of millions of dollars just for the compute runs that create them. The price shows up in big line items (repeated training runs, safety testing, and massive datasets), and it’s changing how companies plan capital spending. New public estimates for models like GPT-4 and Gemini Ultra make that plain.

Why are people saying “nine-figure” costs now?

Because the bill that used to live in research budgets moved into production budgets. The AI Index and multiple analyst estimates put compute for GPT-4 in the high tens of millions and Google’s Gemini Ultra in the low hundreds of millions. Those are compute-only price tags, not the full program cost. Add data work, human labeling, safety testing, and running multiple experiments, and the totals climb.

Which specific models hit those figures?

OpenAI’s GPT-4 gets cited at around $78M of compute. Google’s Gemini Ultra shows up near $191M in public estimates. Other frontier runs, including big Llama variants and closed research projects, appear in the same ballpark once you include extra training cycles. Those names are the poster children for the trend.

What actually drives a $100M bill?

Mostly GPU time: thousands of high-end accelerators running for weeks or months. Then there’s the repeated retraining and hyperparameter sweeps that multiply that baseline. Add human annotation, safety and red-team testing, cloud storage, and engineering time, and those all stack up fast. New papers model the growth and show hardware and repeated runs are the biggest multipliers.

How does inference compare to training on costs?

Training is a headline one-time cost; inference is the long tail. For widely used models, serving requests over months or years often dwarfs the original training spend.
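The arithmetic behind both bills can be sketched in a few lines of Python. This is a rough back-of-the-envelope illustration, not a method taken from the AI Index or any analyst paper. The function names and every input value (accelerator count, hourly price, request volume, per-request cost) are hypothetical placeholders chosen only to show how the multipliers interact.

```python
def training_compute_cost(num_accelerators, days, price_per_accel_hour, num_runs=1):
    """Compute-only training cost: accelerators x hours x hourly price x full runs."""
    hours = days * 24
    return num_accelerators * hours * price_per_accel_hour * num_runs


def inference_cost(requests_per_day, cost_per_request, days_in_service):
    """Cumulative serving cost over the time the model stays in production."""
    return requests_per_day * cost_per_request * days_in_service


# Hypothetical inputs for illustration only; none of these are reported figures.
train = training_compute_cost(
    num_accelerators=10_000,    # assumed cluster size
    days=90,                    # assumed length of one training run
    price_per_accel_hour=2.0,   # assumed dollars per accelerator-hour
)
serve = inference_cost(
    requests_per_day=100_000_000,  # assumed traffic
    cost_per_request=0.002,        # assumed dollars per request
    days_in_service=730,           # assumed two years in production
)

print(f"training compute: ${train / 1e6:.1f}M")
print(f"inference:        ${serve / 1e6:.1f}M")
print(f"inference / training: {serve / train:.1f}x")
```

Raising `num_runs` from 1 to 3 triples the training figure, which is the multiplier effect of repeated retraining and sweeps. Stretching `days_in_service` or `requests_per_day` shows how the serving bill can overtake the one-time training spend.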
That long tail is why teams obsess over quantization, distillation, and faster chips: the operational bill can be an order of magnitude larger than the training bill.

Why is this changing company behavior now?

Because the math demands scale. If a single model run costs nine figures, the only way to support services at global scale is vast infrastructure: data centers, racks, custom networking. That’s why analysts and execs are talking about multibillion-dollar and even trillion-dollar buildouts, and why investors compare AI data centers to railroads.

Can costs be tamed?

Yes, to a point. Fine-tuning pre-trained models, smarter training schedules, efficient hardware, and open-source innovation cut bills dramatically. But the frontier-performance race rewards more compute, not less, so optimizations save money without erasing the pressure to spend on scale and safety.

Who wins and who loses from this?

Winners: chip makers, cloud providers and niche GPU clouds, and big hyperscalers that can amortize buildouts. Losers: small AI startups that need access to frontier performance but can’t absorb nine-figure experiments. The catch is that the industry still needs competition and experimenters, and those need cheaper access to serious compute.

Bottom line

Training frontier models has moved from expensive curiosity to industrial cost center, and nine-figure compute runs are the new normal at the top. That’s why the AI era looks less like a software sprint and more like a multi-decade infrastructure build, with all the strategic, financial, and policy questions that follow.