Anthropic in talks to buy inference chips from London startup Fractile to ease compute squeeze
- Anthropic has held early talks to buy inference chips from London startup Fractile, as Claude’s rapid growth turns model serving, not just training, into a bottleneck.
- The bet is Fractile’s SRAM-heavy design, which the company says can run frontier-model inference up to 25x faster and at one-tenth the cost.
- Anthropic is already spread across Nvidia GPUs, Google TPUs, and AWS Trainium, so a Fractile deal would widen a clear supplier-diversification push.

AI chips are splitting into two businesses. One is training giant models. The other is serving them to millions of users without the bill exploding. That second job, inference, is where Anthropic seems to be looking for help. The company has reportedly held early talks to buy inference chips from London startup Fractile, a sign that the pressure point for frontier AI is shifting from building models to running them cheaply at scale.

### Why is inference suddenly the problem?

Training gets the headlines because it uses absurd amounts of compute in short bursts. But once a model is live, every prompt, every long context window, and every coding session turns into ongoing infrastructure cost. Anthropic itself said on April 6 that customer demand had accelerated sharply and that its run-rate revenue was rising just as fast.

### What is Fractile actually making?

Fractile is building hardware specifically for frontier-model inference. Its pitch is blunt: run advanced models up to 25x faster and at one-tenth the cost. The company’s design focuses heavily on SRAM, which is faster and sits closer to compute than the DRAM systems that often become the choke point when large models are queried over and over.

Inference is often a memory traffic problem disguised as a compute problem. Large models need weights and cached context moved around constantly, and that movement is expensive in both time and money. Memory prices have also become their own headache: one recent industry snapshot noted DRAM prices had jumped roughly 7x over the prior year as hyperscalers raced to expand AI infrastructure.

### Why would Anthropic look beyond Nvidia?

Basically, resilience and economics. Anthropic said in April that it already runs Claude across AWS Trainium, Google TPUs, and Nvidia GPUs so it can match workloads to the best hardware.
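
One way to see why matching workloads to hardware matters, and why inference tends to be limited by memory traffic rather than raw compute, is a back-of-the-envelope estimate. The sketch below is illustrative only: the `Accelerator` class, the function names, the model size, and every chip figure are hypothetical assumptions, not specifications of Fractile, Nvidia, Google, or AWS hardware. It assumes each generated token requires streaming all model weights once plus the cached context, and roughly 2 floating-point operations per parameter.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical chip spec; the numbers used below are illustrative only."""
    name: str
    peak_flops: float       # peak compute, FLOP/s
    mem_bandwidth: float    # bytes/s between weight memory and compute units


def decode_step_times(params, bytes_per_param, kv_bytes_per_seq, batch, chip):
    """Return (memory_seconds, compute_seconds) lower bounds for one decode step."""
    # Assumption: every step streams all weights once, plus each sequence's cached context.
    bytes_moved = params * bytes_per_param + batch * kv_bytes_per_seq
    # Assumption: about 2 FLOPs per parameter per generated token.
    flops = 2.0 * params * batch
    return bytes_moved / chip.mem_bandwidth, flops / chip.peak_flops


def limiting_factor(memory_seconds, compute_seconds):
    """Name whichever lower bound dominates the step."""
    return "memory" if memory_seconds > compute_seconds else "compute"


# Made-up 70B-parameter model with 16-bit weights and 1 GB of cached context per sequence.
chip = Accelerator("hypothetical-dram-chip", peak_flops=1e15, mem_bandwidth=3e12)
mem_s, cmp_s = decode_step_times(70e9, 2, 1e9, batch=1, chip=chip)
print(f"memory {mem_s * 1e3:.1f} ms, compute {cmp_s * 1e3:.2f} ms, "
      f"bound by {limiting_factor(mem_s, cmp_s)}")
```

With these made-up figures, the time spent moving bytes is several hundred times larger than the time spent on arithmetic, which is the intuition behind designs that put faster memory closer to compute. Real systems batch requests and overlap transfers, so treat this as a rough lower-bound model rather than a performance prediction.
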

If Fractile gets added, that would look less like a one-off experiment and more like a deliberate strategy: diversify suppliers, fit chips to workloads, and avoid letting one hardware bottleneck cap product growth.

### Is this a near-term deployment?

Probably not at scale. Reporting around the talks says they are early, and Fractile’s chips are not expected to be commercially ready until around 2027. So this is better read as a directional signal than an immediate procurement