Uber tests Graviton4 + Trainium3

Uber moved its core matching system to AWS Graviton4 processors and piloted Trainium3 to train the models that predict ETAs and personalize rides, reporting faster matches for millions of daily users (x.com). The example shows how cost-efficient Arm servers can be combined with dedicated AI accelerators to scale latency-sensitive ML in production (x.com).

Most people think a ride-hailing app is waiting on maps. A lot of the delay is actually the app deciding which driver should get which rider, and Uber says it now does more of that work on Amazon Web Services Graviton4 chips that are tuned for fast, lower-cost cloud computing. (zawya.com)

That matching job happens inside what Uber calls Trip Serving Zones, which are small operating areas that constantly rebalance supply and demand as riders open the app, drivers move, and delivery orders arrive. Uber says those zones now run on Graviton4 to help match customers with drivers in milliseconds and to handle demand spikes for rides and deliveries. (zawya.com)

A central processing unit is the general-purpose brain of a server, like the main kitchen in a restaurant that can cook almost anything. Graviton4 is Amazon’s latest Arm-based central processing unit, and Amazon says it is built to improve price performance for the broad server work that companies run all day. (aws.amazon.com)

Uber’s second test is a different kind of chip for a different job. It is piloting Trainium3 to train the machine learning models that predict estimated arrival times and personalize what riders and eaters see in the app. (zawya.com)

Training a model is the expensive part where a system studies huge piles of past trips and learns patterns, like a dispatcher reviewing years of traffic logs before making tomorrow’s schedule. Amazon says Trainium3 is a purpose-built artificial intelligence accelerator for training and inference, and its Trn3 systems are aimed at large-scale model work rather than everyday app serving. (aws.amazon.com, aws.amazon.com)

Uber already uses machine learning in places riders notice without thinking about it. Its engineering team has said estimated arrival times affect fares, pickup estimates, rider-driver matching, delivery planning, home-feed ranking, and fraud detection across the company’s products. (uber.com, uber.com)

That is why this deal splits the work in two instead of putting everything on one chip. Graviton4 handles the live marketplace decisions that have to happen in milliseconds, while Trainium3 is being tested for the heavier offline model-training work that improves those decisions later. (zawya.com, aws.amazon.com)

Uber has been building around Amazon Web Services for years, but this move shows a deeper bet on Amazon’s own silicon instead of standard cloud processors alone. TechCrunch reported on April 7, 2026 that Uber was expanding its Amazon Web Services contract to run more ride-sharing features on Graviton and begin a Trainium3 trial. (techcrunch.com)

Amazon is pushing that same custom-chip play across its cloud business. Amazon says Trainium3 delivers up to 3 times the performance of Trainium2 and more than 5 times higher output tokens per megawatt on Amazon Bedrock, which is its managed artificial intelligence platform. (aws.amazon.com)

Uber’s part of the story is less about flashy chatbots than about shaving tiny delays off a system used millions of times a day. When a company at Uber’s scale says milliseconds matter, it usually means the hardware choice has become part of the product, not just part of the back room. (newsbreak.com, zawya.com)

