Zero Latency rolls out Blackwell grid
- Zero Latency, which now brands itself as 0.lat, rolled out a U.S.-wide edge AI network called Zerogrid using NVIDIA Blackwell GPUs and Red Hat AI.
- The setup routes inference jobs to nearby edge datacenters based on latency, data locality, and capacity instead of sending everything to big centralized clouds.
- It matters because AI demand is shifting from training models centrally to serving real-time outputs near factories, transactions, and other data-heavy sites.


AI infrastructure is moving into a new phase. Training giant models in giant datacenters is still the glamorous part, but the harder commercial problem is serving answers fast enough, close enough, and cheaply enough for real systems. That gap is what Zero Latency, now branding itself as 0.lat, is trying to attack. On May 11 at Red Hat Summit, the company said it had standardized its U.S. network on Red Hat AI Factory with NVIDIA and is running Zerogrid, a distributed inference network built around NVIDIA Blackwell GPUs.

### What actually launched?

This is not a new foundation model and not a single new datacenter. It is a distributed inference grid, Zerogrid, that spreads GPU capacity across decentralized edge sites in the U.S. and then routes AI jobs to the location that best fits the request. The company had announced a closed beta for Zerogrid on May 7, then followed with the Red Hat and NVIDIA deployment news on May 11. (businesswire.com)

### Why put Blackwell at the edge?

Because the bottleneck is often physics. If an application has to send data to a faraway cloud region, wait for inference, and pull the result back, delay stacks up fast. Zero Latency is pitching Blackwell-powered edge nodes as a way to cut that “latency tax” for industrial automation and real-time transactions, the kinds of workloads where a slow answer is basically a wrong answer. (markets.businessinsider.com)

### What does Zerogrid actually do?

The core idea is workload routing. Zerogrid treats each inference request as something to dispatch based on latency, geography, data locality, and available capacity. Basically, it is trying to make AI inference behave less like a fixed server endpoint and more like traffic control, sending each job to the best nearby GPU pool instead of the default cloud region. That matters when data is created in factories, industrial hubs, or local transaction systems and moving it elsewhere adds cost, delay, or compliance friction.
(redhat.com)

### Why is Red Hat in this?

Because once you spread compute across many sites, orchestration becomes the real product. Red Hat said 0.lat adopted Red Hat AI Factory with NVIDIA as the enterprise Kubernetes foundation for its U.S.-wide network. In plain English, that means the company is using Red Hat’s stack to manage, schedule, and operate GPU infrastructure consistently across distributed locations instead of hand-building every site. (markets.businessinsider.com)

### Why does Blackwell matter here?

Blackwell is NVIDIA’s current GPU architecture, and NVIDIA is pushing it as a major step forward for AI performance and efficiency. NVIDIA’s technical brief says Blackwell systems can deliver major gains in throughput and energy efficiency over the prior generation. For an edge network, that is the whole game: more inference per watt, in smaller footprints, closer to where the data starts. (businesswire.com)

### Is this just another cloud by another name?

Not quite. Zero Latency calls the model a “neocloud,” which is startup language, but the distinction is real enough. Traditional cloud AI concentrates compute in a few giant regions. This model aggregates many smaller edge datacenters and sells them as one inference fabric. Turns out that is a better fit for bursty, location-sensitive workloads than pretending every request should travel to the same handful of hyperscale campuses. (resources.nvidia.com)

### What changed for NVIDIA?

The interesting shift is not just one customer win. It is where Blackwell is showing up. NVIDIA has dominated the story around centralized AI factories, but this rollout adds evidence that Blackwell demand is spreading into decentralized inference infrastructure too.
That broadens the market from “who is training frontier models?” to “who needs real-time AI near the edge?” (markets.financialcontent.com)

### Bottom line

This launch is really a bet that inference, not training, is becoming the next infrastructure land grab. If 0.lat can make distributed GPU capacity feel as easy to consume as a cloud endpoint, then Blackwell is not just a datacenter chip anymore; it becomes part of the physical fabric for real-time AI. (businesswire.com) (finance.yahoo.com)
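
For readers who want a concrete picture of the routing idea described under "What does Zerogrid actually do?", here is a minimal Python sketch. It is not 0.lat's implementation, and nothing here comes from the company's documentation: the `EdgeSite` and `InferenceRequest` types, the `pick_site` function, the site names, and the 5 ms locality bonus are all hypothetical, chosen only to show how latency, data locality, and free capacity could be combined into one dispatch decision.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdgeSite:
    name: str        # hypothetical site label
    region: str      # where the site physically sits
    rtt_ms: float    # measured round-trip time from the caller
    free_gpus: int   # GPUs currently available at the site


@dataclass
class InferenceRequest:
    data_region: str   # region where the input data is produced
    gpus_needed: int   # capacity the job requires
    max_rtt_ms: float  # latency budget for the caller


def pick_site(req: InferenceRequest, sites: List[EdgeSite],
              locality_bonus_ms: float = 5.0) -> Optional[EdgeSite]:
    """Return the best eligible site, or None if no site qualifies.

    Assumed policy (illustrative only): drop sites that lack capacity or
    exceed the latency budget, then prefer the lowest round-trip time,
    with a fixed bonus for sites in the same region as the data.
    """
    eligible = [
        s for s in sites
        if s.free_gpus >= req.gpus_needed and s.rtt_ms <= req.max_rtt_ms
    ]
    if not eligible:
        return None

    def score(site: EdgeSite) -> float:
        bonus = locality_bonus_ms if site.region == req.data_region else 0.0
        return site.rtt_ms - bonus  # lower is better

    return min(eligible, key=score)


# Made-up example: three sites, one job whose data originates in "midwest".
sites = [
    EdgeSite("site-a", "midwest", rtt_ms=8.0, free_gpus=0),   # closest, but full
    EdgeSite("site-b", "midwest", rtt_ms=12.0, free_gpus=4),  # local to the data
    EdgeSite("site-c", "south", rtt_ms=9.0, free_gpus=4),     # faster, but remote
]
request = InferenceRequest(data_region="midwest", gpus_needed=2, max_rtt_ms=20.0)
chosen = pick_site(request, sites)
print(chosen.name)  # prints "site-b"
```

In this toy policy the nearest site loses because it has no free GPUs, and the locality bonus tips the choice toward the site that sits with the data even though another site has a slightly lower round-trip time. A production scheduler would weigh far more than this, including cost, queue depth, and compliance rules.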