Hybrid still the sub‑ms standard

Published March 18, 2026 by The Daily Scout

Recent industry analysis reasserts that cloud‑only architectures do not meet sub‑millisecond trading needs: on‑premises or colocation infrastructure remains essential for execution, while cloud handles analytics, training, and burst capacity. The debate has shifted to orchestration, specifically how to make hybrid deployments predictable and auditable at sub‑ms scales. (technicalways.com, infoq.com)

Why it matters

28Stone’s benchmark with Google Cloud reported best‑case tick‑to‑trade latencies below 2 microseconds, and Google’s follow‑up noted P99 latency under 50 microseconds on C3 instance types, in controlled tests published September 22, 2025. (28stone.com) Multiple market observers still document that production HFT stacks retain on‑premises colocation and FPGA front‑ends for nanosecond‑class advantages, with industry analysis citing FPGAs as the standard for cutting latency from microseconds to tens of nanoseconds. (globaltrading.net)

At QCon London (Mar 16, 2026), JPMorgan Chase product strategists Luis Henrique Albinati Junior and Surabhi Mahajan framed multi‑cloud as a product problem requiring capability mapping, demand governance, and defined users, rather than a purely engineering fix. (qconlondon.com)

The orchestration gaps driving the debate are concrete. Kubernetes networking and default schedulers were not designed for deterministic, sub‑millisecond traffic, which has prompted work on TSN‑aware plugins, latency‑aware scheduler extensions, and projects like Kinitos that implement network‑ and latency‑aware scheduling for Kubernetes. (latitude.sh)

Kernel‑bypass stacks (DPDK, AF_XDP, RDMA) and FPGA acceleration remain the operational levers for microsecond and sub‑microsecond execution. Kernel bypass reduces kernel overhead to microsecond scales, and community FPGA projects demonstrate end‑to‑end FPGA trade paths reporting <5 µs processing in lab reproductions. (databento.com)

Efforts to make hybrid deployments auditable and deterministic are surfacing in research and tooling. Deterministic orchestration proposals (e.g., ORCH) and industry proofs‑of‑concept emphasize reproducible, verifiable execution graphs and latency‑aware placement as the next architectural controls for bridging cloud elasticity with exchange‑grade predictability. (arxiv.org)
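
To make the idea of latency-aware placement concrete, here is a minimal sketch of a filter-then-score decision, loosely mirroring the filter and score phases of a Kubernetes-style scheduler. It is not the API of Kinitos, ORCH, or Kubernetes itself: the `Node` fields, the `place` function, the node names, and every latency figure are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    """A candidate placement target. All fields are hypothetical inputs."""
    name: str
    p99_latency_us: float  # measured P99 latency to the venue gateway, in microseconds
    free_cpus: int         # CPUs still available for pinning


def place(nodes: Sequence[Node], latency_budget_us: float, cpus_needed: int) -> Optional[Node]:
    """Pick a node for a latency-sensitive workload, or None if nothing fits."""
    # Filter step: drop nodes that break the latency budget or lack capacity.
    feasible = [
        n for n in nodes
        if n.p99_latency_us <= latency_budget_us and n.free_cpus >= cpus_needed
    ]
    if not feasible:
        return None
    # Score step: lowest measured latency wins; the name breaks ties so the
    # same inputs always produce the same decision.
    return min(feasible, key=lambda n: (n.p99_latency_us, n.name))


# Illustrative inventory only; these are not measured values.
nodes = [
    Node("cloud-a", p99_latency_us=48.0, free_cpus=64),
    Node("colo-2", p99_latency_us=4.0, free_cpus=8),
    Node("colo-1", p99_latency_us=4.0, free_cpus=8),
]
choice = place(nodes, latency_budget_us=50.0, cpus_needed=4)
print(choice.name if choice else "unschedulable")  # prints "colo-1"
```

The explicit tie-break is the part that speaks to auditability: given the same inventory and budget, the decision can be replayed and verified after the fact. A real scheduler would also need fresh latency measurements and a policy for what happens when no node fits.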

Key numbers

28Stone’s benchmark with Google Cloud reported best‑case tick‑to‑trade latencies below 2 microseconds, and Google’s follow‑up noted P99 latency under 50 microseconds on C3 instance types, in controlled tests published September 22, 2025. (28stone.com)
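
Best-case and P99 figures describe different parts of the same latency distribution, which is why the two numbers can sit so far apart. The sketch below computes both from a list of samples using the nearest-rank percentile definition. The function names and the sample values are assumptions made for illustration; they are synthetic and are not data from the 28Stone or Google tests.

```python
import math
from typing import Dict, Sequence


def nearest_rank_percentile(samples: Sequence[float], pct: int) -> float:
    """Return the nearest-rank percentile (1 <= pct <= 100) of the samples."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples_us: Sequence[float]) -> Dict[str, float]:
    """Summarize tick-to-trade samples given in microseconds."""
    return {
        "best_case_us": min(samples_us),
        "p50_us": nearest_rank_percentile(samples_us, 50),
        "p99_us": nearest_rank_percentile(samples_us, 99),
    }


# Synthetic data: 98 fast samples and 2 slow outliers.
samples_us = [2.0] * 98 + [45.0] * 2
print(summarize(samples_us))
# prints {'best_case_us': 2.0, 'p50_us': 2.0, 'p99_us': 45.0}
```

In this synthetic run, two slow samples out of a hundred are enough to move P99 from 2 µs to 45 µs while the best case and median stay unchanged, which is why tail percentiles matter more than best-case figures when judging predictability.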
