Keep tier 0 on‑prem, tier workloads

- On May 16, 2026, recent exchange, cloud and data-center reporting showed firms are sorting trading infrastructure by latency tier rather than cloud ideology.
- Nasdaq’s colocation pitch centers on “proximity to the speed and liquidity” of its U.S. markets, while AWS says trading workloads still face low-jitter constraints.
- On May 15, 2026, E&E News reported PJM power prices surged 76%; Data Center Knowledge’s latest earnings coverage tracked power and deployment constraints.

AWS, Nasdaq and data-center industry reporting are converging on a practical rule for trading infrastructure in 2026: keep the exchange-facing hot path close to the venue, and move less time-sensitive work where capacity is cheaper or easier to scale. AWS says trading workloads still demand low latency and low jitter, while Nasdaq continues to market colocation as a way to stay physically near its U.S. markets. At the same time, recent reporting from E&E News and Data Center Knowledge has focused on power costs, cooling and deployment bottlenecks in AI-heavy infrastructure markets. Those facts have pushed the cloud-versus-on-prem debate toward a workload-by-workload decision, not a single destination for everything.

### Which systems belong in the “do not move” tier?

Nasdaq says its colocation service lets customers place servers “within the Nasdaq primary data center,” offering proximity to the liquidity of its U.S. markets. That makes the first cut straightforward: market-data ingestion, order entry, smart order routing and exchange-facing risk controls are the systems most exposed to physical distance and network variance.

AWS said in a July 2025 post on digital-asset exchange design that a centralized exchange hot path includes order entry, balance checks, matching, acknowledgements and publication of market events. (aws.amazon.com) The same post said tick-to-trade latency is a business-critical metric for market makers. Those functions map closely to what many firms would treat as Tier 0: the path where extra hops, virtualization overhead or jitter can directly affect execution timing. (nasdaq.com)

Altera and Cisco market FPGA- and SmartNIC-based adapters for high-frequency and ultra-low-latency environments, including hardware offload for feed handling, timestamping, order processing and risk checks. Databento’s kernel-bypass overview lists ef_vi, OpenOnload, TCPDirect, VMA and DPDK among the techniques used to avoid the kernel network stack and cut latency. If a service depends on those features, it is usually a sign that the workload belongs in the local deterministic tier, not on shared general-purpose infrastructure. (aws.amazon.com)

### Why doesn’t cloud performance settle the argument?

AWS said in a November 2023 capital-markets post that exchanges can use cloud-native patterns to reach latency profiles comparable to on-premises systems for some use cases. The same post also said trading workloads present unique challenges, including low-latency response, low-jitter network performance and fairness. That is not a blanket claim that every exchange-facing service should move unchanged into a distant region. (altera.com)

AWS said in an April 2026 follow-up on tick-to-trade design that customers can use bare metal and specialized networking patterns to optimize performance. The point in the vendor material is narrower than “cloud is now always fast enough”: architecture, placement and workload shape still determine whether a service can tolerate the remaining variance. (aws.amazon.com)

### What moves out of the hot path first?

Google Cloud’s hybrid and multicloud networking guidance describes architectures where workloads run partly on premises and partly in cloud environments. In trading shops, that usually fits analytics, model training, simulation, backtesting, surveillance, reporting and batch reconciliation better than exchange-edge services do, because those jobs care more about scale and elasticity than single-digit microseconds. (aws.amazon.com)

Data Center Knowledge reported this week that neocloud earnings commentary has shifted from GPU supply alone toward power, networking, cooling and deployment speed. That reporting matters because it describes the economics around where large compute-heavy workloads can be placed, especially AI training and other bursty jobs that do not need to sit beside an exchange matching engine. (docs.cloud.google.com)

### How much are power constraints shaping the placement decision?

E&E News reported on May 15 that PJM power prices surged 76%, tying the increase to data-center demand. A separate March E&E News report said electricity costs in PJM climbed 56% in 2025 and warned that prices could keep rising if supply problems were not addressed. Those costs do not decide latency architecture by themselves, but they do affect where firms can add new compute capacity. The practical result is a tiered policy. (datacenterknowledge.com)

Tier 0 stays on-premises or in colocation near the venue when determinism, kernel bypass, FPGA offload or exchange proximity are part of the service design. Higher-latency tiers (analytics, training, simulation and batch processing) can move cloud-first, with placement decisions increasingly shaped by available power, cooling and deployment timelines as much as by raw server prices. (nasdaq.com) (eenews.net)
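
The tiered policy described above can be sketched as a small placement rule. This is an illustrative sketch only, not something published by AWS, Nasdaq or any other source cited here: the `Workload` fields, the `Placement` labels and the `place` function are hypothetical names chosen for this example, and a real firm would likely weigh measured latency budgets, power availability and cost rather than four boolean flags.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    # Hypothetical labels for the two outcomes of the tiered policy.
    TIER_0_NEAR_VENUE = "on-premises or colocation near the venue"
    CLOUD_FIRST = "cloud-first, subject to power and capacity"


@dataclass(frozen=True)
class Workload:
    # All fields are assumptions made for illustration.
    name: str
    exchange_proximity: bool = False  # order entry, market-data ingestion, etc.
    needs_determinism: bool = False   # low-jitter requirement
    uses_kernel_bypass: bool = False  # e.g. ef_vi, OpenOnload, DPDK
    uses_fpga_offload: bool = False   # FPGA or SmartNIC offload


def place(workload: Workload) -> Placement:
    """Return Tier 0 if any hot-path trait is present, else cloud-first."""
    hot_path = (
        workload.exchange_proximity
        or workload.needs_determinism
        or workload.uses_kernel_bypass
        or workload.uses_fpga_offload
    )
    return Placement.TIER_0_NEAR_VENUE if hot_path else Placement.CLOUD_FIRST


if __name__ == "__main__":
    examples = [
        Workload("order entry", exchange_proximity=True, needs_determinism=True),
        Workload("feed handler", uses_kernel_bypass=True, uses_fpga_offload=True),
        Workload("backtesting"),
        Workload("batch reconciliation"),
    ]
    for w in examples:
        print(f"{w.name}: {place(w).value}")
```

The rule is deliberately conservative: a single hot-path trait is enough to keep a service in Tier 0, which mirrors the article's point that dependence on kernel bypass or FPGA offload is usually a sign the workload belongs in the local deterministic tier.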

Shared from Scout - Be the smartest in the room.