AI Storage + Vera Racks
Vendors are showcasing AI storage platforms built for the chaotic access patterns of agentic RAG workloads, aimed at long‑history retrievals and heavy concurrent reads. NVIDIA’s new Vera CPU is reported as shipping in OEM systems, and its Vera MGX rack reference (more than 22,500 cores and roughly 400 TB of memory per rack) targets massive concurrent retrievals and in‑memory contexts for agents.
At GTC, AIC is demoing the F2026‑G5 JBOF, a DPU‑accelerated, high‑density NVMe platform that it describes as enabling CMX‑aligned shared‑flash tiers for large‑scale inference. (prnewswire.com) According to the AIC release, the systems support GPU‑initiated storage access and NVMe‑over‑Fabrics designs that move data from flash directly to accelerators and reduce host CPU I/O overhead. (morningstar.com)
NVIDIA describes its new Vera CPU, an 88‑core Arm processor, as purpose‑built for agentic AI, and the chip is reported as shipping in OEM systems from Dell, HPE and Lenovo. (nvidianews.nvidia.com) NVIDIA also published a Vera MGX rack reference that packs up to 256 liquid‑cooled Vera chips alongside 64 BlueField‑4 DPUs, claiming rack configurations that exceed 22,500 cores and provide roughly 400 TB of pooled memory for dense concurrent execution. (theregister.com) NVIDIA’s MGX modular reference architecture is cited as the path partners will use to mix Vera, Grace, GPUs and networking in OEM builds, letting vendors ship tailored rack configurations rather than single fixed SKUs. (nvidia.com)
Vendor and press coverage highlights rack‑scale pooling (shared NVMe tiers, DPUs, and pooled memory) as the design pattern behind long‑history RAG and tens of thousands of concurrent agent instances per POD, with vendors positioning these stacks for inference factories and large‑context retrievals. (prnewswire.com)
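The rack figures quoted above can be sanity-checked with simple arithmetic: 256 chips at 88 cores each gives 22,528 cores, which is consistent with the "exceeds 22,500" claim. The sketch below is only a back-of-the-envelope check using the numbers as reported; the per-chip and per-core memory values are derived from the rounded "roughly 400 TB" figure and are assumptions, not published specifications. The function and variable names are illustrative.

```python
# Figures as quoted in the coverage; not independently verified specs.
CORES_PER_VERA = 88       # cores per Vera CPU
VERA_PER_RACK = 256       # Vera chips in the MGX rack reference (upper bound)
DPUS_PER_RACK = 64        # BlueField-4 DPUs per rack
POOLED_MEMORY_TB = 400    # "roughly 400 TB", a rounded figure


def rack_summary(cores_per_cpu, cpus, dpus, memory_tb):
    """Derive rack-level ratios from the quoted headline numbers."""
    total_cores = cores_per_cpu * cpus
    return {
        "total_cores": total_cores,
        # Derived from a rounded total, so treat as approximate.
        "memory_tb_per_cpu": memory_tb / cpus,
        "memory_gb_per_core": memory_tb * 1000 / total_cores,
        "cpus_per_dpu": cpus / dpus,
    }


summary = rack_summary(CORES_PER_VERA, VERA_PER_RACK, DPUS_PER_RACK, POOLED_MEMORY_TB)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

The same ratios suggest one DPU per four CPUs in the full configuration, which fits the description of DPUs handling storage and network traffic on behalf of the hosts.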