AMD pushes MI350 platform access
- AMD expanded access to its MI350-series accelerators on May 15 by pairing developer cloud trials with public MI355X infrastructure partnerships and software enablement.
- Zyphra said this week it will offer 15 megawatts of AMD Instinct MI355X capacity, giving AMD a named large-scale reference deployment.
- AMD’s next access points include Oracle Cloud Infrastructure MI355X instances and ROCm-supported vLLM and SGLang deployments already documented publicly.

AMD is widening the number of ways developers and cloud customers can get onto its latest Instinct MI350-series accelerators, shifting the fight over AI chips from launch-stage benchmarks to day-to-day access. The company already has an AMD Developer Cloud program, public cloud availability for MI355X systems, and a growing set of partner deployments built around the flagship MI355X. This month, Zyphra added a new public reference point by announcing 15 megawatts of MI355X capacity through Zyphra Cloud. AMD has also been publishing ROCm documentation and benchmark material around open-source inference stacks such as vLLM and SGLang, which are central to production AI serving.

### Where is AMD actually making MI355X available?

AMD said in a June 12, 2025 blog post that its AMD Developer Cloud was created to give developers “instant” access to AMD Instinct hardware with pre-configured environments and credits, though that launch was framed around MI300 GPUs rather than MI355X specifically. The company presented the service as a lower-friction on-ramp for developers testing ROCm, models and inference software without owning hardware. (amd.com)

Oracle Cloud Infrastructure has already moved MI355X into general availability. Oracle said in a December 10, 2025 post that its BM.GPU.MI355X.8 shape was generally available from October 2025, giving customers a commercial cloud path to the chip outside AMD’s own developer program. Phoronix reported in September 2025 that TensorWave was the first public cloud provider with MI355X availability, adding another route for outside developers to get hands-on access. (amd.com)

That matters because cloud access, not just server shipments, determines whether software teams can test ports, benchmark models and fix deployment issues on the latest hardware. (blogs.oracle.com)

### What is Zyphra adding that was not already there?

Zyphra said on May 4 that it launched Zyphra Cloud as a full-stack AI platform powered by AMD Instinct MI355X GPUs running on TensorWave infrastructure. The company said the service starts with Zyphra Inference, a serverless inference product for open-weight models including DeepSeek V3.2, Kimi K2.6 and GLM 5.1. (phoronix.com)

Zyphra then said on May 11 that it was making 15 megawatts of MI355X capacity available through Zyphra Cloud. The announcement gave AMD a large named deployment tied not to a benchmark chart but to capacity that outside customers can buy for inference and broader AI workloads.

Negin Oliver, AMD’s corporate vice president for AI business development, said in Zyphra’s May 4 release that “optimized AI software combined with our accelerator architecture” can deliver production inference performance on open-weight models. (prnewswire.com)

Jeff Tatarchuk, TensorWave’s chief growth officer, said the infrastructure was built to let customers ship “production-ready AI” on MI355X systems. (tmcnet.com)

### Why do vLLM and SGLang show up in this story?

AMD’s recent MI355X push is tied closely to open inference software. In a December 8, 2025 ROCm blog post, AMD published MI355X inference results on vLLM across models including DeepSeek-R1, GPT-OSS-120B, Qwen3-235B and Llama-3.3-70B, describing the work as part of its open software ecosystem. SGLang is getting similar treatment in AMD’s documentation stack. (prnewswire.com)

ROCm documentation published on April 21, 2026 describes distributed SGLang inference on an AMD Instinct MI355X cluster using MoRI, AMD’s communication backend for inter-node collective operations. The software support is also visible in third-party project documentation. vLLM’s recipes documentation lists MI355X among supported AMD GPUs for ROCm wheels, and SGLang’s hardware documentation includes AMD GPU setup guidance. (rocm.blogs.amd.com)

Those references show that MI355X access is being paired with documentation and packaging aimed at developers already using mainstream open-source serving stacks. (rocm.docs.amd.com)

### What does the hardware itself offer to those users?

AMD says the MI355X is built on its fourth-generation CDNA architecture and includes 288GB of HBM3E memory with 8TB/s of bandwidth. The company positions the chip for both training and inference, with support for lower-precision formats including FP8, FP6 and FP4. AMD’s MI350-series product page says the family is designed for “frictionless adoption,” pointing to drop-in compatibility, Kubernetes support through the AMD GPU Operator and day-one framework support through ROCm. (docs.vllm.ai)

Those claims are AMD’s own, but they show how the company is presenting MI355X as a deployable platform rather than only as a raw accelerator.

### What should readers watch next?

Oracle’s MI355X instances are already live, Zyphra’s 15-megawatt capacity was announced this week, and AMD’s ROCm materials for vLLM and SGLang continue to expand. (amd.com)

The next concrete markers will be additional public cloud rollouts, more named customer deployments on MI355X, and new ROCm releases tied to production inference frameworks. (blogs.oracle.com) (amd.com)