AMD Pushes Deeper into AI with New Chips, Future GPU Plans
AMD is expanding its AI hardware lineup, launching the Ryzen AI 400 and PRO 400 series desktop processors with integrated neural processing. The move comes as startup TinyCorp is publicly pushing AMD for a future RDNA 5 GPU with 96GB of VRAM for AI workloads, while the first Linux patches for AMD's next-gen Zen 6 CPUs have already been released.
The new desktop processors feature AMD's XDNA 2 neural processing unit (NPU) architecture, rated for up to 50 TOPS (trillion operations per second) of INT8 performance. The NPU sits alongside Zen 5 CPU cores and RDNA 3.5 graphics on a single chip designed for AI-accelerated workloads in both consumer and enterprise environments. This level of on-chip AI performance targets Microsoft's Copilot+ PC requirements, which specify a minimum of 40 TOPS from an NPU. The integrated design allows AI tasks to be distributed among the NPU, CPU, and GPU for efficiency, a key consideration in thermally constrained systems like mini PCs, where sustained performance is critical.
The push from startup TinyCorp, founded by George Hotz, stems from a desire to create a competitive, open-source alternative to Nvidia's CUDA ecosystem for AI development. Hotz has been publicly vocal about software and driver issues with AMD's consumer GPUs, arguing that open-sourcing the firmware for schedulers and memory management is necessary to optimize performance for neural networks. TinyCorp's goal is to commoditize AI compute, believing customers are overpaying for Nvidia hardware.
A GPU with 96GB of VRAM would be a significant step for running large language models (LLMs) on a single card. For reference, a 70-billion-parameter model can require over 140GB of VRAM at half precision (FP16/BF16), and even with 4-bit quantization it still needs around 35GB for model weights alone. The total memory footprint grows substantially when accounting for the KV cache, which scales with batch size and context length.
In the data center, AMD's flagship AI accelerator is the Instinct MI300X, which features 192GB of HBM3 memory. This gives it a significant memory capacity advantage over competitors like Nvidia's H200, which has 141GB of HBM3e.
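The memory figures quoted above can be reproduced with some back-of-the-envelope arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the model shape (80 layers, 8 key/value heads, head dimension 128) is an assumption chosen to resemble a typical 70-billion-parameter model with grouped-query attention, not a figure from AMD, Nvidia, or TinyCorp. It also ignores activations and framework overhead, so real-world usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, batch_size: int,
                bytes_per_value: int = 2) -> float:
    """KV cache size in GB: one key and one value tensor per layer,
    stored for every token of every sequence in the batch."""
    values = 2 * n_layers * n_kv_heads * head_dim * context_len * batch_size
    return values * bytes_per_value / 1e9


# Capacities mentioned in the article, in GB.
CARDS = {
    "96GB GPU requested by TinyCorp": 96,
    "Nvidia H200": 141,
    "AMD Instinct MI300X": 192,
}

if __name__ == "__main__":
    # Assumed model shape (hypothetical, for illustration only).
    kv = kv_cache_gb(n_layers=80, n_kv_heads=8, head_dim=128,
                     context_len=4096, batch_size=1)
    for label, bits in (("FP16/BF16", 16), ("4-bit", 4)):
        total = weight_memory_gb(70e9, bits) + kv
        print(f"{label}: about {total:.1f} GB (weights + KV cache)")
        for card, capacity in CARDS.items():
            verdict = "fits" if total <= capacity else "does not fit"
            print(f"  {card} ({capacity} GB): {verdict}")
```

Because the KV cache term is multiplied by both context length and batch size, serving many long conversations at once can push the total well past the weights-only figure, which is why capacity matters even for quantized models.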
While the H200 often shows higher throughput and lower latency due to its mature software stack, the MI300X's larger memory is crucial for memory-bound models.
The early Linux patches for Zen 6, codenamed "Morpheus" for cores and "Venice" for EPYC server processors, signal that the core architecture is stabilizing ahead of a projected 2026-2027 launch. These initial patches focus on enabling basic kernel support and revealing architectural details, such as support for up to 16 channels of memory on next-gen EPYC platforms. Zen 6 is expected to be a ground-up redesign manufactured on a 2nm process, moving to a wider, throughput-oriented design. Rumors suggest Zen 6-based desktop CPUs, codenamed "Medusa," could feature Core Complex Dies (CCDs) with up to 12 cores each, potentially enabling up to 24-core consumer chips.