M5 Max runs 122B Qwen3.5 — power & throttling notes

Developer Ivan Fioravanti ran a 122B Qwen3.5 model on an M5 Max with 128GB, demonstrating large-model handling beyond what typical memory limits allow. He reported sustained power draw of up to ~200W, with temperatures of ≈78°C at high power. He also shared a rumor of an M6 MacBook Pro mold refresh by year-end, keeping hardware cadence rumors alive.

Independent tests surfaced a concrete quantized footprint for Qwen3.5-122B on Apple silicon: the 4-bit MXFP4 variant was measured at ~69.6 GB, enabling a 128 GB M5 Max to host the model's weights and KV cache with headroom for context, according to a Hardware-Corner summary of community runs. (hardware-corner.net) Qwen3.5-122B-A10B is a 122-billion-parameter mixture-of-experts model that activates roughly 10 billion parameters at inference, uses 256 experts, and exposes a native 262k-token context window, per official specs and model pages. (apxml.com)

The M5 Max platform's combination of up to 128 GB of unified memory and a reported ~614 GB/s of memory bandwidth is a material enabler for these local LLM runs. Apple's own M5 announcement emphasizes the new Fusion Architecture and AI-focused design and does not publish sustained system-level wattage figures. (hardware-corner.net) Independent reviews are already flagging real-world power and thermal behavior: Notebookcheck documented inconsistent sustained performance and potential throttling on the 14-inch M5 Max chassis, and Digital Trends noted that the smaller MacBook Pro can limit peak chip throughput under heavy loads. (notebookcheck.net)

Frameworks and quantization formats matter. Community MXFP4/4-bit builds on Hugging Face and deployment recipes for vLLM and MLX are the practical paths testers used to shrink Qwen3.5's memory needs for local inference, and published guides and GGUF packages show that quantized variants were circulating within weeks of the model's release. (huggingface.co)

Market-level cadence rumors still point to a bigger generational update beyond M5: coverage from MacObserver, 9to5Mac and Macworld consolidates leaks that an M6 MacBook Pro could pair 2nm M6 silicon with an OLED touchscreen and a redesigned chassis in late 2026 or early 2027. (macobserver.com)
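The footprint and bandwidth figures quoted above can be sanity-checked with some simple arithmetic. The Python sketch below is a rough illustration, not a benchmark: the function names are hypothetical, GB is assumed to mean 10^9 bytes, and the decode ceiling assumes that generating one token requires reading each active weight once from memory, ignoring KV cache traffic, compute limits, and throttling. Real throughput will be lower.

```python
"""Back-of-envelope checks for the figures quoted in the text (illustrative only)."""

# Figures taken from the text; treat them as reported values, not measurements of ours.
TOTAL_PARAMS_B = 122.0       # total parameters, in billions
ACTIVE_PARAMS_B = 10.0       # parameters activated per token, in billions (approximate)
MEASURED_WEIGHTS_GB = 69.6   # reported MXFP4 footprint
UNIFIED_MEMORY_GB = 128.0    # M5 Max configuration in the report
BANDWIDTH_GB_S = 614.0       # reported memory bandwidth


def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (assumed 10^9 bytes) at a given bit width."""
    return params_billion * bits_per_weight / 8.0


def effective_bits_per_weight(footprint_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter implied by a measured footprint."""
    return footprint_gb * 8.0 / params_billion


def decode_ceiling_tokens_per_s(
    bandwidth_gb_s: float, active_params_billion: float, bits_per_weight: float
) -> float:
    """Bandwidth-bound upper limit on decode speed.

    Assumption: each generated token reads every active weight once and
    nothing else, so this is an optimistic ceiling, not a prediction.
    """
    gb_read_per_token = weight_footprint_gb(active_params_billion, bits_per_weight)
    return bandwidth_gb_s / gb_read_per_token


if __name__ == "__main__":
    nominal = weight_footprint_gb(TOTAL_PARAMS_B, 4.0)
    bits = effective_bits_per_weight(MEASURED_WEIGHTS_GB, TOTAL_PARAMS_B)
    headroom = UNIFIED_MEMORY_GB - MEASURED_WEIGHTS_GB
    ceiling = decode_ceiling_tokens_per_s(BANDWIDTH_GB_S, ACTIVE_PARAMS_B, bits)

    print(f"nominal 4-bit weights: {nominal:.1f} GB")
    print(f"effective bits/weight: {bits:.2f}")
    print(f"memory left over:      {headroom:.1f} GB (shared with KV cache, OS, apps)")
    print(f"decode ceiling:        {ceiling:.0f} tokens/s (optimistic)")
```

At a nominal 4 bits per weight, 122 billion parameters come to 61 GB, so the reported ~69.6 GB implies an average of about 4.56 bits per weight. That gap is plausible because block-scaled formats such as MXFP4 store shared scale factors alongside the 4-bit values, and quantized builds often keep some tensors at higher precision. The leftover memory is not all usable for context, since unified memory is shared with the operating system and other applications.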
