Apple Silicon: native MoE fine‑tuning
mlx‑tune has been updated to run natively on Apple Silicon and can fine‑tune MoE models such as Qwen3.5‑35B‑A3B, as well as Qwen3‑Embedding models, while auto‑resolving MoE paths, which smooths large‑model development on Mac hardware (x.com). That makes on‑device experimentation with gating and embeddings far more accessible to Mac‑based ML engineers (x.com).
A v0.4.x build of mlx‑tune was published to PyPI on Mar 26, 2026; it is the package’s latest release and is packaged for MLX on Apple Silicon. (pypi.org) The project’s GitHub repo includes explicit Qwen3 examples, e.g., examples/28_qwen3_embedding_finetuning.py for Qwen3‑Embedding and separate examples for Qwen3‑TTS and Qwen3‑ASR, indicating first‑class support for embedding, TTS, and ASR workflows. (github.com) Alibaba’s Qwen3.5 family was published in February 2026; the Qwen3.5‑35B‑A3B MoE variant was released on Feb 24, 2026 and is available on hubs such as Hugging Face. (github.com) Model cards and third‑party ports describe Qwen3.5‑35B‑A3B as a sparse MoE with ≈35 billion total parameters, of which roughly 3 billion are active per forward pass in typical configurations. (huggingface.co) Apple’s MLX documentation and MLX‑LM examples explicitly include MoE benchmarking and describe MLX’s unified memory model, in which the CPU and GPU share the same memory; MLX‑backed tooling (including mlx‑tune) relies on this to host larger, sparsely activated models on M‑series machines. (machinelearning.apple.com) The mlx‑tune author positions the project as an Unsloth‑compatible “bridge” for prototyping locally on Macs and then scaling to cloud Unsloth workflows, and the repo’s commits and CI activity through late March 2026 reflect active maintenance. (github.com) Independent how‑tos and benchmarks show MLX‑optimized Qwen builds yielding multi‑fold throughput gains on M‑series hardware and document LoRA workflows that can finish adapter training in under about 30 minutes on 16 GB M2/M3 machines, which gives practical performance context for on‑device MoE experimentation. (dev.to)
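
To put the ≈35B total / ≈3B active figures in memory terms, here is a rough back‑of‑envelope sketch in plain Python. It is not taken from mlx‑tune or the model card: the bit widths are illustrative assumptions, and the estimate covers weights only, ignoring quantization scales, the KV cache, activations, and optimizer or adapter state. Because every expert in a sparse MoE has to stay resident, memory tracks the total parameter count, while per‑token compute tracks the active count.

```python
# Back-of-envelope weight-memory estimate for a sparse MoE checkpoint.
# Parameter counts are the approximate figures quoted above; the bit
# widths below are illustrative assumptions, not mlx-tune defaults.

TOTAL_PARAMS = 35e9    # ~35B parameters across all experts
ACTIVE_PARAMS = 3e9    # ~3B parameters used per forward pass


def weight_gib(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB (weights only, no overhead)."""
    return params * bits_per_weight / 8 / 2**30


# Share of the weights that is actually exercised for each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"active fraction per token: {active_fraction:.1%}")
for bits in (16, 8, 4):
    resident = weight_gib(TOTAL_PARAMS, bits)   # must fit in unified memory
    touched = weight_gib(ACTIVE_PARAMS, bits)   # read per forward pass
    print(f"{bits:>2}-bit: resident ~{resident:.1f} GiB, touched per token ~{touched:.1f} GiB")
```

Under these assumptions the full model is far too large for a 16 GB machine at 16‑bit precision, so the quantization level largely determines which M‑series memory configurations can host it.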