MiniMax M2.7 released

MiniMax published an open‑source model, M2.7, which the vendor says achieves state‑of‑the‑art scores on developer and terminal benchmarks. NVIDIA highlighted availability on Hugging Face with GPU acceleration via endpoints like NemoClaw and encouraged developer experimentation. (x.com)

MiniMax has released M2.7 as open weights, putting a new large language model for coding, terminal work, and office tasks on Hugging Face as of April 11. (huggingface.co) (docs.api.nvidia.com)

The model card describes M2.7 as a sparse mixture-of-experts system with 230 billion total parameters, 10 billion active per token, 256 experts, and a 200,000-token context window. NVIDIA’s model page lists the same April 11, 2026 release date and says the model is available for commercial and non-commercial use. (developer.nvidia.com) (docs.api.nvidia.com)

A mixture-of-experts model works like a team of specialists: the system routes each token to a small subset of experts instead of using the full network every time. NVIDIA said M2.7 activates 8 experts per token, a design meant to keep inference costs lower than a dense model of the same headline size. (developer.nvidia.com)

MiniMax says M2.7 scored 56.22% on SWE-Pro and 57.0% on Terminal Bench 2, two benchmarks used to test software engineering and command-line task performance. The company also reported 76.5 on SWE Multilingual, 52.7 on Multi SWE Bench, and 55.6% on VIBE-Pro. (huggingface.co) (github.com) (minimax.io)

MiniMax also pitched M2.7 as an office-work model, reporting a 1495 Elo score on GDPval-AA and saying it improved multi-round editing in Word, Excel, and PowerPoint files. On Toolathon, the company reported 46.3% accuracy, plus 97% skill adherence across more than 40 complex skills in its MM Claw evaluation. (huggingface.co) (minimax.io)

The company’s central claim is not just raw benchmark gains but a training loop it calls “self-evolution.” MiniMax said an internal M2.7 variant updated its own memory, built skills for reinforcement learning experiments, and improved a programming scaffold over more than 100 rounds, producing a 30% performance gain in that internal setup. (github.com) (minimax.io)

NVIDIA used the launch to push deployment options around its own stack. Its April 11 blog said M2.7 can run through NVIDIA Inference Microservices, or NIM, and with NemoClaw and OpenShell for always-on agent setups on GPU cloud systems. (developer.nvidia.com) (docs.api.nvidia.com)

NVIDIA also said it worked with the open-source serving projects vLLM and SGLang on kernels tuned for large mixture-of-experts models, including query-key root-mean-square normalization (QK RMSNorm) and 8-bit floating-point (FP8) mixture-of-experts kernels. Those are the low-level optimizations that determine whether a model this large is practical to serve at speed on GPUs. (developer.nvidia.com)

Most of the headline performance numbers come from MiniMax itself, and the training data remains undisclosed on NVIDIA’s model page. What is independently verifiable today is narrower: the weights are public, the architecture details are posted, and the model is already being packaged across Hugging Face, GitHub, and NVIDIA’s inference catalog. (docs.api.nvidia.com) (huggingface.co) (github.com)

That leaves the next test to developers rather than launch posts. MiniMax has put M2.7 where anyone with enough GPU memory can inspect it, run it, and see whether the benchmark claims hold up in real code and terminal sessions. (huggingface.co) (developer.nvidia.com)
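
For readers who want to see what "routing each token to a small subset of experts" means in practice, here is a minimal Python sketch of generic top-k expert routing. It is an illustration under stated assumptions, not MiniMax's implementation: the softmax-over-selected-experts gating, the toy scalar "experts," and the names `route_token` and `moe_layer` are all hypothetical, and M2.7's real router may differ. Only the figures (256 experts, 8 active per token, 230 billion total and 10 billion active parameters) come from the reporting above.

```python
import math
import random

# Figures quoted in the article; everything else here is illustrative.
NUM_EXPERTS = 256        # experts per the model card
TOP_K = 8                # experts activated per token, per NVIDIA
TOTAL_PARAMS_B = 230     # total parameters, in billions
ACTIVE_PARAMS_B = 10     # active parameters per token, in billions


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    Assumption: weights are a softmax over the selected logits only.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_layer(token, experts, router_logits, top_k=TOP_K):
    """Run only the selected experts and mix their outputs by weight."""
    output = 0.0
    for index, weight in route_token(router_logits, top_k):
        output += weight * experts[index](token)
    return output


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy experts: each one just scales a scalar input by a fixed factor.
    scales = [rng.uniform(0.5, 1.5) for _ in range(NUM_EXPERTS)]
    experts = [lambda x, s=s: s * x for s in scales]
    # Stand-in for the scores a learned router would produce for one token.
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

    selected = route_token(logits)
    print(f"{len(selected)} of {NUM_EXPERTS} experts run for this token")
    print(f"share of experts used: {TOP_K / NUM_EXPERTS:.1%}")
    print(f"share of parameters active: {ACTIVE_PARAMS_B / TOTAL_PARAMS_B:.1%}")
    print(f"layer output for input 1.0: {moe_layer(1.0, experts, logits):.4f}")
```

The two shares differ because the active-parameter count also includes parts of the network that run for every token regardless of routing, such as attention layers. The cost argument is the same either way: the other 248 experts are skipped for that token.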


Shared from Scout - Be the smartest in the room.