MiniMax M2.7 open‑sourced
NVIDIA announced the open-source release of MiniMax M2.7 and highlighted GPU-accelerated tooling for the model on Hugging Face. (x.com) Fireworks AI said it offers Day‑0 commercial support for MiniMax M2.7 with very large context windows — enabling agentic tasks like code security and log analysis. (x.com)
MiniMax has released M2.7 as open weights, giving developers a new large language model they can download, run, and fine-tune themselves. (huggingface.co) M2.7 is a mixture-of-experts model, a design that routes each token to a small subset of specialist sub-networks rather than running the whole model, like a team of specialists instead of one giant brain. NVIDIA said the model has 230 billion total parameters, activates about 10 billion per token, uses 256 experts, and supports a 200,000-token context window. (developer.nvidia.com)

MiniMax said M2.7 was built for long, tool-using workflows such as log analysis, bug hunting, refactoring, code security, and office-document editing. The Hugging Face model card says the weights are published under a modified MIT license. (huggingface.co)

The release lands as more model makers push “agentic” systems that can plan, call tools, and keep working across long tasks instead of answering one prompt at a time. NVIDIA’s April 11 post framed M2.7 around those use cases and tied it to open-source serving stacks such as vLLM, SGLang, NeMo AutoModel, and NemoClaw. (developer.nvidia.com)

Open weights change who can use a model and how. Instead of waiting for a hosted application programming interface (API), developers can run M2.7 on their own infrastructure, tune it on private data, and swap it into existing open-source inference systems. (github.com, developer.nvidia.com)

MiniMax is pitching M2.7 on benchmark scores tied to software and workplace tasks. Its model card lists 56.22% on SWE-Pro, 57.0% on Terminal Bench 2, 39.8% on NL2Repo, 46.3% on Toolathlon, and 1495 Elo on GDPval-AA. (huggingface.co)

Some of the biggest claims are harder to verify independently. MiniMax said an internal version of M2.7 improved a programming scaffold over more than 100 rounds and raised performance by 30%, but that result comes from the company’s own description of its training process. (github.com)

NVIDIA’s role in the launch was less about inventing the model than making it easier to run fast on graphics processors. Its post says the company worked with the open-source community on kernels for vLLM and SGLang, including query-key root mean square normalization (QK RMSNorm) and 8-bit floating-point (FP8) mixture-of-experts optimizations aimed at faster inference. (developer.nvidia.com)

Fireworks AI moved quickly to commercial hosting. Its model page lists serverless and dedicated deployment support, fine-tuning, function calling, a 196.6 thousand-token context length, and pricing of $0.30 per million input tokens, $0.06 per million cached input tokens, and $1.20 per million output tokens. (fireworks.ai)

That combination of open weights from MiniMax, acceleration work from NVIDIA, and day-one hosted access from Fireworks means M2.7 is arriving as both a downloadable model and a product developers can put into production immediately. (huggingface.co, developer.nvidia.com, fireworks.ai)
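
For a sense of scale, the Fireworks prices quoted above can be turned into a rough per-request estimate. The sketch below is illustrative only. The function name `estimate_cost` and its parameters are hypothetical, and it assumes cached input tokens are billed at the cached rate in place of the regular input rate, a billing detail the model page summary above does not spell out.

```python
# Prices in USD per million tokens, as quoted from the Fireworks model page.
PRICE_PER_MILLION = {
    "input": 0.30,
    "cached_input": 0.06,
    "output": 1.20,
}


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_input_tokens is the part of input_tokens served
    from cache; it is billed at the cached rate and the rest at the
    regular input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (
        fresh_input_tokens * PRICE_PER_MILLION["input"]
        + cached_input_tokens * PRICE_PER_MILLION["cached_input"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A long log-analysis prompt: 150,000 input tokens, of which
    # 100,000 are cached, producing a 4,000-token answer.
    cost = estimate_cost(150_000, 4_000, cached_input_tokens=100_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under those assumptions, the example request costs about $0.0258, which shows how much of the bill for a long-context task depends on how much of the prompt is served from cache.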