MiniMax 230B on NVIDIA

MiniMax M2.7, a 230‑billion‑parameter AI model, has been released on NVIDIA infrastructure, and coverage notes that NVIDIA’s NemoClaw reference stack provides a one‑click deployment path for agentic workloads. The story ties a large model rollout to NVIDIA tooling meant to speed deployment on the company’s cloud and hardware. (news.mkncrypto.com)

Artificial intelligence models are giant prediction engines, and MiniMax’s latest one is now packaged to run on NVIDIA’s cloud and software stack. (developer.nvidia.com) NVIDIA said on April 11, 2026 that MiniMax M2.7 is available through its stack, including NVIDIA Inference Microservices and a one-click NVIDIA NemoClaw setup on the NVIDIA Brev cloud platform. (developer.nvidia.com)

MiniMax describes M2.7 as a sparse mixture-of-experts model, which works like a large team where only a few specialists answer each request instead of waking up the whole staff. NVIDIA and MiniMax list 230 billion total parameters, 10 billion active parameters per token, 256 experts, and a 200,000-token context window. (developer.nvidia.com, build.nvidia.com)

Agents are software systems that can call tools, search, write code, and keep working across many steps, and NVIDIA is pitching M2.7 for those longer jobs. NVIDIA said NemoClaw installs OpenClaw and the OpenShell runtime with a single command, adding sandboxing and policy controls for always-on assistants. (developer.nvidia.com, build.nvidia.com)

MiniMax launched M2.7 on March 18, 2026 and said it was built for software engineering, office work, and multi-tool workflows. The company said the model scored 56.22% on SWE-Pro, 57.0% on Terminal Bench 2, and 1495 Elo on GDPval-AA. (minimax.io, minimax.io)

NVIDIA’s pitch is less about a new model architecture than about shortening the setup work around deployment. Its Brev documentation defines Launchables as one-click packages that bundle GPU hardware, software, and code into a reproducible environment. (docs.nvidia.com, developer.nvidia.com)

NVIDIA also said it worked with the open-source serving projects vLLM and SGLang to add kernels tuned for large mixture-of-experts models like M2.7. The company said those optimizations include a query-key normalization kernel and a floating-point 8 mixture-of-experts kernel based on TensorRT-LLM. (developer.nvidia.com)

MiniMax is also pushing a “self-evolution” narrative around the model. In its GitHub materials, the company said an internal version of M2.7 optimized a programming scaffold over more than 100 rounds and improved performance by 30%. (github.com, minimax.io)

The immediate change for developers is practical: M2.7 can now be tried as an NVIDIA-hosted endpoint, a deployable NVIDIA Inference Microservices container, or a one-click agent stack on Brev. That puts a 230-billion-parameter model inside NVIDIA’s preferred path for building and shipping long-running assistants. (build.nvidia.com, catalog.ngc.nvidia.com, developer.nvidia.com)
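
For readers who want to see the "few specialists per request" idea in concrete terms, here is a minimal Python sketch of sparse mixture-of-experts routing. It is an illustration only, not MiniMax's or NVIDIA's implementation. The parameter totals and the expert count come from the figures reported above; the number of experts chosen per token (`TOP_K = 8`) is an assumption made for the example, since the coverage does not state it, and the `route` and `active_fraction` helpers are hypothetical names.

```python
import math
import random

TOTAL_PARAMS_B = 230   # total parameters, in billions (figure reported in the article)
ACTIVE_PARAMS_B = 10   # active parameters per token, in billions (reported in the article)
NUM_EXPERTS = 256      # expert count (reported in the article)
TOP_K = 8              # ASSUMPTION: experts chosen per token; not stated in the article


def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_scores, k=TOP_K):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    kept = sum(probs[i] for i in top)
    return {i: probs[i] / kept for i in top}


def active_fraction(active=ACTIVE_PARAMS_B, total=TOTAL_PARAMS_B):
    """Share of the model's parameters used for a single token."""
    return active / total


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in router scores; a real model computes these from the token itself.
    scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    weights = route(scores)
    print(f"experts used for this token: {len(weights)} of {NUM_EXPERTS}")
    print(f"active parameter share: {active_fraction():.1%}")
```

On the reported figures, 10 billion active out of 230 billion total works out to roughly 4.3% of the model doing work on any one token, which is why sparse models of this size can be cheaper to serve than their headline parameter count suggests.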
