vLLM adds MiniMax M2.7 support
vLLM announced day‑zero support for the open‑source MiniMax M2.7 model, which the MiniMax team claims is state‑of‑the‑art on the SWE‑Pro benchmark at 56.22% and is designed for agentic multi‑agent orchestration ( ). The integration signals runtime vendors are prioritising models tuned for coding and office‑automation workloads and aiming for efficient production serving (x.com).
vLLM has added support for MiniMax M2.7, putting the newly released open-weights model into one of the most widely used open-source serving stacks on day one. (docs.vllm.ai) MiniMax published M2.7 on March 18, 2026 and described it as an open-source model built for “Agent Teams,” tool use, and long, multi-step work in software and office applications. Its GitHub repository says the model can build agent harnesses, use dynamic tool search, and handle complex productivity tasks. (minimax.io) (github.com) In plain terms, vLLM is the software layer that lets companies run large language models in production, the way a web server runs a website. Its documentation now includes a dedicated `minimax_m2` model entry, and its reasoning-model guide lists MiniMax-M2 with tool-calling support. (docs.vllm.ai 1) (docs.vllm.ai 2) MiniMax is pitching M2.7 as a model for coding agents rather than a chatbot that only writes snippets on request. The company reported a 56.22% score on SWE-Bench Pro, 55.6% on VIBE-Pro, and 57.0% on Terminal Bench 2 in its launch materials and repository. (minimax.io) (github.com) SWE-Bench Pro is a benchmark for software agents that has models fix real repository issues instead of answering short coding questions. The public project page says it covers 1,865 problems from 41 repositories, with long-horizon tasks that can take human engineers hours or days. (scaleapi.github.io) That context matters because infrastructure vendors have spent the past year racing to support models that can call tools, keep state across long tasks, and fit into automated workflows. vLLM’s own reasoning-output documentation now groups MiniMax-M2 with models that expose separate reasoning fields and structured tool-calling behavior. (docs.vllm.ai) MiniMax is also framing M2.7 as a model for office software, not just code. The company said it scored 1495 on GDPval-AA, handled multi-round editing in Word, Excel, and PowerPoint, and maintained a 97% skill-adherence rate across more than 40 complex skills in its internal evaluation setup. (minimax.io) (github.com) Some of the headline numbers remain company-reported rather than independently reproduced in the same setup. The public SWE-Bench Pro page currently shows results from a unified scaffold that differ from vendor self-reports, and says initial runs are subject to change pending an official announcement from Scale. (scaleapi.github.io) MiniMax’s larger claim is that M2.7 helped improve its own training and agent scaffold during development. Its repository says an internal version of the model ran more than 100 optimization rounds on a programming scaffold and delivered a 30% performance improvement in that internal loop. (github.com) For developers, the immediate change is simpler: a new model can move from release to deployment without waiting for custom serving work. For vLLM, adding MiniMax M2.7 quickly is a bet that coding agents and office-automation agents are becoming core production workloads, not side demos. (docs.vllm.ai 1) (docs.vllm.ai 2)