Mistral ships Large 3 with 128K-token context for long-context code tasks

- Mistral AI said on December 2, 2025 that it released Mistral Large 3, an open-weight flagship model with mixture-of-experts architecture. (mistral.ai)
- The clearest disclosed number is a 256K context window; Mistral also said the model uses 41B active parameters and 675B total. (mistral.ai)
- Mistral said a reasoning version is coming soon, and model details are listed on its model card and product pages. (mistral.ai)

Mistral AI’s own materials do not support the framing that Large 3 “ships” as a new May 2026 release with a 128K context window for code tasks. Mistral said it introduced Mistral Large 3 on December 2, 2025 as part of its Mistral 3 family, and its current model pages describe it as an open-weight, general-purpose flagship multimodal and multilingual model. (mistral.ai 1) (mistral.ai 2)

The company’s official pages list a 256K context window for Mistral Large 3, not 128K. They also describe the model as a sparse mixture-of-experts system with 41 billion active parameters and 675 billion total parameters. (mistral.ai)

### Did Mistral actually launch Large 3 this week?

December 2, 2025 is the date on Mistral’s blog post and model card for Mistral Large 3. The company said at the time that the release included both base and instruction fine-tuned versions under the Apache 2.0 license. (mistral.ai)

Mistral’s current models page still lists Large 3 as its largest model to date. The page presents it as part of the company’s active lineup rather than as a fresh May 2026 launch. (mistral.ai)

### Where does the 128K claim come from?

Mistral’s official Large 3 pages do not show a 128K context window. The company’s product page and model card both show 256K context for Mistral Large 3. A 128K figure does appear elsewhere in Mistral’s ecosystem for other models, but not on the Large 3 materials reviewed here. (mistral.ai)

Based on Mistral’s own documentation, the 128K specification for Large 3 is inconsistent with the company’s current published specs. (mistral.ai)

### Is Large 3 positioned mainly as a coding model?

Mistral describes Large 3 as a “general-purpose, flagship multimodal and multilingual model,” not as a code-only model. In its launch post, the company said the model reached parity with leading instruction-tuned open-weight models on general prompts and showed strong multilingual conversation and image understanding. (mistral.ai)

The same launch post said NVIDIA, vLLM and Red Hat worked with Mistral on deployment and inference support, including serving for “long-context, high-throughput workloads.” That language supports a long-context positioning, but Mistral’s official text does not present Large 3 primarily as a code-review model in the way the headline suggests. (docs.mistral.ai)

### What did Mistral say about deployment and openness?

Apache 2.0 is the license Mistral lists for Large 3. The company said it released compressed formats and an NVFP4 checkpoint to make the model easier to run, including on Blackwell NVL72 systems and on a single 8xA100 or 8xH100 node using vLLM. (mistral.ai)

NVIDIA’s support for TensorRT-LLM and SGLang across the Mistral 3 family is also described in the launch post. Mistral said those optimizations were aimed at efficient low-precision execution and long-context serving. (mistral.ai)

### What is the cleanest takeaway from the official record?

Mistral’s official record shows that Large 3 is an existing model first announced on December 2, 2025, with a 256K context window and open-weight Apache 2.0 release terms. The company says a reasoning version is “coming soon,” and its model card remains the clearest source for current specs. (mistral.ai)
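
For readers who want to sanity-check the figures quoted above, the short Python sketch below works through the arithmetic. It uses only the numbers cited in this article (41B active parameters, 675B total, 256K listed context versus the 128K claim). The function and constant names are illustrative, and reading "K" as 1,000 tokens is an assumption; the ratio is the same if "K" means 1,024.

```python
# Figures as quoted in this article from Mistral's published materials.
ACTIVE_PARAMS = 41_000_000_000    # parameters active per token (sparse MoE)
TOTAL_PARAMS = 675_000_000_000    # total parameters across all experts

# Assumption: "K" is read as 1,000 tokens; the ratio below is unchanged
# if the vendor means 1,024.
LISTED_CONTEXT_TOKENS = 256_000   # figure on Mistral's Large 3 pages
CLAIMED_CONTEXT_TOKENS = 128_000  # figure in the headline claim


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters that are active for a given token."""
    return active / total


def context_ratio(listed: int, claimed: int) -> float:
    """How many times larger the listed context window is than the claimed one."""
    return listed / claimed


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    ratio = context_ratio(LISTED_CONTEXT_TOKENS, CLAIMED_CONTEXT_TOKENS)
    print(f"Active share of parameters: {share:.1%}")
    print(f"Listed vs. claimed context: {ratio:.0f}x")
```

By this arithmetic, roughly 6% of the model's parameters are active per token, and the context window on Mistral's pages is twice the 128K figure in the headline.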
