Meta reportedly opens MoE multimodal models

Reports say Meta launched open-weight Llama 4 Scout and Maverick, multimodal mixture-of-experts models that push MoE into the open ecosystem. (evermx.com) Scout is described as having 17B active parameters, 16 experts, and a 10M-token context window, while Maverick has 17B active parameters and 128 experts. These moves could change private deployment economics and retrieval design. (evermx.com)

Meta has reportedly opened the weights for two new Llama 4 models, Scout and Maverick, and the important part is not just that they are new. It is that Meta appears to have pushed a frontier design choice into the open stack. Both models are natively multimodal, so they handle text and images in one system, and both use a mixture-of-experts architecture, which activates only part of the model for each request instead of running the whole thing every time. Meta announced the pair on April 5, 2025, and made them available through its own Llama channels and Hugging Face under the Llama 4 Community License. (about.fb.com)

That architecture choice is the story. Scout is described as a 17 billion active-parameter model with 16 experts. Maverick is also 17 billion active parameters, but spread across 128 experts, with roughly 400 billion total parameters behind the routing system. Hugging Face’s release notes describe Scout at about 109 billion total parameters and Maverick at about 400 billion total, while keeping the active slice at 17 billion for each token path. That is how Meta is trying to square a familiar circle in AI: bigger capability without paying the full compute cost of a dense model on every pass. (huggingface.co)

The second surprise is context length. Meta described Scout as having a 10 million token context window, an absurdly large number by the standards of mainstream deployment. Even if few teams will use that full span in practice, the claim matters because it changes what developers can even attempt. A model with that kind of memory can blur the line between model context and retrieval system. Some workflows that once needed aggressive chunking, ranking, and stitching can instead throw much larger swaths of raw material straight into the prompt. That does not kill retrieval. It changes retrieval from a hard bottleneck into a design choice. (aifirstfounders.com)

That is why the open-weight angle matters more than the benchmark chest-thumping. Meta has been explicit for months that it wants Llama to become the industry standard for open AI, much as Linux became the standard substrate for computing. Releasing open-weight MoE multimodal models is a concrete move toward that goal. It gives companies a way to experiment with private deployment, custom fine-tuning, and on-prem inference using a model class that had been far more associated with closed labs and hosted APIs. The economics are different when the weights can sit inside your own stack. (about.fb.com)

Meta also framed both models as distilled from a larger teacher model, Llama 4 Behemoth, which it said was still in training at the time of release. In Meta’s telling, Behemoth had 288 billion active parameters and 16 experts, and served as the source of some of the smaller models’ performance. That matters because it hints at the playbook. Train something huge and expensive once. Distill the useful behavior into smaller systems that more people can actually run. Open weights make that trick much more valuable to everyone outside Meta too. (about.fb.com)

The hardware details make the release feel less like a research demo and more like an attempt to seed an ecosystem. Meta said Scout could fit on a single H100 with Int4 quantization, while Maverick could run on a single H100 host. Hugging Face said Scout was integrated into transformers and TGI from day one, with support for both base and instruction-tuned checkpoints. So the launch was not just a paper claim about model design. It arrived with downloadable weights, deployment hooks, image-text support, and a license that even tells redistributors to display the phrase “Built with Llama.” (about.fb.com)
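
For readers who want to sanity-check the numbers, here is a rough back-of-the-envelope sketch in Python. It uses the approximate parameter counts reported above and works out two things: what share of each model's weights is active per token, and how much memory the weights alone would need at 4 bits per parameter. The 80 GB per-GPU figure and the eight-GPU host size are assumptions made for illustration, not details from the release, and the function and variable names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# Parameter counts are the approximate values reported above. The 80 GB
# per-GPU memory and the 8-GPU host size are ASSUMPTIONS for illustration.

GB = 1e9  # decimal gigabytes

MODELS = {
    "Scout": {"active": 17e9, "total": 109e9, "experts": 16},
    "Maverick": {"active": 17e9, "total": 400e9, "experts": 128},
}

H100_MEMORY_GB = 80  # assumed memory per H100
GPUS_PER_HOST = 8    # assumed GPU count for a "single H100 host"


def active_fraction(model):
    """Share of total parameters used on each token path."""
    return model["active"] / model["total"]


def weight_memory_gb(total_params, bits_per_param):
    """Memory for the weights alone (no KV cache, activations, or overhead)."""
    return total_params * bits_per_param / 8 / GB


if __name__ == "__main__":
    for name, m in MODELS.items():
        int4_gb = weight_memory_gb(m["total"], 4)
        print(
            f"{name}: {active_fraction(m):.1%} of weights active per token, "
            f"about {int4_gb:.1f} GB of weights at Int4"
        )
```

Under these assumptions, Scout's Int4 weights come to roughly 54.5 GB, which is below the 80 GB assumed for one H100, and Maverick's come to roughly 200 GB, which is below the 640 GB assumed for an eight-GPU host. The estimate covers weights only. The cache needed to serve a very long context would add to it.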
