Mistral cuts inference costs

Mistral’s Small 3.1 now matches GPT‑4o mini on performance while running at lower inference cost, which puts direct pricing and efficiency pressure on larger providers. As a result, model choice will increasingly be an engineering cost decision, not just a capability one. (arturmarkus.com)

Mistral published Small 3.1 on March 17, 2025 as a 24‑billion‑parameter multimodal model with a 128k‑token context window and an Apache‑2.0 license. (mistral.ai/news/mistral-small-3-1; huggingface.co) The company reported inference throughput near 150 tokens/second for the Small‑3 lineage and stated that Small 3.1 outperformed comparable models, including GPT‑4o mini, on its internal benchmarks. (mistral.ai/news/mistral-small-3-1; learnprompting.org)

Mistral’s cloud pricing for the Small family is listed at $0.10 per 1M input tokens and $0.30 per 1M output tokens on Mistral’s model pages and related release notes. (docs.mistral.ai/models/mistral-small-3-2-25-06; mistral.ai/news/devstral) OpenAI publishes GPT‑4o mini at $0.15 per 1M input tokens and $0.60 per 1M output tokens, so Mistral’s $0.10/$0.30 list prices are about 33% lower on input and 50% lower on output than GPT‑4o mini’s. (openai.com/index/gpt-4o-mini-advancing-cost-efficient-intelligence; docs.mistral.ai/models/mistral-small-3-2-25-06)

Model weights and instruction checkpoints for the Small‑3 series are available for download on Hugging Face, and Mistral explicitly notes self‑hosting and deployment options (Ollama, LM Studio, on‑prem) for the same small‑model family. (huggingface.co/Mistral‑Small‑3.1‑24B‑Base‑2503; mistral.ai/news/devstral)

Third‑party trackers and benchmarks show Mistral Small‑3 variants scoring in the low 80s on MMLU (Mistral reported ~81% for Small‑3) and place Small‑3.x and GPT‑4o mini at comparable positions on aggregated intelligence and throughput charts. (mistral.ai/news/mistral-small-3; llm-stats.com; artificialanalysis.ai)

Mistral’s docs show Small 3.1 being retired and replaced by Small 3.2: the release notes list Small 3.2 on June 20, 2025 and the Small 3.1 retirement on Nov 30, 2025. The Small‑3 lineage continues to be listed at the $0.10/$0.30 pricing tier in Mistral’s model pages. (docs.mistral.ai/models/mistral-small-3-1-25-03; docs.mistral.ai/models/mistral-small-3-2-25-06)
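The list-price comparison above is simple arithmetic, and a short Python sketch makes it easy to rerun against your own traffic. The per-token prices are the ones quoted in this article; the dictionary keys are illustrative labels rather than official API model identifiers, and the monthly token volumes are a hypothetical workload chosen only to show how the input/output mix affects the blended saving.

```python
# List prices in USD per 1M tokens, as quoted in the article.
# Keys are illustrative labels, not official API model identifiers.
PRICES = {
    "mistral-small": {"input": 0.10, "output": 0.30},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def percent_cheaper(price: float, baseline: float) -> float:
    """Return how much lower `price` is than `baseline`, in percent."""
    return (baseline - price) / baseline * 100.0


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for the given token volumes."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


if __name__ == "__main__":
    a, b = PRICES["mistral-small"], PRICES["gpt-4o-mini"]
    print(f"input:  {percent_cheaper(a['input'], b['input']):.1f}% lower")
    print(f"output: {percent_cheaper(a['output'], b['output']):.1f}% lower")

    # Hypothetical monthly workload: 500M input tokens, 100M output tokens.
    tokens_in, tokens_out = 500_000_000, 100_000_000
    for name in PRICES:
        cost = workload_cost(name, tokens_in, tokens_out)
        print(f"{name}: ${cost:.2f} per month")
```

Because the output-price gap is larger than the input-price gap, the blended saving depends on the ratio of generated to ingested tokens: output-heavy workloads approach the 50% figure, input-heavy ones approach 33%. List prices also exclude self-hosting costs, rate limits, and any volume discounts.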
