Mistral Ships Voxtral TTS
Mistral released Voxtral TTS, an open-weights text-to-speech model engineered for edge deployment with sub-100ms startup times, and is positioning it against ElevenLabs and other cloud-first voice vendors. The release makes it easier to prototype real-time, on-device voice assistants and accessibility tools. (techcrunch.com) (mezha.net)
Voxtral ships as a family. Mistral published a 3.4B-parameter TTS variant alongside the larger Voxtral Small releases, which appear on Hugging Face as 24B models with model cards and safetensors files for download. (aiproductivity.ai) (huggingface.co) The company announcement and press coverage list nine supported languages: English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi and Arabic. (techcrunch.com) Mistral states that the TTS model can create a custom voice from audio samples shorter than five seconds and preserve characteristics such as accent and inflection when cloning a voice. (techcrunch.com)
The Voxtral model family and related artifacts are distributed on Hugging Face under the permissive Apache-2.0 license, with model files available in safetensors format for downstream use and commercial deployment. (huggingface.co) Mistral also released complementary speech-understanding and ASR models in the Voxtral line; public materials reference a smaller 3B "Mini"-style variant and larger 24B variants intended for higher-throughput workloads. (dev.to) (huggingface.co)
Community ports and tooling appeared on GitHub within hours (one example community repo is elyxlz/voxtral), and the official model cards mention deployment paths such as vLLM-based serving for production inference. (github.com) (huggingface.co) Mistral says it will offer Voxtral through its own API and positions the offering as lower-cost than existing commercial voice APIs, with company executives describing the unit economics as "a fraction" of current market options. (techcrunch.com) (gadgets360.com)
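Because the weights are distributed as safetensors files, a downloaded checkpoint can be inspected without loading any tensor data. The sketch below is illustrative only and is not taken from Mistral's materials. It relies on the published safetensors layout (an 8-byte little-endian length followed by a JSON header) and uses only the Python standard library. The function names and the tiny demo file are hypothetical stand-ins, and no real Voxtral file name or tensor name is assumed.

```python
import json
import struct
import tempfile
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        # The first 8 bytes hold the header length as an unsigned little-endian integer.
        (header_len,) = struct.unpack("<Q", prefix)
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("truncated header")
    return json.loads(raw.decode("utf-8"))


def summarize(header):
    """Map each tensor name to (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


def write_demo_file(path):
    """Write a tiny valid safetensors file (hypothetical contents) so the sketch runs offline."""
    header = {
        "__metadata__": {"format": "pt"},
        "demo.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
    }
    raw = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(raw)))
        f.write(raw)
        f.write(b"\x00" * 16)  # four float32 zeros as placeholder tensor data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "demo.safetensors"
        write_demo_file(demo)
        print(summarize(read_safetensors_header(demo)))
```

Pointing `read_safetensors_header` at a real checkpoint shard would list its tensor names, dtypes and shapes, which is a quick way to sanity-check a download or estimate memory needs before committing to a serving setup.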