Cohere open‑sources 2B ASR

Cohere released Transcribe, a 2B-parameter open-source automatic speech recognition (ASR) model that supports 14 languages and runs locally on a GPU at a real-time factor of roughly 3x. It is pitched as a fast, local alternative to cloud ASR, aimed at offline voice prototypes and privacy-sensitive apps. (x.com)
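
The brief does not define "3x". One common reading, assumed here, is a speed multiple: seconds of audio transcribed per second of compute. The helper names below are hypothetical and only illustrate that reading.

```python
def speed_multiple(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio handled per second of compute (assumed meaning of '3x')."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def estimated_processing_seconds(audio_seconds: float, multiple: float) -> float:
    """Compute time needed for a clip at a given speed multiple."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return audio_seconds / multiple


# A one-hour recording at an assumed 3x speed multiple.
one_hour = 3600.0
print(estimated_processing_seconds(one_hour, 3.0) / 60.0, "minutes")
```

Under that assumption, an hour of audio would take about 20 minutes to transcribe. If the figure is instead meant as processing time divided by audio duration, the ratio inverts.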

Cohere published the model and accompanying release notes on March 26, 2026, and the weights and repo are available on the Hugging Face hub under CohereLabs/cohere-transcribe-03-2026. (cohere.com) The release is licensed under Apache 2.0, with the license file and repo artifacts on the Hugging Face page and a mirrored GitHub repository. (huggingface.co)

Cohere reported an average word error rate (WER) of 5.42% on the Hugging Face Open ASR Leaderboard and positioned that result as outperforming OpenAI’s Whisper Large v3 (reported ~7.44% WER), a relative WER reduction of roughly 27%. (cohere.com) The model pairs a Conformer-style encoder with a lightweight Transformer decoder; community tooling notes that the decoder accounts for roughly 151M parameters, while the encoder contains the bulk of the model. (huggingface.co)

Packaged APIs and examples include a single model.transcribe() entry point that auto-chunks long audio for offline inference, plus documentation showing availability through Cohere’s Audio Transcriptions API for hosted usage. (github.com) Cohere framed the launch as its first dedicated voice product aimed at production transcription stacks, naming meeting transcription, speech analytics and customer-support workflows as primary targets in the company blog and press coverage. (cohere.com)
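
The "roughly 27%" figure follows from the two reported WER values. A minimal check, using only the numbers quoted above (the function name is illustrative):

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer (lower WER is better)."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Reported averages: Whisper Large v3 at ~7.44%, Transcribe at 5.42%.
reduction = relative_wer_reduction(7.44, 5.42)
print(f"{reduction:.1%}")  # 27.2%
```

The absolute gap is about 2 percentage points; the 27% is that gap expressed relative to the Whisper baseline.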
