Microsoft’s MAI Models Preview
Microsoft announced public previews of first‑party MAI models—MAI‑Transcribe‑1, MAI‑Voice‑1, and MAI‑Image‑2—made available to developers via Azure Foundry. The push signals Microsoft’s intent to expand its in-house multimodal stack for dev integrations. (x.com)
MAI-Transcribe-1’s introductory price point is listed at $0.36 USD per hour, while MAI‑Voice‑1’s pricing starts at $22 per 1M characters; Microsoft also highlights a “Personal Voice” flow that can clone a voice from a roughly 10‑second sample. (techcommunity.microsoft.com) Microsoft reports MAI‑Transcribe‑1 achieved a 3.8% average word‑error‑rate on the FLEURS multilingual benchmark and claims it ranks first in 11 core languages while outperforming Whisper‑large‑v3 and Gemini 3.1 Flash across FLEURS’ 25‑language set. (microsoft.ai) MAI‑Voice‑1 is described as extremely low‑latency, able to render a full minute of audio in under one second on a single GPU, and MAI‑Transcribe‑1 accepts MP3, WAV and FLAC uploads up to 200MB with batch transcription that Microsoft says is roughly 2.5x faster than its prior Azure Fast offering. (techcommunity.microsoft.com) MAI‑Image‑2 debuted into the top three on Arena.ai’s text‑to‑image leaderboard and Microsoft says the model prioritizes photorealism, stronger text rendering inside images, and creative‑workflow fidelity while the MAI Playground lets users try the model and the company is doing a staged rollout into Copilot and Bing Image Creator. (microsoft.ai) Microsoft positions the three models for production voice and visual workflows—citing IVR, live agent assist and post‑call summarization for audio, and photographer/designer feedback guiding Image‑2’s creative features—while surfacing Copilot Labs and podcast integrations as early product touchpoints. (techcommunity.microsoft.com) The launch follows reporting that a late‑2025 revision to Microsoft’s OpenAI agreement cleared the way for Microsoft to build and ship its own frontier models, a strategic shift that industry coverage frames as positioning Microsoft to compete more directly with OpenAI and Google. (forbes.com)