Deepgram Adds Arabic Support to Voice AI
Voice AI company Deepgram has launched Arabic support in its Nova-3 model. The update covers 17 dialects, and a demonstration shows the model's accuracy in call center and virtual agent scenarios.
- The addition of Arabic is a significant step for Deepgram, as it's the first right-to-left language supported by the Nova-3 model. This required engineering adjustments to handle the different script and text direction.
- The 17 Arabic dialects covered are grouped into major regional categories: Gulf, Levantine, Egyptian/Nile, Maghrebi, Mesopotamian, and Pan-Arab/Modern Standard Arabic (MSA). This allows for more accurate transcription of conversational language, as most people do not speak formal MSA in daily life.
- Transcribing Arabic dialects is technically challenging due to significant variations in pronunciation, vocabulary, and even grammar between regions. Unlike MSA, many dialects have no standardized written form, making it difficult to train accurate AI models.
- Deepgram's benchmark testing on conversational Arabic showed that the Nova-3 model achieves up to a 40% lower Word Error Rate (WER) compared to competing speech-to-text systems across various dialects.
- This language expansion is part of a broader trend for the company; in June 2024, Deepgram expanded its previous model, Nova-2, to support 36 languages.
- The new model includes a "Keyterm Prompting" feature for all Arabic dialects, which allows developers to provide a list of specific terms to improve recognition of jargon, brand names, or industry-specific vocabulary.
- This level of dialect support is critical for use cases like call center analytics and voice agents in the Middle East and North Africa, where understanding the nuances of local dialects directly impacts customer experience.
- While competitors like Google and Microsoft also offer Arabic support, they have historically focused more on Modern Standard Arabic, which is primarily used in formal writing and news broadcasts, not in everyday conversation.
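
For readers unfamiliar with the Word Error Rate metric cited above, the following is a minimal sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a model's output, divided by the number of reference words. It is not Deepgram's benchmark code. The function names, the whitespace tokenization, and the sample numbers are illustrative assumptions, and the sketch assumes the "40% lower" figure is a relative reduction.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace. Real evaluations
    usually normalize text first (diacritics, punctuation, numerals).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# One of four reference words is missing from the hypothesis.
print(word_error_rate("مرحبا كيف حالك اليوم", "مرحبا كيف حالك"))  # 0.25

# Hypothetical numbers, not taken from any benchmark: a drop from
# 25% WER to 15% WER is a relative reduction of about 40%.
print(round(relative_wer_reduction(0.25, 0.15), 2))  # 0.4
```

Under this reading, a "40% lower WER" claim describes the gap relative to the competing system's error rate, not a 40-point drop in absolute WER.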