Japan Releases Commercial-Ready Japanese Voice AI
Japan's National Institute of Informatics has released LLM-jp-Moshi-v1, which it describes as the world's first commercially usable full-duplex Japanese voice dialogue model. The model was trained on 1,000 hours of conversational data to enable natural, human-like dialogue. Its full-duplex capability allows it to process audio and speak simultaneously.
- The model is an adaptation of an English-language full-duplex voice AI called "Moshi," which was developed by the French non-profit AI research lab Kyutai. The Japanese version was developed over approximately four months by a team at Nagoya University's Higashinaka Laboratory.
- Full-duplex capability is key for natural Japanese conversation as it allows the model to handle "aizuchi" – short interjections like "I see" or "that's right" that signify active listening – which traditional, non-simultaneous AI has struggled with.
- The model was trained on multiple Japanese speech datasets, the largest of which was J-CHAT, a publicly available collection of about 67,000 hours of audio from sources like podcasts and YouTube. Researchers also supplemented this with smaller, higher-quality dialogue datasets and synthetically generated speech data.
- LLM-jp-Moshi-v1 is released under an Apache 2.0 license, making it available for commercial use. However, running the model requires a Linux machine with a GPU that has at least 24GB of VRAM; it is not compatible with macOS.
- This release is part of a larger, cross-organizational project called "LLM-jp" which involves over 1,500 participants from both academia and industry, coordinated by Japan's National Institute of Informatics, to develop open-source Japanese large language models.
- The developers note that while the model is publicly available, it is still a prototype and its responses may be unnatural at times.
- Potential applications being explored for the technology include language learning assistance, as well as commercial uses in call centers, customer service, and healthcare settings.