NVIDIA Develops Localized Language Model for Japanese Market
NVIDIA is focusing on localized AI with its Nemotron Nano 9B v2 Japanese model, an example of what is termed "sovereign AI." The model is designed for culturally tuned applications in the Japanese market. It reflects a trend toward smaller, efficient language models optimized for specific regions, languages, and industries, particularly for automotive and industrial edge deployments.
- This model achieves state-of-the-art performance in the sub-10 billion parameter category on the Japanese Nejumi leaderboard, outperforming models like Qwen3-8B. It was specifically customized from the base Nemotron-Nano-9B-v2 model using a dataset called Nemotron-Personas-Japan.
- The development is part of a larger push by Japan's Ministry of Economy, Trade and Industry, which has allocated over $740 million to subsidize local firms in building out the nation's AI computing resources. This strategy aims to ensure Japan maintains control over its own data and AI development.
- NVIDIA's partners in this sovereign AI initiative include major Japanese technology and telecommunications companies like SoftBank, Fujitsu, KDDI, and GMO Internet Group.
- The base model uses a hybrid Mamba-Transformer architecture and was compressed from an original 12 billion parameters to 9 billion, a step taken to enable it to run efficiently on a single NVIDIA A10G GPU.
- A key feature of the underlying Nemotron Nano architecture is a runtime "thinking budget" control, which lets a developer manage the trade-off between reasoning accuracy and latency by limiting the tokens used for internal processing before an answer is given.
- The initiative extends to the local government level; Kagawa Prefecture is the first in Japan to sign a formal partnership with NVIDIA to expand AI use among local businesses and attract data center investment.
- As part of Japan's broader AI infrastructure build-out, the national research institute RIKEN is deploying 2,140 NVIDIA Blackwell GPUs to power two new supercomputers for AI and quantum computing, scheduled to launch in the spring of 2026.
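
The "thinking budget" idea mentioned above can be illustrated with a toy decoding loop. This is a minimal sketch, not NVIDIA's implementation or API: the `</think>` and `<eos>` markers, the `generate_with_budget` function, and the `toy_model` stand-in are all hypothetical names chosen for illustration. The sketch assumes the budget is enforced by counting reasoning tokens and forcing an end-of-reasoning marker once the limit is reached, after which the model writes its answer.

```python
from typing import Callable, List

END_THINK = "</think>"  # hypothetical end-of-reasoning marker (assumption)
EOS = "<eos>"           # hypothetical end-of-sequence marker (assumption)


def generate_with_budget(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    thinking_budget: int,
    max_new_tokens: int = 64,
) -> List[str]:
    """Decode token by token, capping reasoning tokens at `thinking_budget`."""
    context = list(prompt)
    output: List[str] = []
    thinking = True   # the model starts in its reasoning phase
    used = 0          # reasoning tokens emitted so far
    for _ in range(max_new_tokens):
        if thinking and used >= thinking_budget:
            tok = END_THINK            # budget spent: force the answer phase
        else:
            tok = next_token(context)  # otherwise let the model choose
        if tok == EOS:
            break
        context.append(tok)
        output.append(tok)
        if thinking:
            if tok == END_THINK:
                thinking = False
            else:
                used += 1
    return output


def toy_model(context: List[str]) -> str:
    """Stand-in for a real model: reasons for five steps, then answers."""
    if END_THINK not in context:
        steps = sum(1 for t in context if t.startswith("step"))
        return f"step{steps + 1}" if steps < 5 else END_THINK
    return "answer" if context[-1] == END_THINK else EOS


short = generate_with_budget(toy_model, ["question"], thinking_budget=2)
full = generate_with_budget(toy_model, ["question"], thinking_budget=10)
print(short)  # ['step1', 'step2', '</think>', 'answer']
print(len(full))  # 7
```

With a budget of 2 the toy model is cut off after two reasoning steps, while a budget of 10 lets it finish all five on its own. A smaller budget means fewer tokens generated before the answer (lower latency), at the possible cost of less complete reasoning.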