Huawei backfills Ascend AI ecosystem

- Huawei’s Ascend software stack is adding more third-party model support, with Huawei documentation and model pages showing Qwen2-VL and ChatGLM compatibility on Ascend.
- Huawei’s MindIE support list includes ChatGLM3-6B-32K, while a Qwen2-VL-7B-Instruct page specifies deployment on Atlas 800I A2 hardware.
- Huawei’s Ascend model pages and MindIE support tables are the next checkpoints for additions covering Phi and IBM Granite.

Huawei’s Ascend AI push is increasingly showing up not just in chips, but in software compatibility. A May 19 X post from AiChinaNews said Huawei was “aggressively backfilling” the Ascend ecosystem by adding native ports for outside models including Alibaba’s Qwen2-VL-7B-Instruct, Zhipu AI’s ChatGLM3-6B-32K, Microsoft’s Phi-3.1-mini and IBM’s Granite 20B code model. Huawei’s own developer documentation confirms at least part of that picture: Ascend’s MindIE inference stack already lists ChatGLM3-6B-32K among supported large language models, and Huawei’s model pages show Qwen2-VL-7B-Instruct packaged for Ascend deployment.

That matters because the software layer has been one of the main questions around Ascend adoption. Huawei has long had Ascend hardware and the CANN and MindIE software stack, but developers also need model support, deployment scripts and service tooling if they are going to run popular open models on that hardware. Huawei describes MindIE as its inference engine for Ascend devices and says it is designed to support multiple model frameworks and deployment scenarios. (hiascend.com)

### What exactly is confirmed in Huawei’s own materials?

Huawei’s MindIE 1.0.0 documentation lists ChatGLM3-6B-32K in its supported large-language-model table. The same table says the model supports MindIE Service and can run on Atlas 800I A2 inference products with 1, 2, 4 or 8 cards, and on Atlas 300I Duo inference cards with 1 or 2 cards. (hiascend.com)

Huawei’s Ascend ModelZoo also has a dedicated page for Qwen2-VL-7B-Instruct. That page says the model is an Alibaba-developed vision-language model, updated on January 19, 2026, and that Huawei provides a MindIE image with prebuilt inference scripts for it. The page says deployment requires at least one Atlas 800I A2 32G server.

Huawei’s MindIE service documentation separately uses Qwen2-VL as an example for service deployment. (hiascend.com) The document shows a chat-completions style API flow, environment setup, model dependency installation and configuration steps for serving the model through MindIE Service.

### What does “backfilling” mean in practice here?

Huawei’s own documentation shows that “support” is not just a model name in a table. (hiascend.com) For Qwen2-VL-7B-Instruct, Huawei publishes container guidance, dependency files, hardware requirements, example scripts, service ports and performance examples. On the 32G Atlas 800I A2 configuration, Huawei’s page gives a sample throughput calculation of 43 tokens per second; on a 64G configuration, it gives 98.79 tokens per second under the listed test settings. (hiascend.com)

That is the practical definition of ecosystem work: adapting model code, packaging dependencies, exposing service endpoints and documenting hardware assumptions so developers can deploy without building everything from scratch. Huawei’s documentation says MindIE works with ATB Models and supports service deployment, quantization and other inference features across supported models. (hiascend.com)

### What about Microsoft Phi-3.1-mini and IBM Granite 20B Code?

AiChinaNews named Microsoft’s Phi-3.1-mini and IBM’s Granite 20B code model in its May 19 post, but I could not independently verify dedicated Huawei model pages for those two from the sources surfaced here. Huawei’s public support materials visible in these results clearly confirm Qwen2-VL and ChatGLM support, while the Phi and Granite claims remain attributable to the AiChinaNews post unless Huawei publishes matching model pages or adds them to its support tables. (hiascend.com)

Huawei’s model support pages are also versioned and updated over time. The company’s model query tool says it provides model lists and links for Ascend training and inference scenarios, which makes those pages the clearest place to watch for additional ports.

### Which Huawei components are doing the work?

Huawei’s MindIE stack sits at the center of this effort. Huawei says MindIE is its inference engine for Ascend and that it supports service deployment, model acceleration and multiple business scenarios. (hiascend.com) The supported-model tables reference ATB Models as the model repository used alongside MindIE, while ModelZoo pages provide model-specific packaging and scripts. (hiascend.com)

The hardware references are also consistent. Huawei’s documentation repeatedly points to Atlas 800I A2 and Atlas 300I Duo systems for supported inference deployments, which suggests the company is standardizing model enablement around named Ascend server and card configurations rather than generic “Ascend-compatible” claims. (hiascend.com)

### What should readers watch next?

Huawei’s next visible signal will be changes to its MindIE supported-model tables and Ascend ModelZoo entries. If Phi-3.1-mini and IBM Granite 20B Code are being added in the same way as Qwen2-VL, they should eventually appear as named support entries, model pages or deployment guides in Huawei’s Ascend documentation and model repository pages. (hiascend.com 1) (hiascend.com 2)
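
One way to act on that watch list is to keep the model names from this article in a small structure and compare them against whatever names a reader later copies out of Huawei’s support tables or ModelZoo pages. The sketch below is illustrative only: `ModelClaim`, `CLAIMS`, `normalize` and `newly_confirmed` are hypothetical names, the confirmation flags reflect only what this article could verify, and the observed names are assumed to be entered by hand, since no Huawei API or page format is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelClaim:
    """One model named in the AiChinaNews post (hypothetical helper type)."""
    name: str
    confirmed_in_huawei_docs: bool  # as verified in this article, not an official status


# Status as reported in this article; this is not an official Huawei list.
CLAIMS = [
    ModelClaim("Qwen2-VL-7B-Instruct", True),
    ModelClaim("ChatGLM3-6B-32K", True),
    ModelClaim("Phi-3.1-mini", False),
    ModelClaim("Granite 20B Code", False),
]


def normalize(name: str) -> str:
    """Lowercase and drop separators so 'granite-20b-code' matches 'Granite 20B Code'."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def newly_confirmed(claims, observed_names):
    """Return previously unconfirmed claims that now appear in an observed support list."""
    seen = {normalize(n) for n in observed_names}
    return [
        c.name
        for c in claims
        if not c.confirmed_in_huawei_docs and normalize(c.name) in seen
    ]


# Example: names a reader might copy from a future support table (made-up input).
print(newly_confirmed(CLAIMS, ["ChatGLM3-6B-32K", "granite-20b-code"]))
# → ['Granite 20B Code']
```

Matching on a normalized name is a deliberate simplification: it tolerates differences in case and punctuation but would not catch a renamed or re-versioned model, so any match should still be checked against the actual Huawei page.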
