Alibaba’s stealth video model leads benchmarks

A stealth video-generation model from Alibaba reportedly topped a public, blind-voted text-to-video leaderboard, a sign that China’s AI research groups are pushing state-of-the-art capabilities in generative media. The result adds competitive pressure to the global video-AI race and suggests faster productisation of synthetic video tools (x.com).

A video model called HappyHorse-1.0 showed up on Artificial Analysis around April 7 without a company name attached, then jumped to the top of that site’s blind text-to-video rankings within days. On April 10, Alibaba confirmed the model came from its ATH AI Innovation Unit and said the project is still under development. (cnbc.com)

Artificial Analysis ranks video models with blind voting, which means people judge clips without seeing which company made them. On its text-to-video leaderboard, HappyHorse-1.0 was listed at 1,355 Elo, ahead of ByteDance’s Dreamina Seedance 2.0 at 1,273 and Google’s Veo 3 at 1,221. (artificialanalysis.ai)

That matters because video generation is now a crowded race between a few very large labs. The same leaderboard places models from ByteDance, Google, Runway, OpenAI, KlingAI, PixVerse, MiniMax, and Alibaba in direct side-by-side comparison on the same prompts. (artificialanalysis.ai)

Alibaba was not starting from zero here. In February 2025, Alibaba Cloud open-sourced Wan2.1, said it offered 14-billion-parameter and 1.3-billion-parameter versions, and said Wan2.1 had reached the top of the VBench leaderboard with an overall score of 86.22%. (alibabacloud.com)

VBench is a benchmark suite built by academic researchers to break video quality into separate tests instead of one vague score. Its public repository says it evaluates video generation across multiple dimensions and now includes VBench, VBench++, and VBench-2.0. (github.com)

Alibaba has also been turning those research models into products. Its Model Studio documentation, updated March 23, 2026, lists Wan tools for text-to-video, image-to-video, reference-to-video, video editing, digital humans, image-to-action, and character swapping across regions including Virginia, Singapore, and Beijing. (alibabacloud.com)

Three days before the reveal, Alibaba published a post for Wan2.7-Video that described four separate models inside one package: text-to-video, image-to-video, reference-to-video, and video editing. Alibaba said the system supports clips from 2 to 15 seconds and outputs at 720p and 1080p. (alibabacloud.com)

The stealth launch also changed how people read Alibaba’s artificial intelligence push. CNBC reported that Alibaba Chief Executive Officer Eddie Wu has made artificial intelligence the company’s top priority, and that Alibaba has already been integrating models into e-commerce, advertising, and entertainment products. (cnbc.com)

Investors noticed fast. Bloomberg reported that speculation around the anonymous model helped push Alibaba shares up as much as 8% on Wednesday before the company publicly acknowledged it was behind the project. (bloomberg.com)

The bigger picture is that China’s video model makers are no longer playing catch-up on public scoreboards. On Artificial Analysis, the top text-to-video slots are now dominated by HappyHorse, ByteDance Seed, Skywork AI, KlingAI, and Alibaba’s own Wan line, which means the center of gravity in synthetic video is moving east at the same time these tools are getting packaged into commercial cloud products. (artificialanalysis.ai)
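
For readers who want a feel for what the Elo gaps quoted above imply, the sketch below converts a rating difference into an expected head-to-head preference rate. It assumes the classic logistic Elo formula with a 400-point scale; Artificial Analysis may compute its ratings differently, so treat the output as a rough reading of the numbers, not the site's own method. The function and variable names are illustrative only.

```python
def expected_preference(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected share of blind head-to-head votes won by model A.

    Uses the classic Elo expectation formula. The 400-point scale is an
    assumption, not a documented detail of the leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Text-to-video ratings as quoted in the article.
RATINGS = {
    "HappyHorse-1.0": 1355,
    "Dreamina Seedance 2.0": 1273,
    "Veo 3": 1221,
}


def compare_to_leader(ratings: dict, leader: str) -> dict:
    """Map each rival to the leader's expected preference rate against it."""
    return {
        name: expected_preference(ratings[leader], rating)
        for name, rating in ratings.items()
        if name != leader
    }


if __name__ == "__main__":
    for rival, share in compare_to_leader(RATINGS, "HappyHorse-1.0").items():
        print(f"HappyHorse-1.0 vs {rival}: {share:.1%}")
```

Under those assumptions, an 82-point lead over Dreamina Seedance 2.0 corresponds to HappyHorse-1.0 being preferred in roughly 62% of matchups, and a 134-point lead over Veo 3 to roughly 68%. That is a clear edge, but far from a clean sweep.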
