Text‑to‑Video Momentum

Several vendors widened access to next‑generation video generation tools this week. Seedance 2.0 is now live on Dreamina in the US, on Runway's paid plans with a launch promo, and globally on HeyGen, while Alibaba Cloud released HappyHorse‑1.0, which topped some text‑to‑video leaderboards and will be offered to enterprise customers. The flurry shows vendors racing on multi‑character scenes, realistic motion and enterprise access as the category shifts from demos to productised services. (x.com) (x.com) (x.com) (x.com)

A few months ago, most text-to-video tools were still best at one pretty shot. This week, three separate platforms pushed the same newer model into wider release, and the selling point was no longer “look what it can render” but “here is where you can actually buy and use it.” (dreamina.capcut.com) (runwayml.com) (heygen.com)

The model in the middle of that rush is Seedance 2.0. Dreamina describes it as a video generator that can take text plus reference images, video, and audio, then produce coherent clips with style control and realistic motion instead of treating every shot like a fresh start. (dreamina.capcut.com 1) (dreamina.capcut.com 2)

That sounds abstract until you compare it with older systems. Earlier generators often made a character’s face drift, changed clothes between cuts, or moved the camera like it was on a shopping cart with a broken wheel; Seedance 2.0 is being marketed around multi-shot sequences, locked characters, dialogue, and sound. (runwayml.com) (dreamina.capcut.com)

Dreamina’s rollout matters because it puts Seedance 2.0 directly inside a consumer-facing creation app in the United States. Its live tool page says users can combine up to 12 reference assets in one project, including as many as 9 images, 3 video clips, and 3 audio clips, with video and audio inputs capped at 15 seconds each. (dreamina.capcut.com)

Runway took a different route and turned Seedance 2.0 into a subscription feature. Its product page says the model is available on paid plans, accepts text, images, video, or audio as inputs, and was launched with a 50 percent discount code for three months that runs through April 13 at 9 a.m. Pacific Time. (runwayml.com)

HeyGen’s version is not just another prompt box. In its April 7 post, HeyGen said Seedance 2.0 is fully integrated across its platform and tied to the company’s “digital twin” system, so a verified human likeness can be placed into cinematic scenes with shot-to-shot consistency. (heygen.com 1) (heygen.com 2)

That is a different business from pure video generation. Runway and Dreamina are largely selling a model that makes scenes, while HeyGen is selling a workflow where the scene generator is connected to identity verification, avatars, and publish-ready marketing videos. (runwayml.com) (heygen.com)

At the same time, Alibaba Cloud pushed the market from another direction with HappyHorse-1.0. Community benchmark pages tied to Artificial Analysis showed HappyHorse-1.0 landing at or near the top of text-to-video and image-to-video rankings in early April, which is why traders and creators suddenly started circulating the name. (huggingface.co) (github.com 1) (github.com 2)

Alibaba already had a foothold in open video models before this moment. In 2025, Alibaba Cloud said its Wan2.1 family topped the VBench leaderboard and released open-source video foundation models, so HappyHorse-1.0 looks less like a random surprise and more like another move in a long campaign to win developers and enterprise customers. (alibabacloud.com)

The pattern across all four launches is simple: the race has moved from single clips to systems that can hold a scene together. The features getting repeated are character consistency, multi-character shots, camera control, sound, and enterprise packaging, which is what you build when customers want ads, explainers, and product videos on a deadline instead of laboratory demos. (dreamina.capcut.com) (runwayml.com) (heygen.com) (alibabacloud.com)

The next fight is not about whether a model can make a beautiful eight-second clip. It is about who owns the full stack around that clip: the subscription plan, the avatar system, the enterprise contract, the benchmark bragging rights, and the workflow that gets a usable video out the door before a human editor gives up and does it by hand. (runwayml.com) (heygen.com) (github.com)
