Vercel Adds Video Generation to AI Gateway

Vercel has announced the addition of video generation capabilities to its AI Gateway. The new feature, enabled by AI SDK 6, is designed to streamline developer workflows for creating multimodal applications, reflecting a broader industry trend toward integrating diverse media formats into AI-powered services.

- The initial launch provides access to four video models with 17 variations, including Grok Imagine from xAI, Wan from Alibaba, Kling, and Veo from Google. These models support a range of capabilities, from photorealistic quality and physics realism to reference-based generation for maintaining character identity across scenes.
- The underlying AI SDK 6 enables video generation via an `experimental_generateVideo` function that supports text-to-video, image-to-video, and reference-to-video. This allows developers to programmatically control aspect ratio, duration, and resolution through a single, unified API across the different providers.
- AI SDK 6 also introduces a new "Agent" abstraction, allowing developers to define reusable, stateful agents with specific models, instructions, and tools. This supports the development of more complex, autonomous workflows that can orchestrate multiple AI calls and actions.
- For AI governance, a key feature in AI SDK 6 is "human-in-the-loop" tool execution approval. By setting a `needsApproval` flag, developers can require manual sign-off before an agent performs sensitive actions, a critical safety layer for enterprise-grade agentic systems that might modify data or incur costs.
- The Vercel AI Gateway functions as a unified interface to a wide range of models from providers like OpenAI, Anthropic, Google, and Meta, not just for video but for text and images as well. This is designed to simplify production workloads by handling authentication and providing a single observability dashboard.
- This integration reflects a major trend in enterprise AI, where multimodal systems are being adopted for applications in healthcare (analyzing medical images with patient records), retail (image-based search), and security (real-time video and audio analysis). The global multimodal AI market is projected to reach $3.43 billion in 2026.
- The AI Gateway is designed for production-level traffic, featuring built-in failover mechanisms that automatically redirect requests to an alternative provider if one experiences an outage. The service also provides detailed logs, performance metrics, and cost-tracking analytics for each request.
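
The failover behavior described in the last point can be illustrated with a small, self-contained TypeScript sketch. This is not the AI Gateway's implementation or API: the `Provider` type, the `generateWithFailover` function, and the two stub providers are hypothetical names introduced here, and the stubs stand in for real network calls. The sketch only shows the general pattern of trying providers in order and moving on when one fails.

```typescript
// Hypothetical shape of a provider call; not the AI Gateway's actual interface.
type Provider = {
  name: string;
  generate: (prompt: string) => Promise<string>;
};

type FailoverResult = {
  provider: string; // provider that produced the output
  output: string;
  attempts: string[]; // providers tried, in order
};

// Try each provider in order and return the first successful result.
async function generateWithFailover(
  providers: Provider[],
  prompt: string,
): Promise<FailoverResult> {
  const attempts: string[] = [];
  for (const provider of providers) {
    attempts.push(provider.name);
    try {
      const output = await provider.generate(prompt);
      return { provider: provider.name, output, attempts };
    } catch (err) {
      // Log the failure and fall through to the next provider.
      const reason = err instanceof Error ? err.message : String(err);
      console.warn(`${provider.name} failed: ${reason}`);
    }
  }
  throw new Error(`All providers failed: ${attempts.join(", ")}`);
}

// Stub providers (assumptions for illustration only).
const primary: Provider = {
  name: "primary",
  generate: async () => {
    throw new Error("simulated outage");
  },
};

const fallback: Provider = {
  name: "fallback",
  generate: async (prompt) => `video for: ${prompt}`,
};
```

Calling `generateWithFailover([primary, fallback], "a cat")` in this sketch resolves with the fallback provider's output and an `attempts` list showing that both providers were tried. A production gateway would presumably add timeouts, retry limits, and per-request logging on top of this basic loop.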
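
The human-in-the-loop approval point above can also be sketched in plain TypeScript. Only the `needsApproval` flag name comes from the announcement. The `Tool` type, the `runTool` function, and the `approve` callback are hypothetical and do not reproduce the AI SDK's actual types or control flow. The idea shown is simply that a flagged tool does not run until an approver says yes.

```typescript
// Hypothetical tool shape; only the `needsApproval` flag name comes from the article.
type Tool = {
  name: string;
  needsApproval: boolean;
  execute: (input: string) => string;
};

type ToolOutcome =
  | { status: "executed"; result: string }
  | { status: "rejected" };

// Run a tool, consulting the approver first when the tool is flagged as sensitive.
function runTool(
  tool: Tool,
  input: string,
  approve: (toolName: string, input: string) => boolean,
): ToolOutcome {
  if (tool.needsApproval && !approve(tool.name, input)) {
    return { status: "rejected" };
  }
  return { status: "executed", result: tool.execute(input) };
}

// A read-only tool that runs without sign-off.
const lookupOrder: Tool = {
  name: "lookupOrder",
  needsApproval: false,
  execute: (id) => `order ${id}: shipped`,
};

// A tool that incurs costs, so it is gated behind approval.
const refundOrder: Tool = {
  name: "refundOrder",
  needsApproval: true,
  execute: (id) => `refunded order ${id}`,
};
```

In this sketch, `runTool(refundOrder, "42", () => false)` returns a rejected outcome without calling `execute`, while `lookupOrder` runs regardless of what the approver returns. In a real application the approver would more likely be an asynchronous step, such as a UI prompt, rather than a synchronous callback.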

