Next Frontier in AI Seen as 'World Models'
Industry analysts are increasingly focused on the development of "world models" as the next major frontier for AI, according to a recent report. These models, including upcoming video generation systems, aim to treat creative pipelines like physics engines. The goal is to produce richer, more controllable, and context-aware outputs that could significantly enhance creative workflows.
- The concept of world models draws from Scottish psychologist Kenneth Craik's 1943 theory that the human mind builds internal, small-scale models of reality to anticipate events. A significant milestone in applying this to AI was the 2018 paper "World Models" by David Ha and Jürgen Schmidhuber, which demonstrated an agent could learn to navigate a simulated environment by "dreaming" within its own generated model.
- Proponents like Meta's former Chief AI Scientist Yann LeCun argue that world models are a crucial step toward more advanced AI, as they enable a form of common-sense understanding of physics and causality that Large Language Models (LLMs) lack. LeCun, who recently left Meta to start a new company focused on world models, suggests they will become the dominant AI architecture, relegating LLMs to a supporting role for communication.
- In creative applications, world models are being explored to generate not just realistic video, but also interactive 3D worlds with temporal and spatial consistency. Companies like Runway are developing "General World Models" and have partnered with NVIDIA and Lionsgate to advance their use in film production and simulation.
- The rise of generative AI, including world models, is shifting creative workflows from direct execution to strategic curation and oversight. Professionals are increasingly using AI for initial ideation and automating repetitive tasks, allowing more time for high-level creative decisions and strategy.
- The question of authorship is a central debate, with legal and philosophical discussions focusing on whether AI is a tool or a collaborator. Current legal frameworks are being challenged, prompting calls for clear guidelines on copyright for AI-assisted works and compensation for artists whose work is used in training data.
- For builders creating AI tools, interoperability between different systems is a major focus, enabling practitioners to chain together various specialized AI models (e.g., for image generation, coding, and design) into a single, cohesive workflow. This "multi-tool" approach avoids being locked into a single platform and allows for more flexible and powerful creative pipelines.
- The development of robust world models is computationally intensive, requiring the analysis of petabytes of video and image data and costing millions of dollars in GPU resources for training. This has led to the development of benchmarks like WorldScore to evaluate a model's ability to generate controllable and consistent virtual worlds.
- Beyond creative fields, world models are critical for training robots and autonomous vehicles by allowing them to simulate and predict outcomes of actions in a safe virtual environment before real-world deployment. This reduces the need for costly and risky physical trials.
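
For readers who want a concrete picture of "simulating outcomes before acting," the idea can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: `predict_next` is a hypothetical stand-in for a learned world model (a real system would use a trained neural network), and the one-dimensional "world," the action set, and the `plan` helper are invented for this example rather than taken from any system named in the report.

```python
from itertools import product

# Hypothetical stand-in for a learned world model: given a position and a
# movement command, it predicts the next position. This is an assumption
# for illustration; a real model would be learned from data.
def predict_next(position: float, action: float) -> float:
    return position + action

def plan(start: float, goal: float, actions=(-1.0, 0.0, 1.0), horizon: int = 3):
    """Roll out every action sequence inside the model (no real-world steps)
    and return the sequence whose predicted end position is nearest the goal."""
    best_seq, best_err = None, float("inf")
    for seq in product(actions, repeat=horizon):
        pos = start
        for action in seq:
            pos = predict_next(pos, action)  # imagined step, not a physical trial
        err = abs(goal - pos)
        if err < best_err:
            best_seq, best_err = seq, err
    return best_seq, best_err

best_seq, best_err = plan(start=0.0, goal=2.0)
print(best_seq, best_err)
```

Only the winning sequence would then be executed on the physical robot or vehicle. Exhaustive search is used here purely to keep the sketch short and deterministic; practical planners typically sample or optimize over action sequences instead.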