New AI Learns Software by Watching Videos

Standard Intelligence has unveiled FDM-1, a foundation model trained on 11 million hours of screen recordings that can learn to operate any software by watching user interactions. The model offers a reported 50-fold increase in visual context, which lets it infer the user actions behind a sequence of video frames. FDM-1 has already been demonstrated using 3D graphics software like Blender, identifying software bugs, and driving a car autonomously, marking a shift from text-based AI to visually grounded task execution.

- Standard Intelligence, the company behind FDM-1, was founded in 2017 and previously focused on AI-powered autonomous checkout for brick-and-mortar retailers, raising $236 million in funding with a valuation of $1 billion as of its Series C round in 2021.
- The model's ability to process long video streams comes from a video encoder that compresses nearly two hours of screen activity into a more manageable size for the AI to analyze, moving beyond traditional methods that rely on static screenshots.
- FDM-1 is trained to predict the next user action by analyzing both the visual information on the screen and the history of prior interactions, allowing it to learn continuous actions like dragging and scrolling.
- The demonstration of FDM-1 controlling a real vehicle in San Francisco was achieved with less than an hour of task-specific fine-tuning, using live visual feeds and keyboard inputs to navigate public streets.
- This type of technology, often categorized as a Vision-Language Model (VLM), represents a shift in AI by integrating computer vision with language understanding to interpret and act upon visual data without task-specific training.
- The model's capability to identify software bugs was demonstrated through automated exploration of user interfaces, showcasing its potential for quality assurance and software testing roles.
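
The next-action training setup described above can be illustrated with a minimal Python sketch. This is not FDM-1's actual pipeline, which the brief does not describe in detail; the `Step` record, the `next_action_examples` helper, and the string placeholders for frames and actions are all hypothetical names chosen for illustration. It only shows the general idea of pairing a window of recent frames and prior actions with the action that follows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One moment in a recorded session (hypothetical structure)."""
    frame: str   # stand-in for an encoded screen frame
    action: str  # the user action recorded right after this frame


Example = Tuple[List[str], List[str], str]


def next_action_examples(session: List[Step], context: int) -> List[Example]:
    """Build (frames, prior_actions, target_action) training examples.

    Each example holds up to `context` most recent frames, the actions
    taken between those frames, and the next action as the target.
    """
    examples: List[Example] = []
    for t in range(len(session)):
        start = max(0, t - context + 1)
        frames = [s.frame for s in session[start:t + 1]]
        prior_actions = [s.action for s in session[start:t]]
        examples.append((frames, prior_actions, session[t].action))
    return examples


session = [
    Step("frame0", "mouse_down"),
    Step("frame1", "drag"),
    Step("frame2", "mouse_up"),
]
examples = next_action_examples(session, context=2)
```

With a window of two frames, the last example pairs `frame1` and `frame2` plus the preceding `drag` with the target `mouse_up`, which is how a continuous gesture such as dragging could be learned as a sequence rather than from a single screenshot.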
