AI2 Releases AutoDiscovery System for Hypothesis Generation
The Allen Institute for AI (AI2) has introduced AutoDiscovery, an AI system that uses surprise-driven exploration to generate novel research hypotheses from large datasets. The system follows an "explore–hypothesize–plan" loop, offering a potential blueprint for agentic reasoning and structured task decomposition. It is designed to analyze data to find hidden connections that human researchers might miss.
- The system's core logic is guided by two key principles: Bayesian surprise, to measure the shift between the AI's prior belief and its posterior belief after an experiment, and Monte Carlo Tree Search (MCTS), to efficiently navigate the vast space of potential hypotheses. This MCTS approach was found to outperform other search mechanisms by up to 29% in discovering surprising hypotheses.
- AutoDiscovery differentiates itself from other AI research systems, like Google's AI co-scientist, by not requiring an initial human-provided research question. It operates in an open-ended, "data-first" manner, generating its own questions directly from a given dataset.
- The system is now available as an experimental feature within AstaLabs, which is part of the Allen Institute for AI's broader "Asta" agentic AI framework for science. The project, formerly known as AutoDS, originated as a research project with open-source code before being integrated into the Asta platform.
- Researchers have already used the system to generate novel findings, including identifying mutual-exclusivity patterns in cancer mutations and uncovering trophic relationships from 20 years of marine ecosystem data. Some of its findings in social science were independently verified and published in a peer-reviewed paper.
- The "explore–hypothesize–plan" loop is a practical implementation of agentic reasoning frameworks, which are foundational structures for coordinating multi-agent AI systems. Other notable open-source frameworks for building such agentic workflows include Microsoft's AutoGen and the popular LangChain library.
- A human evaluation study found that two-thirds of the discoveries made by the system were also considered surprising by domain experts, suggesting its "Bayesian surprise" metric is an effective proxy for human curiosity and scientific novelty.
- For reproducibility and verification, the platform allows users to inspect the full hypothesis, the statistical analysis, and the exact Python code generated and used for each experiment.
- Initial access for new users is being provided through a credit system, with 1,000 "Hypothesis Credits" available to researchers to encourage experimentation on the platform.
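
To make the pairing of Bayesian surprise and MCTS more concrete, here is a minimal sketch. It is not AutoDiscovery's actual code. It assumes a belief is a single probability that a hypothesis holds, measures surprise as the KL divergence between the posterior and prior beliefs (a common definition; the exact formula the system uses is not stated here), and applies the standard UCT rule for the selection step of a tree search. The function names and the exploration constant `c` are hypothetical.

```python
import math


def bayesian_surprise(prior: float, posterior: float) -> float:
    """KL(posterior || prior), in nats, between two Bernoulli beliefs.

    Assumption: a belief is one probability that the hypothesis is true.
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 to avoid log(0)
    p = min(max(prior, eps), 1.0 - eps)
    q = min(max(posterior, eps), 1.0 - eps)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def uct_score(total_reward: float, visits: int, parent_visits: int,
              c: float = 1.4) -> float:
    """Standard UCT: average reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited nodes are always tried first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select_hypothesis(children: dict[str, tuple[float, int]]) -> str:
    """Pick the child hypothesis with the highest UCT score.

    `children` maps a hypothesis label to (total surprise, visit count).
    """
    parent_visits = max(1, sum(v for _, v in children.values()))
    return max(children, key=lambda name: uct_score(*children[name], parent_visits))
```

Under these assumptions, an experiment that moves a belief from 0.5 to 0.9 scores a larger surprise than one that moves it from 0.5 to 0.6, and that score would act as the reward accumulated at each node, so the search spends more of its budget on branches that have produced surprising results while still sampling unexplored ones.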