ASI-Arch finds 106 architectures

- Researchers from Shanghai Jiao Tong University and collaborators said their ASI-Arch system autonomously ran 1,773 experiments and found 106 new linear-attention neural network architectures that beat human-designed baselines.
- The team said ASI-Arch used more than 20,000 GPU hours to generate, code, train, and test candidates end to end, then open-sourced the 106 architectures and the research pipeline.
- Stanford’s 2026 AI Index says U.S. private AI investment still leads China 23-to-1, even as frontier-model performance gaps narrowed sharply. (hai.stanford.edu)

Neural-network architecture is the blueprint of an AI model: it decides how information moves, which parts attend to what, and how much compute each token costs. (arxiv.org) The paper behind this story says an autonomous system called ASI-Arch searched for better blueprints on its own and produced 106 new linear-attention architectures that outperformed the team’s baselines. (arxiv.org)

The authors (Yixiu Liu, Yang Nan, Weixian Xu, Xiangkun Hu, Lyumanshan Ye, Zhen Qin, and Pengfei Liu) posted the paper on arXiv on July 24, 2025. They describe ASI-Arch as a multi-agent system that proposes ideas, writes code, trains models, and evaluates results without a human hand-tuning each candidate. (arxiv.org)

Linear attention is a family of model designs meant to cut the cost of handling long sequences. Standard attention compares every token with every other token, so its cost grows with the square of the context length; linear attention uses approximations so the cost grows more gently as context gets longer. (arxiv.org)

The paper says ASI-Arch ran 1,773 autonomous experiments over more than 20,000 GPU hours. The repository published by the authors says it includes the full pipeline, a database layer, a knowledge base, and all 106 discovered architectures. (arxiv.org) (github.com)

The authors frame this as a shift away from traditional neural architecture search, which usually explores spaces that humans define in advance. Their claim is that ASI-Arch can propose new design concepts instead of only tuning known templates. (arxiv.org)

That claim is still at the preprint stage. arXiv hosts preprints that have not necessarily been peer-reviewed, and the abstract page for this work shows a July 2025 submission rather than a peer-reviewed journal publication. (arxiv.org 1) (arxiv.org 2)

The GitHub repository gives the project a wider footprint than a paper alone. As of April 26, 2026, it showed more than 1,200 stars and described the 106 models as open-sourced linear-attention architectures. (github.com)

The backdrop is a tighter U.S.-China AI race. Stanford’s 2026 AI Index says U.S. private AI investment reached $285.9 billion in 2025 versus China’s $12.4 billion, while top-model performance gaps narrowed enough that the leaders have traded places multiple times since early 2025. (hai.stanford.edu 1) (hai.stanford.edu 2)

ASI-Arch does not show that machines can replace AI researchers across the board. It does show one research loop (invent, implement, train, test, repeat) being pushed further into software, with 106 new model blueprints as the receipt. (arxiv.org) (github.com)
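
For readers who want to see the cost difference between standard and linear attention described above, here is a minimal sketch in plain Python. It is not one of the 106 ASI-Arch architectures and is not taken from the paper or its repository. It assumes one well-known member of the linear-attention family (kernelized attention with an elu(x) + 1 feature map, which is not softmax attention), and every function name below is hypothetical. The first function compares each token with all earlier tokens; the second keeps a fixed-size running state and should produce the same outputs.

```python
import math
import random


def feature_map(x):
    """elu(x) + 1: a positive feature map (an assumption for this sketch)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_causal_attention(queries, keys, values):
    """Pairwise form: token i is scored against every token j <= i."""
    dim_v = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        fq = feature_map(q)
        weights = [dot(fq, feature_map(keys[j])) for j in range(i + 1)]
        total = sum(weights)
        outputs.append([
            sum(w * values[j][c] for j, w in enumerate(weights)) / total
            for c in range(dim_v)
        ])
    return outputs


def linear_causal_attention(queries, keys, values):
    """Recurrent form: one fixed-size state update per token."""
    dim_k, dim_v = len(keys[0]), len(values[0])
    state = [[0.0] * dim_v for _ in range(dim_k)]  # running sum of outer(phi(k), v)
    norm = [0.0] * dim_k                           # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fk = feature_map(k)
        for r in range(dim_k):
            norm[r] += fk[r]
            for c in range(dim_v):
                state[r][c] += fk[r] * v[c]
        fq = feature_map(q)
        denom = dot(fq, norm)
        outputs.append([
            sum(fq[r] * state[r][c] for r in range(dim_k)) / denom
            for c in range(dim_v)
        ])
    return outputs


def random_matrix(rows, cols, rng):
    """Toy input data for the comparison (hypothetical helper)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    rng = random.Random(0)
    n_tokens, dim = 16, 4
    q = random_matrix(n_tokens, dim, rng)
    k = random_matrix(n_tokens, dim, rng)
    v = random_matrix(n_tokens, dim, rng)
    gap = max_abs_diff(quadratic_causal_attention(q, k, v),
                       linear_causal_attention(q, k, v))
    print(f"largest difference between the two forms: {gap:.2e}")
```

For n tokens, the pairwise form computes n(n+1)/2 query-key scores, while the recurrent form performs n state updates whose size depends only on the vector dimensions, not on the context length. That is the sense in which the cost "grows more gently" in this simplified setting.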
