AI systems show rapid gains on key benchmarks

Artificial intelligence systems are posting rapid gains on demanding benchmarks such as MMMU, GPQA, and SWE-bench, according to data from Stanford's 2025 AI Index. These technical advances are driving strategic shifts in business and in national competitiveness. They are also pressuring organizations to integrate advanced AI into their workflows, fueling both innovation and debates over automation.

The performance gains on newer benchmarks are stark. In just one year, AI scores on GPQA, a test of graduate-level reasoning, jumped by 48.9 percentage points. On SWE-bench, a coding benchmark, the success rate soared from 4.4% to 71.7%.

This progress is tightening the field of top AI developers. The performance gap between the best and tenth-best models has been cut in half, and the top two models are now separated by less than a percentage point. The quality gap between leading U.S. and Chinese models has also narrowed to near parity on some key benchmarks.

Industry adoption has accelerated significantly, with 78% of organizations reporting AI use in 2024, a sharp increase from 55% the previous year. The job market mirrors this trend: demand for generative AI skills in U.S. job postings grew nearly fourfold compared with 2023.

Fueling this integration is a massive influx of capital. Global private investment in generative AI alone reached $33.9 billion, an 18.7% increase from 2023. Total corporate investment in AI hit $252.3 billion in 2024, with U.S. private investment far outpacing that of other nations.

At the same time, the cost of AI is plummeting. Driven by more efficient models and hardware, the expense of running an AI system at the level of GPT-3.5 dropped by a factor of more than 280 between late 2022 and late 2024. Hardware energy efficiency has been improving by 40% each year.

As capabilities grow, so do concerns over misuse. AI-related incidents are rising sharply, adding urgency to government responses. In 2024, international organizations including the EU, the OECD, and the UN intensified cooperation on AI governance, releasing frameworks focused on trustworthiness and transparency.
