OpenAI Reports 70% More PRs with AI
OpenAI's Head of Platform Engineering shared internal metrics revealing a massive developer productivity gap. AI-assisted engineers open 70% more pull requests, and 100% of PRs are now reviewed by AI before a human sees them. The paradigm is shifting: developers are becoming "wizards" who manage fleets of agents.
The productivity gains from AI are not just about speed; they represent a fundamental shift in the developer's role. At OpenAI, 95% of engineers now use the company's internal AI tool, Codex, at least weekly. This widespread adoption has cut the average pull request review from a 10-15 minute task to just 2-3 minutes.
This transition redefines the engineer's job from writing code line by line to orchestrating fleets of AI agents. Instead of crafting individual components, developers now manage multiple parallel AI-driven coding threads, focusing on high-level logic and strategic direction. This "wizard" role emphasizes prompt engineering and delegating tasks to AI, a different skill set from traditional software development.
The impact on standard engineering metrics is profound, prompting a re-evaluation of frameworks like DORA (DevOps Research and Assessment). When AI can generate code and recovery scripts in seconds, traditional metrics such as deployment frequency and mean time to recovery (MTTR) lose their original meaning. The focus is shifting from measuring the velocity of human coding to the velocity of validated ideas reaching customers.
This new paradigm extends beyond code generation to the entire development lifecycle, including debugging, testing, and documentation. AI agents can now analyze logs, suggest root causes for incidents, and even generate test cases, significantly reducing manual toil for SRE and DevOps teams. Organizations are reporting tangible benefits, such as a 40-60% reduction in alert noise and 50-70% faster incident resolution.
However, the adoption of AI also introduces new challenges. A 2024 DORA report highlighted a paradox in which a 25% increase in AI adoption was correlated with a 7.2% decrease in delivery stability. This suggests that without mature platform engineering and clear governance, the increased volume of AI-generated code can introduce unforeseen risks, making human oversight and judgment more critical than ever.
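For readers less familiar with the DORA-style metrics mentioned above, the sketch below shows one simplified way deployment frequency, MTTR, and change failure rate (commonly read as a stability signal) could be computed from raw records. It is an illustration only: the `Deployment` and `Incident` record types, the function names, and the sample data are all hypothetical, and real teams define these metrics with more nuance than shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deployment:
    # Hypothetical record: when a change shipped and whether it caused a failure.
    deployed_at: datetime
    failed: bool


@dataclass
class Incident:
    # Hypothetical record: when an incident began and when service was restored.
    started_at: datetime
    resolved_at: datetime


def deployment_frequency(deployments: list[Deployment], window_days: int) -> float:
    """Average number of deployments per day over the observation window."""
    return len(deployments) / window_days


def change_failure_rate(deployments: list[Deployment]) -> float:
    """Fraction of deployments that were marked as failed."""
    if not deployments:
        return 0.0
    return sum(d.failed for d in deployments) / len(deployments)


def mean_time_to_recovery(incidents: list[Incident]) -> timedelta:
    """Average time from incident start to resolution."""
    if not incidents:
        return timedelta(0)
    total = sum((i.resolved_at - i.started_at for i in incidents), timedelta(0))
    return total / len(incidents)


# Made-up sample data: four deployments over two days, one of which failed,
# and two incidents lasting 30 and 90 minutes.
day = datetime(2025, 1, 1)
deployments = [
    Deployment(day + timedelta(hours=9), failed=False),
    Deployment(day + timedelta(hours=15), failed=True),
    Deployment(day + timedelta(days=1, hours=10), failed=False),
    Deployment(day + timedelta(days=1, hours=16), failed=False),
]
incidents = [
    Incident(day + timedelta(hours=15), day + timedelta(hours=15, minutes=30)),
    Incident(day + timedelta(days=1), day + timedelta(days=1, minutes=90)),
]

print(deployment_frequency(deployments, window_days=2))
print(change_failure_rate(deployments))
print(mean_time_to_recovery(incidents))
```

Under these simplified definitions, a team that ships more AI-generated changes can raise deployment frequency while its failure rate also climbs, which is one way to picture how throughput and stability can move in opposite directions.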