YouTube flags multi‑agent code tools
- OpenAI rolled out GPT-5.5 to ChatGPT and Codex on April 23, while Anthropic and Microsoft kept pushing multi-agent coding tools into production.
- Anthropic says Claude Code can now assemble agent teams, and one internal stress test used 16 agents across nearly 2,000 sessions.
- The shift is from one coding model to orchestrated stacks inside editors, terminals, and cloud agents. (code.visualstudio.com)

A coding agent is software that can read a repository, run commands, edit files, and test its own work instead of just suggesting snippets. OpenAI, Anthropic, Microsoft, and Google are now shipping that approach into developer tools. (openai.com) (anthropic.com) (code.visualstudio.com) (ai.google.dev)

The newest hard news is OpenAI’s April 23 launch of GPT-5.5 in ChatGPT and Codex. OpenAI said the model is rolling out to paid ChatGPT tiers and is the recommended choice in Codex for implementation, refactors, debugging, testing, and validation. (openai.com) (developers.openai.com)

OpenAI also said GPT-5.5 scored 82.7% on Terminal-Bench 2.0, up from 75.1% for GPT-5.4. In the same release, the company said GPT-5.5 matches GPT-5.4 latency while improving coding, tool use, and computer-use tasks. (openai.com)

Anthropic has moved in parallel. Its Claude Code product page says the tool reads whole codebases, makes multi-file changes, runs tests, and commits code, and Anthropic says most of its own code is now written by Claude Code. (anthropic.com)

Anthropic’s documentation now splits the work into subagents and agent teams. Subagents handle narrow side tasks in separate context windows, while agent teams coordinate across separate sessions when multiple agents need to work in parallel. (code.claude.com)

The company tied that feature to a model launch in February. In its Claude Opus 4.6 announcement, Anthropic said Claude Code users can “assemble agent teams to work on tasks together.” (anthropic.com)

Anthropic has also published one of the clearest examples of what that looks like at scale. In a stress test, it said 16 agents produced a 100,000-line C compiler written in Rust over nearly 2,000 Claude Code sessions at a cost of about $20,000. (anthropic.com)

Microsoft is turning that vendor competition into a distribution layer. In a February 5 post, the Visual Studio Code team said users can run Claude and Codex agents alongside GitHub Copilot, locally or in the cloud, and manage them in one Agent Sessions view. (code.visualstudio.com)

Google is making the same pitch from the model side. Its Gemini 3 developer materials describe the family as built for agentic workflows and autonomous coding, and Google says Gemini 3 Pro scored 54.2% on Terminal-Bench 2.0. (ai.google.dev) (blog.google)

What the YouTube chatter gets right is the direction of travel, not the unverified product names. Public materials support Codex-native coding models, Claude Code subagents and agent teams, and Gemini models tuned for agentic coding; they do not support the specific claim that Google is preparing “Gemini 3.5 Pro.” (openai.com) (code.claude.com) (blog.google)

The practical change for buyers is visible in the product docs already. OpenAI recommends different Codex models for different jobs, Anthropic routes work to faster subagents like Haiku, and Microsoft is building interfaces to juggle several agents at once. (developers.openai.com) (code.claude.com) (code.visualstudio.com)

The race is no longer about a single chatbot writing one file at a time. It is about which company can supply the model, the agent harness, and the editor workflow that lets several coding agents work in parallel without losing the thread. (openai.com) (anthropic.com) (code.visualstudio.com)
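
For readers who want a concrete picture of the pattern described above, here is a minimal Python sketch of an orchestrator that routes tasks to different agents and runs them in parallel. It is an illustration only, not any vendor's implementation: the task kinds, the agent labels `fast-subagent` and `primary-agent`, and the functions `route`, `run_agent`, and `orchestrate` are all hypothetical, and `run_agent` is a stand-in that makes no real model call.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical routing table: task kind -> agent label.
# These labels are placeholders, not real product or model identifiers.
ROUTES = {
    "search": "fast-subagent",
    "implement": "primary-agent",
    "test": "primary-agent",
}


@dataclass(frozen=True)
class Task:
    kind: str
    description: str


@dataclass(frozen=True)
class Result:
    task: Task
    agent: str
    summary: str


def route(task: Task) -> str:
    """Pick an agent label for a task, defaulting to the primary agent."""
    return ROUTES.get(task.kind, "primary-agent")


def run_agent(task: Task) -> Result:
    """Stand-in for a real agent call; it only records what would be dispatched."""
    agent = route(task)
    return Result(task, agent, f"{agent} handled: {task.description}")


def orchestrate(tasks: list[Task], max_parallel: int = 4) -> list[Result]:
    """Run tasks in parallel and return results in the original task order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # Executor.map yields results in input order, even if work finishes out of order.
        return list(pool.map(run_agent, tasks))


if __name__ == "__main__":
    demo = [
        Task("search", "find callers of parse_config"),
        Task("implement", "rename parse_config to load_config"),
        Task("test", "run the unit tests"),
    ]
    for result in orchestrate(demo):
        print(result.summary)
```

Returning results in submission order is one simple way to keep "the thread" when several agents finish at different times; a real harness would also need shared state, conflict handling, and review steps that this sketch leaves out.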