Tom’s Guide builds 3 apps to compare Claude Code and Codex
- Tom’s Guide published a May 17 hands-on comparison showing Anthropic’s Claude Code and OpenAI’s Codex building three applications with limited human input.
- OpenAI said Codex tasks typically take 1 to 30 minutes, while Anthropic describes Claude Code as “agentic, not autocomplete.”
- Tom’s Guide’s comparison is available on its May 2026 archive page, alongside OpenAI and Anthropic product documentation.

Tom’s Guide published a hands-on comparison on May 17 that tested Anthropic’s Claude Code against OpenAI’s Codex by building three applications with limited human input. The article framed both products as coding agents rather than autocomplete tools, reflecting how Anthropic and OpenAI now describe their own systems in product materials. OpenAI says Codex can handle software tasks in parallel in isolated cloud environments, while Anthropic says Claude Code can read a codebase, make changes across files, run tests and deliver committed code.

### Which products was Tom’s Guide comparing?

Tom’s Guide’s May 17 archive entry lists the piece as “Claude Code vs. OpenAI Codex: I built 3 real apps to find the better agent — here’s the verdict.” The wording matters because both companies now market these products as agents that can take on multi-step engineering work rather than just generate snippets on command. Anthropic says Claude Code is “Anthropic’s agentic coding system” and describes it as software that reads codebases, makes multi-file changes, runs tests and completes tasks autonomously. (tomsguide.com) OpenAI describes Codex as a coding agent for “real engineering work,” including features, refactors, migrations and pull requests, with support for multi-agent workflows. (tomsguide.com)

### What can Codex do without a developer stepping in?

OpenAI said in its May 16, 2025 product launch that Codex can write features, answer questions about a codebase, fix bugs and propose pull requests for review. The company said each task runs in its own cloud sandbox environment preloaded with the repository, and that completion typically takes between 1 and 30 minutes depending on complexity. (anthropic.com) OpenAI also said users can review the results, request revisions, open a GitHub pull request or integrate the changes locally after a task finishes.
The company added that Codex works better when repositories include configured development environments, testing setups and AGENTS.md guidance files.

### How does Anthropic describe Claude Code’s role?

Anthropic says Claude Code can search directories, understand how modules connect, edit files across a codebase and use tools such as the GitHub CLI. (openai.com) The company also says the system can read test failures, fix code and rerun the suite until tests pass. Anthropic’s product page goes further, saying “the majority of code” at Anthropic is now written by Claude Code, with engineers focusing on architecture, product thinking and orchestrating multiple agents in parallel. (openai.com) The company cites customer examples including Stripe deploying Claude Code across 1,370 engineers and Ramp cutting incident investigation time by 80%. (anthropic.com)

### Why are both companies talking about orchestration now?

OpenAI announced the Codex app for macOS on February 2, 2026, describing it as “a command center for agents,” and later updated the announcement to say that Windows support arrived on March 4, 2026. The company said the app was designed to manage multiple agents at once, run work in parallel and support long-running tasks across projects. (anthropic.com) OpenAI said developers are now delegating work, running tasks in parallel and trusting agents with projects that can last hours, days or weeks.

Anthropic uses similar language, saying engineers increasingly direct multiple agents rather than write every line themselves. Tom’s Guide’s three-app test fits that shift because it examined how these systems behave across build, debugging and workflow tasks with less direct intervention. (openai.com) The practical issue, based on the product documents from both companies, is supervision: review, testing and routing remain built into how the tools are supposed to be used.

### Where can readers check the underlying material?

Tom’s Guide’s May 2026 archive page lists the comparison published on May 17. OpenAI’s supporting material includes its Codex launch post from May 16, 2025, the Codex app announcement from February 2, 2026, and its current Codex product and developer pages. Anthropic’s supporting material includes its Claude Code product page and developer resources describing agent workflows and evaluations. (openai.com) (tomsguide.com)