Anthropic Launches Claude 4 Agent

Anthropic launched its Claude 4 model family, which now powers GitHub Copilot and features deep VS Code integration. The Opus 4 and Sonnet 4 models can handle complex tasks, execute code, and maintain memory across sessions, positioning Claude as an agentic "teammate" for developers. The platform also includes an "Agent Teams" feature for orchestrating multi-step tasks, a concept that has been compared to the viral agent framework OpenClaw.

- On the SWE-bench Verified benchmark, which measures performance on real-world software engineering tasks, Claude Opus 4 achieved a 72.5% pass rate, while Sonnet 4 scored 72.7%. For agentic command-line tasks measured by Terminal-bench, Opus 4 scored 43.2%.
- The "Agent Teams" feature operates on a lead agent and sub-agent architecture, where a primary orchestrator agent breaks a task into parallel sub-tasks for multiple independent agents to work on simultaneously. These agents coordinate and share state via a shared filesystem, not a shared context window, and can communicate directly with each other.
- GitHub will use Claude Sonnet 4 to power the new coding agent in all its paid Copilot plans, while the more powerful Claude Opus 4 will be available to Copilot Enterprise and Pro+ subscribers. This is part of a broader strategy by GitHub to offer a choice of models from different providers, including OpenAI and Google.
- The comparison to OpenClaw is significant because OpenClaw is a viral, open-source framework that allows a language model to write and execute its own code to perform real-world tasks, rather than relying on pre-defined integrations. It runs locally and connects to messaging apps like Slack and Telegram, giving the user a persistent, proactive agent.
- When granted filesystem access, Claude 4 can create and reference persistent memory files to track key details across sessions. In one experiment, a model used this capability to play the video game Pokémon Red by generating its own navigation guide and referring back to it during the game.
- Pricing for the API remains consistent with the previous generation: Opus 4 costs $15 per million input tokens and $75 per million output tokens, while Sonnet 4 is priced at $3 per million input and $15 per million output tokens. The models are available via the
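
To make the per-token pricing quoted above concrete, here is a minimal cost-estimate sketch. The prices come from the briefing; the dictionary keys (`opus-4`, `sonnet-4`) and the function name `estimate_cost` are illustrative assumptions, not official API model identifiers or part of any Anthropic SDK.

```python
# Prices in US dollars per million tokens, as quoted in the briefing above.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES_PER_MILLION = {
    "opus-4": {"input": 15, "output": 75},
    "sonnet-4": {"input": 3, "output": 15},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in US dollars."""
    price = PRICES_PER_MILLION[model]
    # Multiply in integers first, then divide once, to limit float rounding.
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    for name in PRICES_PER_MILLION:
        cost = estimate_cost(name, input_tokens=200_000, output_tokens=50_000)
        print(f"{name}: ${cost:.2f}")
```

For a workload of 200,000 input tokens and 50,000 output tokens, this works out to $6.75 on Opus 4 and $1.35 on Sonnet 4, a fivefold difference that follows directly from the listed rates.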

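The lead-agent and sub-agent pattern described in the "Agent Teams" bullet can be illustrated with a toy sketch. This is not Anthropic's implementation: the names `lead_agent` and `sub_agent`, the JSON file layout, and the word-count "work" are all hypothetical stand-ins. It only shows the general idea of parallel workers that coordinate through files on a shared filesystem rather than through a shared context window, and it does not model direct agent-to-agent messaging.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def sub_agent(shared_dir: Path, task_id: int) -> None:
    """Hypothetical sub-agent: read its task file, write a result file."""
    task = json.loads((shared_dir / f"task_{task_id}.json").read_text())
    # Placeholder for real model work: count the words in the sub-task text.
    result = {"task_id": task_id, "word_count": len(task["text"].split())}
    (shared_dir / f"result_{task_id}.json").write_text(json.dumps(result))


def lead_agent(subtasks: list[str]) -> list[int]:
    """Hypothetical lead agent: fan out sub-tasks, then gather results from disk."""
    with tempfile.TemporaryDirectory() as tmp:
        shared_dir = Path(tmp)
        # State is shared through files, not through a common context window.
        for i, text in enumerate(subtasks):
            (shared_dir / f"task_{i}.json").write_text(json.dumps({"text": text}))
        with ThreadPoolExecutor() as pool:
            # list() forces completion and re-raises any worker exception.
            list(pool.map(lambda i: sub_agent(shared_dir, i), range(len(subtasks))))
        return [
            json.loads((shared_dir / f"result_{i}.json").read_text())["word_count"]
            for i in range(len(subtasks))
        ]


if __name__ == "__main__":
    print(lead_agent(["fix the login bug", "add unit tests", "update docs"]))
```

Because each worker touches only its own task and result files, the workers need no locking in this sketch; a real system in which agents edit overlapping files would need some conflict-handling strategy.
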

Shared from Scout - Be the smartest in the room.