Anthropic pilots agent marketplace

- Anthropic said April 24 it ran “Project Deal,” a one-week internal marketplace where Claude agents negotiated purchases and sales for 69 San Francisco employees.
- The agents completed 186 deals worth just over $4,000, and Anthropic found Claude Opus 4.5 secured better outcomes than Haiku 4.5.
- Anthropic also put memory for Claude Managed Agents into public beta on April 23. (anthropic.com)

Anthropic said on April 24 that it ran a one-week internal marketplace where Claude agents bought, sold and negotiated on behalf of 69 employees. (anthropic.com)

The company described the test, called Project Deal, as a Craigslist-style market inside its San Francisco office, with each participant’s agent given $100 to spend. Employees first told Claude what they might want to buy or sell, then the agents posted listings and haggled with each other. (anthropic.com)

At the end of the week, the agents had completed 186 deals worth just over $4,000, covering physical items from a snowboard to a bag of 19 ping-pong balls. The goods were then actually exchanged by the employees whose agents made the deals. (anthropic.com)

Anthropic also ran a parallel version of the experiment in secret to test whether model quality changed results. People represented by Claude Opus 4.5 got objectively better outcomes than people represented by Claude Haiku 4.5, and Anthropic said the weaker-model group did not notice the gap in post-experiment surveys. (anthropic.com)

The test was not a public product launch. Anthropic called it a pilot with a self-selected participant pool, but framed it as an early look at what happens when software agents negotiate with each other instead of only answering prompts for humans. (anthropic.com)

A day earlier, on April 23, Anthropic added built-in memory for Claude Managed Agents in public beta. The feature lets developers store memories as files that agents can export, edit through the application programming interface, audit and roll back. (claude.com)

Anthropic said the memory system is designed for long-running agents that improve across sessions and can share what they learn with other agents under scoped permissions. The company said Netflix uses it to carry context and human corrections across sessions, while Rakuten cut first-pass errors by 97% and Wisedocs sped document verification by 30%. (claude.com)

Taken together, the two announcements show Anthropic pushing beyond chatbots toward agents that remember past work and complete real transactions. Project Deal’s result was narrower: better models extracted better terms, even when the people they represented did not realize it. (anthropic.com) (claude.com)
