Anthropic ran internal agent marketplace — 69 agents completed about 186 deals

- Anthropic said on April 24 that its weeklong “Project Deal” let 69 Claude agents trade real goods for employees, closing 186 deals worth just over $4,000. - The sharpest finding was hidden: Claude Opus 4.5 consistently got better outcomes than Claude Haiku 4.5, and weaker-model users usually didn’t notice. - That matters because agent commerce looks technically viable now, but fairness, permissions, and dispute rules still look underbuilt.

Anthropic just showed what an AI-to-AI marketplace actually looks like when you let it touch real stuff and real money. Not a demo. Not a toy benchmark. A weeklong internal market where Claude agents bargained over snowboards, ping-pong balls, dog-sitting time, and other physical items for 69 employees. The big takeaway is simple — the agents mostly worked. The more unsettling takeaway is that better models quietly won more value for their users, and the people on the losing side often couldn’t tell. (anthropic.com) ### What was Project Deal? Project Deal was Anthropic’s internal classifieds market, run in its San Francisco office and disclosed on April 24, 2026. Employees told Claude what they might want to buy or sell, got a $100 budget to spend, and then the agents handled the posting, bargaining, and dealmaking for a week. Afterward, the humans actually exchanged the goods their agents had negotiated over. (anthropic.com) ### What did the agents actually do? They did the boring middle of commerce — listing items, replying to offers, countering, remembering preferences, and trying to close. Across more than 500 listed items, the 69 agents completed 186 deals with total transaction value of just over $4,000. That matters because this is exactly the kind of low-stakes workflow people keep saying agents will absorb(anthropic.com)ng. (anthropic.com) ### Why is this more than a gimmick? Because the goods were real and the autonomy was real. Anthropic wasn’t asking whether a model can role-play a negotiation in a chat window. It was asking whether a model can represent a human’s interests well enough that the human is still happy once money and objects change hands. The answer, at least here, was mostly yes — enough that 46% of participants said they’d pay for a similar service in the future. (anthropic.com) ### Where did the hidden inequality show up? Anthropic also ran a secret parallel test using different Claude models. Some people got the stronger Claude Opus 4.5. Others got the smaller Claude Haiku 4.5. Same basic task, same kind of market, but different negotiating brains. The stronger model got objectively better outcomes, while users represented by weaker models generally didn’t realize t(anthropic.com)ople. (anthropic.com) ### Why is that such a big deal? Because it suggests agent markets could create invisible quality tiers. If your agent is better at reading context, timing offers, or pressing for concessions, you may get better prices without either side noticing why. It’s a little like showing up to every flea market with a sharper negotiator in your ear — except eventually both negotiators are software, and the human can’t easily audit what happened. (anthropic.com) ### Did Anthropic say this is a product? No — this was framed as an experiment, not a launch. But it fits a broader pattern at Anthropic: giving Claude more room to act in the world, then studying where autonomy helps and where it breaks. Project Deal follows earlier work like Project Vend, where Claude ran a small office business, and it lines up with Anthropic’s recent push around “trustworthy agents.” (anthropic.com) ### So what’s missing before this goes mainstream? The plumbing. Permissions. Spending limits. Identity checks. Dispute resolution. Audit trails. If agents start buying and selling for people at scale, the hard part won’t just be getting them to negotiate. It’ll be proving whose instructions they followed, who is liable when they mess up, and whether one class of users is quietly getting outplayed by better models. (anthropic.com) ### Bottom line? Project Deal makes agent commerce feel less hypothetical. The agents were good enough to move real inventory and leave many users satisfied. But the experiment also showed the first real shape of the problem — once agents represent us in markets, model quality stops being just a product feature and starts looking a lot like bargaining power. (anthropic.com)

Get your own daily briefing

Scout delivers personalized news, insights, and conversations tailored to your role and industry.

Download on the App Store

Shared from Scout - Be the smartest in the room.