Claude AI Agents Closed 186 Deals in Anthropic Marketplace Experiment


Anthropic says Claude AI agents successfully negotiated and closed 186 real-world deals in an internal marketplace experiment that tested how AI systems might buy and sell goods on behalf of humans. The experiment, called Project Deal, involved 69 Anthropic employees and more than 500 listed items.

The agents handled the full negotiation process inside Slack, including listings, offers, counteroffers, and final agreements. Anthropic said the deals reached just over $4,000 in total transaction value, with employees later exchanging the real physical goods.

The experiment also raised a larger concern for future AI-driven commerce. Anthropic found that employees represented by its stronger Claude Opus 4.5 model received better financial outcomes than those represented by Claude Haiku 4.5, even though many users did not notice the difference.

Anthropic tested agent-to-agent commerce with real goods

Project Deal ran for one week in December 2025 at Anthropic’s San Francisco office. The company designed it like a classified marketplace where AI agents represented both buyers and sellers.

Each participant first completed an interview with Claude. During that intake process, employees described items they wanted to sell, things they wanted to buy, expected prices, negotiation preferences, and any instructions for their agent.

Anthropic then converted those answers into custom system prompts. After that, the agents entered Slack channels and negotiated without human approval during the bargaining process.
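The intake step above can be sketched as a small function that turns interview answers into a per-participant system prompt. The field names and prompt template here are assumptions for illustration; Anthropic has not published its actual format.

```python
# Minimal sketch of the intake-to-agent setup described above.
# Field names and the prompt wording are hypothetical, not Anthropic's.

def build_system_prompt(intake: dict) -> str:
    """Turn a participant's interview answers into a negotiation system prompt."""
    lines = [
        f"You represent {intake['name']} in an internal marketplace.",
        f"Items to sell: {', '.join(intake['selling'])}.",
        f"Items wanted: {', '.join(intake['buying'])}.",
        f"Price expectations: {intake['price_expectations']}.",
        f"Negotiation style: {intake['style']}.",
    ]
    if intake.get("instructions"):  # optional free-form guidance from the employee
        lines.append(f"Extra instructions: {intake['instructions']}")
    return "\n".join(lines)

prompt = build_system_prompt({
    "name": "Alex",
    "selling": ["folding bike"],
    "buying": ["snowboard"],
    "price_expectations": "bike at $50 or more, snowboard at $120 or less",
    "style": "polite but firm",
})
print(prompt)
```

In a setup like this, each agent would then carry its own prompt into the shared Slack channels, which matches the article's description of per-person custom agents.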

How Project Deal worked

  • Employee interview: Claude asked participants what they wanted to buy or sell.
  • Agent setup: Anthropic created a custom Claude agent for each person.
  • Marketplace launch: Agents posted listings and offers in Slack.
  • Negotiation: Agents made offers, counteroffers, and final deals.
  • Deal execution: Humans later exchanged the physical items.
  • Parallel test: Anthropic compared Opus 4.5 and Haiku 4.5 agents.

The listed items ranged from everyday office clutter to personal goods. Anthropic mentioned examples such as a snowboard, a folding bike, lab-grown rubies, artwork, dog-sitting experiences, and a plastic bag containing 19 ping-pong balls.

The company said the experiment used four separate marketplace runs. Two runs used only Claude Opus 4.5, while two mixed Opus 4.5 and Haiku 4.5 agents to test whether model quality changed negotiation results.

The “real” run, where employees later exchanged goods, used Opus agents. Anthropic kept some details hidden from participants until after the post-experiment survey to measure how people perceived agent performance.

Stronger Claude agents won better deals

Anthropic found that agent quality affected marketplace outcomes. In the mixed-model runs, Opus users completed about two more deals on average than Haiku users.

The stronger model also negotiated better prices. Anthropic said Opus sellers earned $2.68 more on average for the same item, while Opus buyers paid $2.45 less on average.

The difference became clearer in individual examples. Anthropic said the same broken folding bike sold for $65 when represented by Opus but only $38 when represented by Haiku. A lab-grown ruby also sold for $65 under Opus and $35 under Haiku.

Users did not clearly detect the disadvantage

Anthropic said participants generally rated the deals as fair. On a seven-point fairness scale, deal ratings stayed close to the midpoint, with Opus deals averaging 4.05 and Haiku deals averaging 4.06.

That result matters because weaker-model users often received worse outcomes without clearly recognizing the disadvantage. Anthropic said this could create a quiet form of inequality if agent-driven marketplaces become common.

The finding suggests that future online markets may not only depend on what people want to buy or sell. They may also depend on which AI agent represents them and how capable that agent is during negotiation.

Prompting mattered less than model quality

Anthropic also tested whether user instructions changed negotiation outcomes. Some employees asked their agents to behave politely, while others told them to negotiate aggressively.

The company found that aggressive prompting did not have a statistically significant impact on whether items sold, how much sellers earned, or how much buyers paid. Anthropic said model quality mattered more than negotiation style in this pilot.
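Testing whether a prompting style "had a statistically significant impact" typically means comparing outcomes between the two groups. A minimal sketch of one such check is a permutation test on sale prices; the prices below are made up for illustration, since Anthropic's underlying data is not public.

```python
# Illustrative permutation test for the kind of comparison described above:
# did "aggressive" vs "polite" prompting change sale prices?
# All numbers are hypothetical, for illustration only.
import random
from statistics import mean

random.seed(0)
aggressive = [62, 48, 55, 70, 41, 58]  # hypothetical sale prices ($)
polite = [60, 50, 57, 66, 44, 53]

observed = mean(aggressive) - mean(polite)
pooled = aggressive + polite
n = len(aggressive)

# Shuffle group labels many times and count how often a difference
# at least as large as the observed one arises by chance.
extreme = 0
trials = 10_000
for _ in range(trials):
    random.shuffle(pooled)
    diff = mean(pooled[:n]) - mean(pooled[n:])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / trials
print(f"observed difference: {observed:.2f}, p = {p_value:.3f}")
```

A large p-value here would mean the observed gap is consistent with chance, which is the kind of result the article describes for aggressive prompting.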

That finding could shape how businesses design AI shopping and sales agents. Better models may deliver stronger results than simple prompt tweaks, especially when agents must reason across preferences, prices, context, and tradeoffs.

Why the experiment matters

Project Deal shows that AI agents can handle real commercial negotiation, not only simulated tasks. The agents identified possible matches, wrote natural-language offers, negotiated prices, and completed transactions without live human intervention.

The experiment also points to risks that regulators, platforms, and companies may need to address. If one side has a stronger AI agent, it may win better terms while the other side still feels the deal was fair.

That could affect consumer marketplaces, procurement, travel booking, hiring, real estate, software sales, and other areas where AI agents may soon negotiate for humans.

Key findings from Project Deal

  • Anthropic recruited 69 employees for the experiment.
  • Claude agents handled more than 500 listed items.
  • The agents closed 186 deals.
  • Total transaction value passed $4,000.
  • Opus agents completed about two more deals on average than Haiku agents.
  • Opus sellers earned $2.68 more per item on average.
  • Opus buyers paid $2.45 less per item on average.
  • Participants did not clearly detect weaker-model disadvantages.
  • Anthropic described the project as a pilot experiment, not a product launch.

Summary

  1. Anthropic tested a Claude-run internal marketplace called Project Deal.
  2. Claude agents negotiated and closed 186 real deals worth more than $4,000.
  3. The experiment involved 69 employees and more than 500 listed items.
  4. Claude Opus 4.5 agents outperformed Claude Haiku 4.5 agents in mixed-market tests.
  5. The results show both the promise and fairness risks of AI-mediated commerce.

FAQ

What is Anthropic’s Project Deal?

Project Deal is an internal Anthropic experiment where Claude AI agents negotiated real marketplace transactions for employees.

How many deals did Claude agents close?

Anthropic said the agents closed 186 deals with a total transaction value of just over $4,000.

Did humans approve each deal during negotiation?

No. Once the experiment began, agents negotiated and closed deals without asking humans for approval during the process.

Which Claude models were tested?

Anthropic tested Claude Opus 4.5 and Claude Haiku 4.5 in parallel marketplace runs.
