Anthropic Exposes Chinese AI Labs' Claude Distillation Attacks with 16M+ Exchanges


Anthropic accused DeepSeek, Moonshot AI, and MiniMax of orchestrating large-scale distillation attacks against Claude models. The Chinese labs used roughly 24,000 fake accounts to extract over 16 million exchanges in violation of Anthropic's terms of service, bypassing regional restrictions through proxy networks and "hydra" account clusters.

Distillation transfers capabilities from a large "teacher" model to a smaller, cheaper "student" copy. The technique is legitimate when a lab distills its own models internally; it becomes illicit when used to rapidly extract a competitor's frontier reasoning, coding, and agentic skills. Distilled clones also lack the U.S. labs' safety safeguards against bioweapon or cyber misuse.
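Conceptually, the teacher-to-student transfer boils down to training the student to match the teacher's output distribution. The sketch below shows the standard knowledge-distillation loss (KL divergence on temperature-softened outputs) as a generic illustration; it is not Anthropic's or the attackers' actual pipeline, and the logit values are made up:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on softened distributions -- the classic
    knowledge-distillation objective. The student minimizes this to
    mimic the teacher's behavior."""
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))        # ~0.0: student matches teacher
print(distillation_loss([0.0, 0.0, 0.0], teacher) > 0)  # True: mismatch is penalized
```

A student trained on millions of such teacher outputs is what makes extraction at the reported 16M-exchange scale valuable: each exchange is another soft target.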

DeepSeek ran 150,000+ exchanges targeting reasoning and censorship circumvention. Moonshot AI extracted 3.4 million interactions focused on tool use and computer vision. MiniMax conducted the largest campaign with 13 million exchanges pivoting to new Claude versions within 24 hours.

Commercial proxy services resold API access to circumvent the China restrictions. Fraudulent accounts mixed distillation traffic with legitimate requests to evade detection. IP correlations, payment patterns, and researcher metadata confirmed the attributions.

Anthropic deployed chain-of-thought classifiers and behavioral fingerprinting for detection. Industry sharing and tightened verification followed immediately.

Attack Campaigns Table

Lab          Exchanges  Primary Targets           Tactics
DeepSeek     150,000+   Reasoning, reward models  Synchronized accounts, shared payments
Moonshot AI  3.4M       Agentic coding, vision    Multi-path access, reasoning traces
MiniMax      13M        Tool orchestration        24-hour model pivots, hydra clusters

Distilled Claude copies risk military integration or open-sourcing without safeguards. Capabilities spread uncontrollably once extracted.

Distillation Impact Details

Fake accounts mimicked legitimate researchers and businesses, while commercial proxy services scaled the operations globally. Anthropic's ban on access from China forced these sophisticated workarounds.

Anthropic warns of dual-use risks: extracted reasoning capabilities could power surveillance, cyber weapons, or bioweapon design without Constitutional AI limits.

The labs systematically reconstructed Claude's chain-of-thought patterns. Extracting its agentic tool use and computer vision capabilities gives cloned models an immediate path to autonomous operation.

Detection Methods Deployed

  • Chain-of-thought elicitation classifiers
  • Behavioral fingerprinting across account clusters
  • IP/payment metadata correlation
  • Infrastructure fingerprint matching
  • Industry-wide indicator sharing
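As a rough illustration of one item on this list, behavioral fingerprinting can be sketched as flagging account pairs whose request patterns are near-identical. This is a toy example with invented feature vectors; Anthropic's actual classifiers are not public:

```python
import numpy as np
from itertools import combinations

def flag_coordinated_accounts(features, threshold=0.98):
    """Flag pairs of accounts whose request-pattern feature vectors
    (e.g., per-account prompt-template frequencies) are nearly
    identical by cosine similarity -- a toy stand-in for behavioral
    fingerprinting across account clusters."""
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(features), 2):
        a, b = np.asarray(a, float), np.asarray(b, float)
        cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
        if cos >= threshold:
            flagged.append((i, j))
    return flagged

# Accounts 0 and 1 send near-identical traffic; account 2 looks organic.
feats = [[10, 90, 5], [11, 88, 4], [40, 10, 60]]
print(flag_coordinated_accounts(feats))  # [(0, 1)]
```

Real deployments would combine such signals with the IP, payment, and infrastructure correlations listed above rather than rely on any single feature.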

Single-lab defense proves insufficient, and Anthropic calls for cloud provider coordination and policy support. The scale of these distillation campaigns also lends weight to U.S. chip export controls.

OpenAI previously reported similar DeepSeek attacks against ChatGPT, suggesting distillation is becoming standard practice among frontier competitors.

Anthropic invests heavily in proactive systems. Educational/research account verification tightens significantly. No single company solves distillation alone.

FAQ

What is AI model distillation?

A smaller model learns from a larger "teacher" model's outputs. Illicit use steals a competitor's capabilities cheaply.

How many exchanges did attackers extract?

Over 16 million in total: MiniMax was largest at 13 million, followed by Moonshot AI at 3.4 million and DeepSeek at 150,000+.

Which Claude capabilities were targeted?

Reasoning, agentic coding, tool use, computer vision, censorship evasion.

How did labs bypass China restrictions?

Commercial proxy services reselling API access through fake account networks.

What safety risks do distilled models create?

Distilled models lack U.S. safeguards against bioweapon development, cyber operations, and military misuse.

What actions did Anthropic take?

It deployed detection classifiers, tightened account verification, and shared threat indicators across the industry.
