Claude Hit by Massive Distillation Attack
AI safety company Anthropic has disclosed a large-scale campaign of distillation attacks targeting its flagship model, Claude. The company published technical findings explaining how coordinated actors attempted to extract advanced capabilities from the system through automated, fraudulent access.
What Are Distillation Attacks?
Distillation is a legitimate machine learning technique where a smaller model is trained using outputs from a more advanced model. It is commonly used to create faster and cheaper versions of large systems.
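For readers unfamiliar with the technique, the sketch below shows conventional distillation in PyTorch: a small student network is trained to match the softened output distribution of a frozen teacher. The architectures, temperature, and random data are illustrative placeholders, not details from Anthropic's report.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative stand-ins for a large "teacher" and a small "student".
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))
teacher.eval()  # the teacher is frozen; only its outputs are used

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 2.0  # temperature softens the teacher's distribution

for step in range(100):
    x = torch.randn(64, 128)  # placeholder inputs
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # Classic distillation loss: KL divergence between the softened
    # student and teacher distributions, scaled by T^2.
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```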
However, in this case, Anthropic identified what it calls distillation attacks: generating massive volumes of structured prompts to capture reasoning patterns, coding ability, and task-execution behavior from Claude.
Instead of normal user queries, attackers submitted highly repetitive prompts designed to elicit high-quality training data. The responses were then reportedly used to improve external models without the underlying research investment.
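Anthropic has not published the attackers' tooling, so the following is only a speculative sketch of the general pattern described: harvested prompt/response exchanges are written out as supervised fine-tuning pairs in the JSONL format common to open training stacks. The example exchange and file name are hypothetical.

```python
import json

# Hypothetical harvested exchanges; in the campaigns described, these
# would come from millions of automated API calls.
exchanges = [
    {
        "prompt": "Explain step by step how to balance a binary search tree.",
        "response": "1. Compute subtree heights... 2. Rotate the unbalanced node...",
    },
]

# Each exchange becomes one supervised fine-tuning record.
with open("distilled_sft.jsonl", "w") as f:
    for ex in exchanges:
        record = {
            "messages": [
                {"role": "user", "content": ex["prompt"]},
                {"role": "assistant", "content": ex["response"]},
            ]
        }
        f.write(json.dumps(record) + "\n")
```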
Scale of the Activity
According to Anthropic, the campaign involved tens of thousands of coordinated accounts and millions of API interactions. The traffic patterns differed significantly from normal user behavior. Investigators identified clusters of activity tied to shared infrastructure, proxy networks, and automation frameworks.
The requests focused heavily on advanced reasoning, tool usage, and structured outputs, which are particularly valuable for training competitive large language models. These usage patterns allowed Anthropic to detect anomalies through behavioral monitoring systems.
Attributed Distillation Campaigns
| Lab | Estimated Scale | Primary Targets | Key Techniques | Attribution Method | Notable Behavior |
|---|---|---|---|---|---|
| DeepSeek | Over 150,000 exchanges | Reasoning tasks; rubric-based grading; censorship-safe query reformulation | Chain-of-thought extraction prompts; synchronized traffic; shared payment methods | Request metadata traced to researchers | Generated step-by-step reasoning data; coordinated load balancing to avoid detection |
| Moonshot AI | Over 3.4 million exchanges | Agentic reasoning; coding; data analysis; computer-use agents; computer vision | Hundreds of fraudulent accounts; reasoning-trace reconstruction | Metadata matched public staff profiles | Varied account types to evade clustering detection |
| MiniMax | Over 13 million exchanges | Agentic coding; tool orchestration | Infrastructure-coordinated traffic; rapid pivot to new model releases | Metadata and infrastructure indicators | Redirected nearly half of its traffic to a newly released Claude version within 24 hours |
Anthropic stated that one campaign showed synchronized traffic across accounts with identical interaction patterns and coordinated timing, suggesting throughput optimization and deliberate evasion. In several cases, prompts explicitly asked Claude to articulate its reasoning step by step, effectively generating chain-of-thought training data at scale.
How the Attacks Were Detected
Anthropic’s security team relied on a combination of metadata analysis, traffic-pattern recognition, and infrastructure fingerprinting. Suspicious activity included synchronized account creation, high-volume repetition of structured prompts, and non-human interaction timing.
By analyzing IP correlations, backend telemetry, and infrastructure overlaps, the company was able to attribute campaigns to specific coordinated groups with high confidence.
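Anthropic has not released its detection code. As a rough illustration of two of the signals it describes, the sketch below flags accounts whose request timing is too regular to be human and clusters accounts that share infrastructure. All thresholds and field names are invented.

```python
from collections import defaultdict
from statistics import pstdev

def flag_nonhuman_timing(timestamps, min_requests=50, max_jitter=0.5):
    """Flag an account whose inter-request gaps are suspiciously regular.

    Human traffic is bursty, with high-variance gaps; scripted
    extraction often fires on a near-fixed schedule. The thresholds
    here are illustrative, not Anthropic's.
    """
    if len(timestamps) < min_requests:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return pstdev(gaps) < max_jitter  # gap variability, in seconds

def cluster_by_infrastructure(accounts):
    """Group accounts sharing an IP prefix, TLS fingerprint, or payment
    method (the kinds of overlap the findings use for attribution)."""
    clusters = defaultdict(set)
    for acct in accounts:
        for key in (acct["ip_prefix"], acct["tls_fingerprint"], acct["payment_id"]):
            clusters[key].add(acct["id"])
    # Only widely shared keys suggest a coordinated campaign.
    return {k: ids for k, ids in clusters.items() if len(ids) >= 10}
```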
Security and National Risk Concerns
Anthropic warns that unauthorized distillation can weaken AI safety protections. Models trained through extraction may replicate advanced capabilities without inheriting the safety guardrails built into the original system. This increases risk in areas such as automated cyber offense, large-scale misinformation, and sensitive research assistance.
The company also highlighted concerns related to export controls. If foreign entities can replicate frontier AI capabilities through extraction, regulatory restrictions on advanced hardware and AI systems may lose effectiveness.
Defensive Measures Going Forward
To prevent further distillation attacks, Anthropic is implementing stronger account verification systems and advanced behavioral classifiers to detect automated extraction patterns. The company is also collaborating with industry partners to share threat intelligence indicators and coordinate response strategies.
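What a shared threat-intelligence indicator for such a campaign might contain is sketched below. The schema is entirely hypothetical; neither Anthropic nor its partners have published one.

```python
# Hypothetical indicator record; every field is invented for illustration.
indicator = {
    "indicator_type": "distillation-campaign",
    "first_seen": "2026-01-12T08:00:00Z",
    "signals": {
        "shared_ip_prefixes": ["203.0.113.0/24"],  # RFC 5737 documentation range
        "account_creation_burst": {"window_minutes": 10, "accounts": 312},
        "prompt_template_hash": "sha256:<hash of repeated extraction template>",
    },
    "recommended_action": "rate-limit and require account re-verification",
}
```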
Anthropic emphasized that defending frontier AI systems requires coordinated action between AI labs, cloud providers, and policymakers. As AI capabilities continue to grow, protecting model integrity has become a core cybersecurity priority.
Distillation attacks are now emerging as a new category of AI security threat, signaling that model extraction and intellectual property protection will be major challenges in the next phase of AI development.
> We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.
>
> — Anthropic (@AnthropicAI) February 23, 2026