Anthropic claims Chinese AI firms are stealing its Claude model.


The AI firm alleges that DeepSeek, Moonshot, and MiniMax used fake accounts and proxy services to extract Claude's capabilities at scale, even as industry experts note that the AI sector itself routinely trains on publicly available data.


Anthropic has accused three Chinese AI developers of running large-scale operations to illegally extract capabilities from its Claude model in order to improve their own AI systems. The company alleges that DeepSeek, Moonshot, and MiniMax used "distillation," a technique in which a smaller or less capable model is trained on the outputs of a more advanced one.
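Distillation itself is a standard, openly documented training technique. As a toy illustration only (this is not Anthropic's or the accused firms' actual pipeline, and all models and data below are synthetic), the sketch trains a small "student" classifier purely on the soft-label outputs of a fixed "teacher" model:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# "Teacher": a fixed model standing in for a large API-served model.
# Its parameters are never seen by the student -- only its outputs.
W_teacher = rng.normal(size=(5, 3))
X = rng.normal(size=(500, 5))
teacher_probs = softmax(X @ W_teacher / 2.0)  # temperature 2.0 softens labels

# "Student": trained only on the teacher's output distributions,
# by gradient descent on the cross-entropy against those soft labels.
W_student = np.zeros((5, 3))
lr = 0.5
for _ in range(300):
    student_probs = softmax(X @ W_student)
    grad = X.T @ (student_probs - teacher_probs) / len(X)
    W_student -= lr * grad

# How often the student now agrees with the teacher's top prediction.
agreement = (softmax(X @ W_student).argmax(axis=1)
             == teacher_probs.argmax(axis=1)).mean()
```

The point of the sketch is that the student never needs the teacher's weights: large volumes of input/output pairs are enough, which is why API access at scale is the contested resource in these allegations.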

These campaigns reportedly generated more than 16 million interactions with Claude through roughly 24,000 fake accounts, in violation of Anthropic's terms of service and geographic access restrictions.

Anthropic stated that it does not provide commercial access to Claude in China, nor to the foreign subsidiaries of these accused companies.

How Claude’s Advanced Functions Were Extracted Systematically

Anthropic detailed that the three campaigns aimed at distillation followed a similar strategy: using fake accounts and proxy networks to gain large-scale access to Claude while avoiding detection. Their primary targets were Claude’s agentic reasoning, tool utilization, and coding abilities.

The DeepSeek operation involved more than 150,000 interactions, primarily focused on extracting reasoning abilities across a variety of tasks. This activity exhibited synchronized traffic across accounts, marked by identical patterns, shared payment methods, and coordinated timing, suggesting an effort to balance load, boost reliability, and elude detection.

Moonshot AI's activity spanned more than 3.4 million interactions, focused on agentic reasoning and tool use, coding and data analysis, building computer-use agents, and computer vision to reconstruct Claude's reasoning processes. MiniMax's campaign was the largest, at more than 13 million interactions, and explicitly targeted agentic coding, tool use, and orchestration. Anthropic said that while it was tracking the active campaign, MiniMax shifted nearly half of its traffic to Claude's newly launched model within 24 hours.

Anthropic said the companies ran these campaigns through commercial proxy services, sometimes called "hydra cluster" architectures, which resell large-scale access to Claude and other leading AI models.

Revisiting the Fundamentals of AI Model Training

Industry specialists highlight that these allegations bring forth a broader, yet unresolved, issue concerning the training methods for AI systems. The majority of large language models, including prominent commercial ones, are themselves trained using immense volumes of publicly available internet data, often without explicit consent from the original content creators.

“Similar to how many foundational models were constructed by indexing the vastness of the internet, frequently without the explicit permission of creators or by leveraging content from other search engines, newer participants are often adopting comparable methods of distillation and optimization,” commented Neil Shah, vice president at Counterpoint Research. He further noted a fundamental disagreement, largely legally undefined, regarding the ownership of synthetic data and the permissibility of its use for training, particularly for open models.

Export Restrictions and National Security Concerns

Anthropic has partly framed these alleged distillation operations through a national security perspective, asserting that illegally replicated models could undermine U.S. efforts to manage the dissemination of advanced AI capabilities, especially if influenced by the Chinese Communist Party. However, experts point out that current U.S. export controls primarily target hardware, rather than large language models themselves.

“It’s crucial to distinguish between hardware limitations and service access. U.S. export regulations have largely focused on advanced semiconductors, high-performance computing infrastructure, and, at specific regulatory junctures, certain categories of advanced AI model weights. There is no blanket prohibition on offering API access to large language models within China,” explained Sanchit Vir Gogia, CEO and chief analyst at Greyhound Research.

Nevertheless, this does not grant immunity to developers. Gogia further noted that the Bureau of Industry and Security continues to refine its licensing frameworks for advanced computing products and high-capacity systems. Furthermore, if a company knowingly supports training activities for restricted entities, particularly those connected to military or strategic objectives, potential liability could arise even without any physical hardware shipments.

To protect themselves, many U.S. AI providers already limit access in China through internal business policies and compliance stances, often exceeding what is legally mandated.

"For developers, the danger is indirect yet tangible: if your product facilitates access to restricted territories or entities, enables prohibited uses, or assists others in circumventing provider geo-restrictions, you risk account termination, contractual penalties, and potentially regulatory scrutiny, depending on the end-user and the system's capabilities," said Jaju, global partner and senior managing director – India at Ankura Consulting.

Repercussions for Teams Developing with Large Language Models

For developers who create or train models using large language models, Anthropic’s accusations underscore an increasingly ambiguous territory. While developers commonly leverage LLM APIs for creating applications, testing, or evaluation, providers are now closely examining the large-scale, automated use of model outputs for training rival systems.

For example, Anthropic is responding by investing in protective measures. To detect such activities, the company has developed several classifiers and behavioral fingerprinting systems designed to identify patterns indicative of distillation attacks in API traffic. It has also tightened verification processes for educational accounts, security research programs, and startup organizations, citing these as the most frequent avenues for establishing fraudulent accounts. Additionally, the company is implementing product, API, and model-level safeguards intended to diminish the effectiveness of model outputs for illicit distillation, without degrading the user experience for legitimate clients.
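Anthropic has not published how its classifiers or fingerprinting systems work. One of the signals it describes, coordinated and machine-regular request timing, can be illustrated with a toy heuristic: accounts that send high volumes of requests at suspiciously uniform intervals look scripted. The function below is an assumption-laden sketch (the name, thresholds, and features are invented for illustration, not Anthropic's method):

```python
from collections import defaultdict
from statistics import mean, pstdev

def flag_automated_accounts(events, min_requests=50, max_jitter=0.05):
    """Flag accounts whose request timing looks machine-generated.

    events: iterable of (account_id, unix_timestamp) pairs.
    Returns the set of account ids with high volume and near-constant
    inter-request intervals (low jitter), a crude stand-in for the
    behavioral fingerprinting described in the article.
    """
    times = defaultdict(list)
    for account, ts in events:
        times[account].append(ts)

    flagged = set()
    for account, ts in times.items():
        if len(ts) < min_requests:
            continue
        ts.sort()
        gaps = [b - a for a, b in zip(ts, ts[1:])]
        avg = mean(gaps)
        # Coefficient of variation of the gaps: scripted traffic
        # tends toward 0, human traffic is far noisier.
        if avg > 0 and pstdev(gaps) / avg < max_jitter:
            flagged.add(account)
    return flagged
```

A production system would combine many such features (payment methods, payload similarity, cross-account synchronization) rather than timing alone, per the patterns Anthropic says it observed.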

Developers, too, must ensure their model training practices remain secure, compliant, and defensible.

Jaju advised that, as a starting point, developers should review API/service terms and assume that training on outputs is prohibited unless explicitly stated otherwise. They should maintain precise records detailing the origin of every training/example item, complete with licensing and terms. Operational logs should be kept separate from training datasets, and retention limits should be established for both.
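One way to make that provenance logging concrete is a per-item record that captures source, license, and retention in one place. The schema below is an illustrative sketch only; the field names are assumptions, not an established standard:

```python
from dataclasses import dataclass, asdict
from datetime import date

@dataclass(frozen=True)
class ProvenanceRecord:
    """One entry in a training-data provenance log (illustrative schema)."""
    item_id: str          # stable id of the training example
    source: str           # where the item came from (URL, dataset, API)
    license: str          # license or contractual basis for use
    terms_reviewed: date  # when the governing terms were last checked
    retain_until: date    # retention limit for this item

# Example entry; the URL is a placeholder.
record = ProvenanceRecord(
    item_id="ex-0001",
    source="https://example.com/corpus/doc-17",
    license="CC-BY-4.0",
    terms_reviewed=date(2025, 6, 1),
    retain_until=date(2027, 6, 1),
)
```

Keeping records like these in a store separate from operational logs, each with its own retention policy, mirrors the separation Jaju recommends.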

“Geopolitical awareness cannot be an afterthought. Due diligence regarding restricted parties, adherence to export regulations, and region-specific access controls are increasingly integral to AI governance, particularly for multinational enterprises,” Gogia emphasized.

Experts suggest that if questioned by a regulatory body or a potential acquirer about their training pipeline, developers should be ready to provide comprehensive documentation without any reservations.
