AI Company Anthropic Accused of Excessive Data Crawling

By: Nathan Published 2024-07-31T00:45:34Z

TapTechNews, July 31: The Financial Times (FT) published a blog post stating that although the AI company Anthropic claims to "develop AI responsibly", it crawls website data excessively with its ClaudeBot crawler to train the Claude large language model.

Although using web crawlers to collect data is common practice in the AI industry, Anthropic has been criticized for how aggressively it does so.

The freelance marketplace Freelancer also said that ClaudeBot accessed its site 3.5 million times in four hours, forcing it to block the crawler. Critics pointed out that Anthropic ignores websites' robots.txt rules and forcibly acquires data, which contradicts its professed "responsible AI" philosophy.


Kyle Wiens, CEO of the repair company iFixit, posted a tweet on July 24, which TapTechNews translated as follows:

@AnthropicAI, I know you are eager to obtain data, and the Claude model is also very smart, but is it really necessary to access our server 1 million times within 24 hours?

This traffic paid us nothing and tied up our development resources, which is really not kind.

Our terms of service clearly prohibit using our content in this way, yet you, @AnthropicAI, quietly did it anyway.

If @AnthropicAI wants to discuss a commercial license for our content, we are willing to talk.
