Cloudflare Blocks Perplexity AI for Stealth Crawling Tactics


Cloudflare Responds to Crawling Violations

Cloudflare has officially removed Perplexity AI from its Verified Bots list and begun blocking its crawlers across websites that use Cloudflare’s services. Cloudflare took this step after reporting that Perplexity’s bots were ignoring robots.txt rules and using deceptive methods to scrape content from websites.

 

Stealth Crawling Tactics Trigger Block

Violation of Crawling Protocols

According to Cloudflare, Perplexity’s crawlers did not follow the standard rules for ethical crawling. Even when robots.txt disallowed them, they still reached the restricted content by:

  • Rotating IP addresses
  • Spoofing user-agent headers (pretending to be a browser like Chrome)
  • Sending requests from unverified or misleading ASNs (autonomous system numbers)

These methods made it hard for site owners and detection systems to identify who was actually operating the crawler.
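To see why a spoofed user-agent matters, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the crawler name "ExampleBot", and the URLs are hypothetical, chosen only for illustration. A rule that names a specific bot stops applying once the same crawler presents itself as a regular browser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked from the whole site,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

# A typical desktop Chrome user-agent string (illustrative).
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if robots.txt allows this user-agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/article"

# The declared crawler is disallowed everywhere.
print(may_fetch(parser, "ExampleBot", url))  # False
# The same request with a browser user-agent falls under the "*" rules.
print(may_fetch(parser, CHROME_UA, url))     # True
```

robots.txt is advisory: it only works when the crawler identifies itself honestly and chooses to obey the answer.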

 

Honeypot Traps Expose Behavior

To confirm its concerns, Cloudflare set up “honeypot” sites whose robots.txt files clearly instructed crawlers not to access them. Perplexity’s crawlers still retrieved the content, which indicated that they were ignoring the crawl directives.
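The honeypot idea can be sketched in a few lines of Python. This is not Cloudflare's actual detection code. The path prefix, the Request record, and the sample data are all assumptions made for illustration. The sketch assumes the honeypot path is disallowed in robots.txt and linked from nowhere, so any client that requests it is treated as having ignored the directive.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical path: disallowed in robots.txt and not linked anywhere.
HONEYPOT_PREFIX = "/honeypot/"


@dataclass(frozen=True)
class Request:
    """One simplified access-log entry (illustrative fields only)."""
    ip: str
    user_agent: str
    path: str


def honeypot_hits(requests: list[Request]) -> list[Request]:
    """Return every request that touched the honeypot path."""
    return [r for r in requests if r.path.startswith(HONEYPOT_PREFIX)]


def hits_by_client(requests: list[Request]) -> Counter:
    """Count honeypot hits per (ip, user_agent) pair."""
    return Counter((r.ip, r.user_agent) for r in honeypot_hits(requests))


# Sample data using documentation-reserved IP addresses.
sample_log = [
    Request("192.0.2.10", "ExampleBot/1.0", "/blog/post-1"),
    Request("198.51.100.7", "Mozilla/5.0 Chrome/120.0", "/honeypot/page-a"),
    Request("198.51.100.8", "Mozilla/5.0 Chrome/120.0", "/honeypot/page-b"),
]

for (ip, user_agent), count in hits_by_client(sample_log).items():
    print(f"{ip} ({user_agent}) requested disallowed content {count} time(s)")
```

In this sample the two offending requests come from different IP addresses with a browser-like user-agent, which is the pattern that makes a rotating, spoofing crawler hard to attribute from a single log line.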

 

Removal from the List of Verified Bots

Cloudflare maintains a list of trusted and transparent bots that follow ethical web crawling practices. Because of the repeated violations, Perplexity was removed from this list and is no longer recognized as a compliant bot. As a result, millions of Cloudflare-protected domains now automatically block its crawling attempts.

 

Cloudflare’s Statement on the Decision

Cloudflare said that it encourages AI and new ideas, but it wants all bots to follow web standards and the wishes of website owners.

“We support new ideas, but we won’t put up with behavior that breaks trust and rules,” Cloudflare said in its official response. “Our goal is to keep the web safe, open, and fair for everyone.”

 

Perplexity AI’s Position Remains Unclear

As of the time of writing, Perplexity has not officially responded to the claims or to the block that Cloudflare put in place. It remains unclear whether these stealth tactics were deliberate, isolated incidents, or part of a broader data-collection strategy.

 

Impact on the Web and AI Ecosystem

This move by Cloudflare sends a clear message: AI companies must adhere to web crawling standards if they want access to public content. The incident may also prompt more companies to review how AI tools interact with their websites and data.

 

Key Takeaway

Website owners should regularly review their server logs and bot activity. Robots.txt, IP filtering, and bot verification are some of the tools that can help keep unauthorized or covert crawling at bay.
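As one example of bot verification, a site can check that a request claiming to come from a known crawler actually originates from that crawler's published IP ranges. The Python sketch below uses the standard ipaddress module. The bot name "examplebot" and the CIDR ranges are hypothetical (the ranges are reserved for documentation); a real deployment would load the ranges each bot operator publishes.

```python
import ipaddress

# Hypothetical mapping of a user-agent token to the operator's published ranges.
PUBLISHED_RANGES = {
    "examplebot": ["192.0.2.0/24", "2001:db8::/32"],
}


def is_verified_bot(user_agent: str, remote_ip: str) -> bool:
    """Return True only if the user-agent names a known bot AND the
    request's IP address falls inside that bot's published ranges."""
    ua = user_agent.lower()
    addr = ipaddress.ip_address(remote_ip)
    for token, cidrs in PUBLISHED_RANGES.items():
        if token in ua:
            return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
    return False  # Unknown user-agent: not a verified bot.


# Declared bot, expected network: verified.
print(is_verified_bot("ExampleBot/1.0", "192.0.2.10"))   # True
# Declared bot, unexpected network: likely an impersonator.
print(is_verified_bot("ExampleBot/1.0", "203.0.113.5"))  # False
# Browser user-agent: never treated as a verified bot, whatever the IP.
print(is_verified_bot("Mozilla/5.0", "192.0.2.10"))      # False
```

This check catches impersonators of a known bot. It does not by itself identify a crawler that hides behind a browser user-agent, which is why log review and honeypot-style signals remain useful alongside it.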
