Cloudflare Blocks Perplexity AI for Stealth Crawling Tactics
Cloudflare Responds to Crawling Violations
Cloudflare has removed Perplexity AI from its Verified Bots list and begun blocking its crawlers across websites that use Cloudflare’s services. Cloudflare took this step after reporting that Perplexity’s bots ignored robots.txt rules and used deceptive methods to scrape content from websites.
Stealth Crawling Tactics Trigger Block
Violation of Crawling Protocols
According to Cloudflare, Perplexity’s crawlers did not follow standard rules for ethical crawling. Even when robots.txt disallowed them, they still reached restricted content by:
- Rotating IP addresses
- Spoofing user-agent headers (pretending to be a browser like Chrome)
- Sending requests from unverified or misleading ASNs (autonomous system numbers)
These methods made it hard for site owners and detection systems to identify who the crawler really was.
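To see why user-agent spoofing undermines robots.txt, it helps to look at how a compliant crawler consults the file. The following is a minimal sketch using Python's standard urllib.robotparser module; the robots.txt content, the bot name "ExampleBot", and the URLs are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleBot" is an illustrative name, not a real crawler.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The declared bot is disallowed everywhere on the site.
print(is_allowed(ROBOTS_TXT, "ExampleBot/1.0", "https://example.com/article"))
# The same request under a generic browser-style agent falls under the "*" rules.
print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", "https://example.com/article"))
```

Because the rules are matched against a self-declared user agent, robots.txt only constrains crawlers that identify themselves honestly and choose to check it.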
Honeypot Traps Expose Behavior
To confirm its concerns, Cloudflare set up “honeypot” sites whose robots.txt files clearly instructed crawlers not to crawl them. Perplexity’s crawlers still accessed the content, which showed they were ignoring crawl directives.
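A site owner could apply the same idea on a small scale by checking access logs for requests to a path that robots.txt disallows for every agent. The sketch below is not Cloudflare's method; it assumes a hypothetical honeypot path prefix and log records that have already been parsed into dictionaries with "ip", "user_agent", and "path" keys.

```python
from collections import defaultdict

# Hypothetical path that robots.txt disallows for all user agents.
HONEYPOT_PREFIX = "/honeypot/"


def honeypot_visitors(records, prefix=HONEYPOT_PREFIX):
    """Map each client IP that requested a honeypot path to the user agents it sent."""
    hits = defaultdict(set)
    for rec in records:
        if rec["path"].startswith(prefix):
            hits[rec["ip"]].add(rec["user_agent"])
    return dict(hits)


# Illustrative records using documentation-reserved IP addresses.
sample_log = [
    {"ip": "192.0.2.10", "user_agent": "Mozilla/5.0", "path": "/honeypot/page1"},
    {"ip": "192.0.2.10", "user_agent": "ExampleBot/1.0", "path": "/honeypot/page2"},
    {"ip": "198.51.100.7", "user_agent": "Mozilla/5.0", "path": "/blog/post"},
]
print(honeypot_visitors(sample_log))
```

Any client that appears in the result requested content that a compliant crawler would have skipped, and an IP that sent several different user agents is a candidate for closer review.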
Removal from the List of Verified Bots
Cloudflare maintains a list of trusted and transparent bots that follow ethical web crawling practices. Because of the repeated violations, Perplexity was delisted from this program and is no longer recognized as a compliant bot. As a result, millions of Cloudflare-protected domains now automatically block its crawling attempts.
Cloudflare’s Statement on the Decision
Cloudflare said that it supports AI and innovation, but expects all bots to follow web standards and respect the wishes of website owners.
“We support new ideas, but we won’t put up with behavior that breaks trust and rules,” Cloudflare said in its official response. “Our goal is to keep the web safe, open, and fair for everyone.”
Perplexity AI’s Position Remains Unclear
At the time of writing, Perplexity has not officially responded to the claims or to the block that Cloudflare put in place. Questions remain about whether these stealth tactics were deliberate, isolated incidents, or part of a broader data-collection strategy.
Impact on the Web and AI Ecosystem
This move by Cloudflare sends a clear message: AI companies must adhere to web crawling standards if they want access to public content. The incident may also prompt more companies to review how AI tools interact with their websites and data.
Key Takeaway
Website owners should regularly review their server logs and bot activity. Robots.txt, IP filtering, and bot verification are some of the tools that can help keep unauthorized or covert crawling at bay.
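One widely used form of bot verification is forward-confirmed reverse DNS: look up the hostname for the requesting IP, check that it belongs to the crawler operator's domain, then resolve that hostname and confirm it maps back to the same IP. The sketch below illustrates that technique under stated assumptions; the domain suffix "crawler.example.com" is hypothetical, the lookup covers IPv4 only, and nothing here is claimed to be how Cloudflare verifies bots.

```python
import socket


def verify_bot_ip(ip, allowed_suffixes, reverse=None, forward=None):
    """Forward-confirmed reverse DNS check for a claimed crawler IP (IPv4 only).

    `reverse` and `forward` are injectable resolvers so the logic can be
    exercised without network access; they default to the socket module.
    """
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse(ip)
        # The hostname must be the operator's domain or a subdomain of it.
        if not any(hostname == s or hostname.endswith("." + s) for s in allowed_suffixes):
            return False
        # The hostname must resolve back to the original IP.
        return ip in forward(hostname)
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) count as unverified.
        return False


# Offline demonstration with fake DNS tables (all names and IPs are illustrative).
fake_reverse = {
    "192.0.2.10": "bot-1.crawler.example.com",
    "192.0.2.20": "bot-1.crawler.example.com",
    "198.51.100.5": "host.other.example.net",
}
fake_forward = {"bot-1.crawler.example.com": ["192.0.2.10"]}

for candidate in fake_reverse:
    ok = verify_bot_ip(candidate, ["crawler.example.com"],
                       fake_reverse.__getitem__, fake_forward.__getitem__)
    print(candidate, ok)
```

A check like this ties a crawler's identity to DNS records its operator controls, which a spoofed user-agent header alone cannot satisfy.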