Some People Are Defending Perplexity After Cloudflare ‘Named and Shamed’ It

When Cloudflare accused AI search engine Perplexity on Monday of stealthily scraping websites while ignoring a site’s specific methods to block it, this wasn’t a clear-cut case of an AI web crawler gone wild.

Many people came to Perplexity’s defense. They argued that Perplexity accessing sites in defiance of the website owner’s wishes, while controversial, is acceptable. And this is a controversy that will surely grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?

Cloudflare is known for providing anti-bot crawling and other web security services to millions of websites. Essentially, Cloudflare’s test case involved setting up a new website with a new domain that had never been crawled by any bot, setting up a robots.txt file that specifically blocked Perplexity’s known AI crawling bots, and then asking Perplexity about the website’s content. And Perplexity answered the question.
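
To make the mechanics of that test concrete, here is a minimal Python sketch of how a robots.txt rule is evaluated. It is an illustration under stated assumptions, not Cloudflare's actual test setup: the robots.txt contents, the example URL, the browser user-agent string, and the helper name is_allowed are all hypothetical, and the crawler names PerplexityBot and Perplexity-User are assumed here to be the declared agents being blocked. It shows that robots.txt is matched against whatever user-agent a client declares, so a request presenting a generic browser identity is not covered by a rule written for a named crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a freshly created test site.
# The crawler names below are assumptions made for illustration.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical URL and a generic desktop-browser user-agent string.
TEST_URL = "https://example.test/secret-page"
GENERIC_BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("PerplexityBot", "Perplexity-User", GENERIC_BROWSER_UA):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, TEST_URL) else "blocked"
        print(f"{agent[:40]:40} {verdict}")
```

Because the file is advisory and keyed on a self-declared name, enforcement against an undeclared client has to happen elsewhere, for example at the network level.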

Cloudflare researchers found the AI search engine used “a generic browser intended to impersonate Google Chrome on macOS” when its web crawler itself was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, “Some supposedly ‘reputable’ AI companies act more like North Korean hackers. Time to name, shame, and hard block them.”

But many people disagreed with Prince’s assessment that this was actual bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare seemed to document was the AI accessing a specific public website when its user asked about that specific website.

“If I as a human request a website, then I should be shown the content,” one person on Hacker News wrote, adding, “why would the LLM accessing the website on my behalf be in a different legal category as my Firefox web browser?”

A Perplexity spokesperson previously denied to TechCrunch that the bots were the company’s and called Cloudflare’s blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its defense (and generally attacking Cloudflare), claiming the behavior was from a third-party service it uses occasionally.

But the crux of Perplexity’s post made an appeal similar to that of its online defenders.

“The difference between automated crawling and user-driven fetching isn’t just technical — it’s about who gets to access information on the open web,” the post said. “This controversy reveals that Cloudflare’s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats.”

Perplexity’s accusations aren’t exactly fair, either. One argument that Prince and Cloudflare used for calling out Perplexity’s methods was that OpenAI doesn’t behave in the same way.

“OpenAI is an example of a leading AI company that follows these best practices. They respect robots.txt and do not try to evade either a robots.txt directive or a network level block. And ChatGPT Agent is signing http requests using the newly proposed open standard Web Bot Auth,” Prince wrote in his post.

Web Bot Auth is a Cloudflare-supported standard being developed by the Internet Engineering Task Force that hopes to create a cryptographic method for identifying AI agent web requests.

The debate comes as bot activity reshapes the internet. As TechCrunch has previously reported, bots seeking to scrape massive amounts of content to train AI models have become a menace, especially to smaller sites.

For the first time in the internet’s history, bot activity is currently outstripping human activity online, with AI traffic accounting for over 50%, according to Imperva’s Bad Bot report released last month. Most of that activity is coming from LLMs. But the report also found that malicious bots now make up 37% of all internet traffic. That’s activity that includes everything from persistent scraping to unauthorized login attempts.

Until LLMs, the internet generally accepted that websites could and should block most bot activity, given how often it was malicious, by using CAPTCHAs and other services (such as Cloudflare). Websites also had a clear incentive to work with specific good actors, such as Googlebot, guiding it on what not to index through robots.txt. Google indexed the internet, which sent traffic to sites.

Now, LLMs are eating an increasing amount of that traffic. Gartner predicts that search engine volume will drop by 25% by 2026. Right now, humans tend to click website links from LLMs at the point they are most valuable to the website, which is when they are ready to conduct a transaction.

But if humans adopt agents as the tech industry predicts they will (to arrange our travel, book our dinner reservations, and shop for us), would websites hurt their business interests by blocking them? The debate on X captured the dilemma perfectly:

“I WANT perplexity to visit any public content on my behalf when I give it a request/task!” wrote one person in response to Cloudflare calling Perplexity out. “What if the site owners don’t want it? they just want you [to] directly visit the home, see their stuff” argued another, pointing out that the site owner who created the content wants the traffic and potential ad revenue, not to let Perplexity take it.

“This is why I can’t see ‘agentic browsing’ really working — much harder problem than people think. Most website owners will just block,” a third predicted.