Delhi News Daily

Technology

Perplexity Might Be Using Illegitimate Means to Scrape Websites’ Data

delhinewsdaily
Last updated: August 5, 2025 9:01 am
Perplexity Is Using Stealth Bots and Breaking Website Directives to Fetch Data, Says Cloudflare


Perplexity is allegedly accessing content from websites that have explicitly prohibited it from doing so. Cloudflare, a global web security services company, ran a test to confirm the stealth behaviour of the answer engine company. Its researchers said that crawler bots from Perplexity were not only ignoring website directives but also actively hiding their identity through multiple means so that website owners could not track the activity. Cloudflare also said it had found a way to block the artificial intelligence (AI) company’s undeclared crawlers.

Cloudflare Catches Perplexity’s Stealth Tactics in Action

In a blog post, the web security platform claimed that Perplexity was involved in “stealth crawling” activities. “We see continued evidence that Perplexity is repeatedly modifying their user agent and changing their source ASNs to hide their crawling activity, as well as ignoring — or sometimes failing even to fetch — robots.txt files,” the post added.

Before delving into Perplexity’s behaviour, it is important to understand how the entire system works. Website owners publish content, and third-party services such as search engines fetch it to index the sites and surface them when a relevant query is typed. Some apps and websites also scrape sites, either to surface the content within their own interface or to collect data with permission.

However, for this relationship between websites and crawlers to work, there must be trust. That trust is established when bots follow a set of rules while crawling: their activity must be transparent, they should serve a clear purpose and perform only that specific activity, and they should follow website directives and preferences. So, if a website blocks a bot, the bot should not crawl that website.

As per Cloudflare researchers, Perplexity is breaking this trust model by using stealth tactics to scrape data even from websites that explicitly block its declared bots, PerplexityBot and Perplexity-User. The researchers confirmed this activity by creating new test domains.

These domains were not indexed by any search engine or made publicly accessible or discoverable. Additionally, the researchers implemented a robots.txt file (a text file used by websites to give instructions to web crawlers) to stop all bots from accessing any part of the website.
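As a rough illustration of what such a directive means for a compliant crawler, the sketch below uses Python's standard `urllib.robotparser` to evaluate a disallow-all robots.txt. The file contents and the `example.test` URL are assumptions for illustration; the article does not reproduce the exact file Cloudflare used.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: a robots.txt that tells every crawler to stay out
# of every path. The real test file is not shown in the article.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.test is a placeholder domain, not one of Cloudflare's.
    for agent in ("PerplexityBot", "Perplexity-User", "Mozilla/5.0"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://example.test/page")
        print(f"{agent}: allowed={allowed}")
```

A crawler that honours robots.txt would run a check like this before each request and skip the fetch when the result is false. Nothing in the protocol enforces that step, which is why compliance depends on the trust described earlier.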

Then, Cloudflare researchers asked Perplexity specific questions about these newly created domains. They found that, although the domains used standard Internet protocols to prohibit crawling, Perplexity was still able to surface detailed information about them.

Cloudflare claims Perplexity’s crawlers take several steps to bypass websites’ directives and access the data. If a declared user agent is denied access via robots.txt, the crawler ignores the directive and continues to scrape data. If a website has implemented a web application firewall (WAF) rule to block the declared bot, the company switches to a generic browser user agent intended to impersonate Google Chrome on macOS.

This undeclared bot is also said to use multiple IP addresses not listed in Perplexity’s official IP range to trick the website. To further hide their tracks, these crawlers were said to rotate between different autonomous system numbers (ASNs). Notably, Cloudflare stated that when these undeclared bots were successfully stopped, the quality of Perplexity’s responses declined, as it began to rely on other data sources to answer the query.
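To see why these tactics are hard to catch with simple rules, consider a minimal sketch of a check that relies only on the user agent string and the source IP. The bot names come from the article; the IP range is a hypothetical placeholder taken from a reserved documentation block, and the `classify` function is illustrative rather than Cloudflare's actual detection logic.

```python
import ipaddress

# Declared bot names as reported in the article.
DECLARED_TOKENS = ("PerplexityBot", "Perplexity-User")

# Hypothetical placeholder range (an RFC 5737 documentation block),
# NOT Perplexity's real published IP range.
PUBLISHED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def classify(user_agent: str, remote_ip: str) -> str:
    """Label a request using only its user agent and source address."""
    ua = user_agent.lower()
    declared = any(token.lower() in ua for token in DECLARED_TOKENS)
    ip = ipaddress.ip_address(remote_ip)
    in_range = any(ip in network for network in PUBLISHED_RANGES)

    if declared and in_range:
        return "declared-crawler"
    if declared:
        # Uses a declared name but comes from an unlisted address.
        return "possible-spoof"
    # A generic browser string from an unlisted IP looks like any visitor.
    return "unidentified"


if __name__ == "__main__":
    chrome_ua = (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    )
    print(classify("Mozilla/5.0 (compatible; PerplexityBot/1.0)", "192.0.2.10"))
    print(classify(chrome_ua, "198.51.100.7"))
```

Under these assumptions, a request that presents a generic Chrome user agent from an unlisted address falls into the "unidentified" bucket, indistinguishable from an ordinary visitor. Catching it requires signals beyond the user agent and IP, which the article attributes to Cloudflare's bot management system without detailing them.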

Cloudflare said its bot management system was able to detect all the undeclared crawling activity from Perplexity’s hidden user agents and now automatically protects all of its bot management customers. Additionally, the company has added signature matches for the stealth crawler to its managed rule that blocks AI crawling activity. The rule is available to all Cloudflare users, including those on the free tier.



© Delhi News Daily Network.