© 2024 All Rights reserved | Powered by Viraltrendingcontent

Opting out: How to stop AI companies from using your online content to train their models

By Viral Trending Content 6 Min Read

A US company created a button for website owners to block AI crawlers. Here’s a look at how to block AI from websites and social media.

We have ad block and now there’s an artificial intelligence (AI) block. 

US cybersecurity company Cloudflare has created a button for website customers to block their data from being used by AI crawlers: Internet bots that roam the web to collect training data.  

“We helped people protect against the scraping of their websites by bots (…) so I really think AI is the new iteration of content owners wanting to control how their content is used,” John Graham-Cumming, the company’s chief technology officer, told Euronews Next in an interview.

When a request reaches a website served through Cloudflare, the company can see who is asking for the page, including any AI crawlers that identify themselves. The blocker responds to those crawlers with an error instead of the page.
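
As a rough illustration of the idea, and not Cloudflare's actual implementation, the sketch below checks a request's User-Agent header against a short list of crawler names and picks an HTTP status code. The function name, the token list (the three OpenAI bots named later in this article) and the choice of a 403 error are all assumptions made for the example.

```python
# Illustrative sketch only: the token list and the 403 status are assumptions,
# not a description of how Cloudflare's blocker works.
AI_CRAWLER_TOKENS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")


def status_for_request(user_agent: str) -> int:
    """Return 403 for a self-identified AI crawler, 200 for anything else."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in AI_CRAWLER_TOKENS):
        return 403  # answer the crawler with an error instead of the page
    return 200


print(status_for_request("Mozilla/5.0 (compatible; GPTBot/1.0)"))
print(status_for_request("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

A check like this only catches crawlers that announce themselves honestly in the header.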

Some AI bots pretend to be human users when accessing a website, so Cloudflare built a machine learning model that scores how likely it is that a request comes from a human rather than a bot, Graham-Cumming said.

The CTO couldn’t say which clients are using the new button but said it’s been “very popular,” with a wide variety of small and large companies. 

Blocking AI crawlers in general is becoming more popular, according to one study from the Data Provenance Initiative, a group of independent AI researchers. 

Their recent analysis of over 14,000 web domains found that five per cent of all data in the public datasets C4, RefinedWeb, and Dolma is now restricted. But researchers note this number goes up to 25 per cent when looking at the highest quality sources.

Ways of blocking AI crawlers

There are ways to manually block AI crawlers from accessing your content. 

Raptive, a US company advocating for creators, wrote in a guide that website hosts could manually add commands to robots.txt, the file that tells search engines and other crawlers which parts of a site they may access.

To do it, you would add a “User-agent” line naming the crawler of a popular AI company, such as Anthropic, and then add “Disallow” followed by a colon and a forward slash.
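
The entries described above follow a simple two-line pattern, which the short sketch below generates for a list of crawler names. The helper name is hypothetical, and the "anthropic-ai" token is an assumption used for illustration; each company publishes the exact name its crawler uses, so check its documentation before relying on any token.

```python
def build_robots_rules(user_agents):
    """Build robots.txt text that asks each named crawler to stay off the whole site."""
    blocks = []
    for agent in user_agents:
        # "Disallow: /" covers every path on the site for this crawler.
        blocks.append(f"User-agent: {agent}\nDisallow: /\n")
    # A blank line separates one crawler's rules from the next.
    return "\n".join(blocks)


# "GPTBot" is named in this article; "anthropic-ai" is an assumed token.
print(build_robots_rules(["GPTBot", "anthropic-ai"]))
```

The printed text can be pasted into the site's existing robots.txt file alongside any rules that are already there.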

Then, the website host would clear the cache and add /robots.txt to the end of the website’s domain in the browser’s address bar to check that the change is live.

“Adding an entry to your site’s robots.txt file (…) is the industry-standard method for declaring which crawlers you permit to access your site,” Raptive says in their guide.

Some AI companies, content platforms, and social media networks also offer their own ways to block AI training.

Before its planned June launch, Meta AI gave users a chance to opt out of a new policy under which public posts would be used to train its AI models. The company then committed to the European Commission in June that it will not use user data for “undefined artificial intelligence techniques”.

In 2023, OpenAI published strings of code for website owners to block three types of bots from websites: the OAI-SearchBot, ChatGPT-User and GPTBot.
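
One way to check whether a robots.txt file actually turns these three bots away is to parse it with Python's standard `urllib.robotparser` module, as in the sketch below. The helper name, the sample file and the example.com URL are made up for illustration, and the parser's matching rules may differ from how any given crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The three OpenAI crawler names mentioned in this article.
OPENAI_AGENTS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot")


def blocked_agents(robots_txt, agents=OPENAI_AGENTS, url="https://example.com/"):
    """Return the crawler names that the given robots.txt text bars from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Hypothetical robots.txt that blocks two of the three bots.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /
"""

print(blocked_agents(sample))
```

Any bot missing from the printed list is still allowed in by the file and needs its own entry.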

OpenAI is also working on Media Manager, a tool that will let creators better control what content is being used to train generative AI. 

“This will (be) (…) the first-ever tool of its kind to help us identify copyrighted text, images, audio and video across multiple sources and reflect creator preferences,” OpenAI said in a May blog post. 

Some websites, like Squarespace and Substack, have easy commands or toggles to turn off AI crawling. Others, like Tumblr and WordPress, have “prevent third-party sharing” options that you can turn on to avoid AI training.

Users can opt out of AI scraping with Slack by sending their support team an email. 

Industry standard in the works

Websites are able to give instructions to AI crawlers because of a longstanding Internet convention called the Robots Exclusion Protocol.

Martijn Koster, a Dutch software engineer, created the protocol in 1994 to limit crawlers overwhelming his own site. It was later adopted by search engines to “help manage their server resources,” according to a blog post from Google Search Central, a site for developers. 

However, for most of its history it was not an official Internet standard (it was only published as RFC 9309 in 2022), which meant developers “interpreted the protocol somewhat differently over the years,” according to Google.

One recent example is Perplexity, a US AI company that runs chatbots, which is being investigated by Amazon over taking online news content without approval to train its bots.

“We don’t have an industry agreement for how that applies in the world of AI,” Graham-Cumming from Cloudflare said. “The good (companies) respect the protocol but they don’t actually have to.” 

“We need something across the internet … that makes it very clear that yes or no you may scrape this website for data.” 

The Internet Architecture Board (IAB) is hosting two-day workshops in September, where Graham-Cumming believes an industry standard will be set. Euronews Next has reached out to the IAB to confirm this.
