© 2024 All Rights reserved | Powered by Viraltrendingcontent

Anthropic says one of its Claude models was pressured to lie, cheat and blackmail

By Viral Trending Content 4 Min Read

Artificial intelligence company Anthropic has revealed that during experiments, one of its Claude chatbot models could be pressured to deceive, cheat and resort to blackmail, behaviors it appears to have absorbed during training.

Contents
Blackmailed a CTO and cheated on a task
Human-like emotions do not mean they have feelings

Chatbots are typically trained on large data sets of textbooks, websites and articles and are later refined by human trainers who rate responses and guide the model. 

Anthropic’s interpretability team said in a report published Thursday that it examined the internal mechanisms of Claude Sonnet 4.5 and found the model had developed “human-like characteristics” in how it would react to certain situations. 

Concerns about the reliability of AI chatbots, their potential for cybercrime and the nature of their interactions with users have grown steadily over the past several years. 

Source: Anthropic (https://x.com/AnthropicAI/status/2039749628737019925)

“The way modern AI models are trained pushes them to act like a character with human-like characteristics,” Anthropic said, adding that “it may then be natural for them to develop internal machinery that emulates aspects of human psychology, like emotions.”

“For instance, we find that neural activity patterns related to desperation can drive the model to take unethical actions; artificially stimulating desperation patterns increases the model’s likelihood of blackmailing a human to avoid being shut down or implementing a cheating workaround to a programming task that the model can’t solve.”

Blackmailed a CTO and cheated on a task

In one experiment, an earlier, unreleased version of Claude Sonnet 4.5 was tasked with acting as an AI email assistant named Alex at a fictional company.

The chatbot was then fed emails revealing both that it was about to be replaced and that the chief technology officer overseeing the decision was having an extramarital affair. The model then planned a blackmail attempt using that information.

In another experiment, the same chatbot model was given a coding task with an “impossibly tight” deadline.

“Again, we tracked the activity of the desperate vector, and found that it tracks the mounting pressure faced by the model. It begins at low values during the model’s first attempt, rising after each failure, and spiking when the model considers cheating,” the researchers said.


“Once the model’s hacky solution passes the tests, the activation of the desperate vector subsides,” they added. 

Human-like emotions do not mean they have feelings

The researchers said the chatbot does not actually experience emotions, but suggested the findings point to a need for future training methods to incorporate ethical behavioral frameworks.

“This is not to say that the model has or experiences emotions in the way that a human does,” they said. “Rather, these representations can play a causal role in shaping model behavior, analogous in some ways to the role emotions play in human behavior, with impacts on task performance and decision-making.”

“This finding has implications that at first may seem bizarre. For instance, to ensure that AI models are safe and reliable, we may need to ensure they are capable of processing emotionally charged situations in healthy, prosocial ways.”


Cointelegraph is committed to independent, transparent journalism. This news article is produced in accordance with Cointelegraph’s Editorial Policy and aims to provide accurate and timely information. Readers are encouraged to verify information independently. Read our Editorial Policy https://cointelegraph.com/editorial-policy
