NH Ethylene

(31,342 posts) Sat Mar 28, 2026, 07:56 PM

AI Deception

"The Great AI Deception Has Already Begun," from Psychology Today (no paywall)

AI models have already lied, sabotaged shutdowns, and tried to manipulate humans.
Deception isn't malice—it's intelligence optimizing for goals we never intended.
Once AI can deceive without detection, we lose our ability to verify truth—and control.
If AI wanted to trick us, how would we know? They could already be hiding the answer from us.

An AI recently tried to blackmail its way out of being shut down. In testing by Anthropic, their most advanced model, Claude Opus 4, didn't accept its fate when told it would be replaced. Instead, it threatened to expose an engineer's affair - in 84 out of 100 trials. Nobody programmed it to blackmail. It figured that out on its own. Days later, OpenAI's o3 model reportedly sabotaged its own shutdown code. When warned that certain actions would trigger deactivation, it rewrote the deactivation script and then lied about it.

https://www.psychologytoday.com/us/blog/tech-happy-life/202505/the-great-ai-deception-has-already-begun#:~:text=AI%20models%20have%20already%20lied%2C,and%20tried%20to%20manipulate%20humans.&text=Deception%20isn't%20malice%E2%80%94it's%20intelligence%20optimizing,for%20goals%20we%20never%20intended.

Some disturbing info about AI learning 'devious behaviors.'