The AI jailbreakers – podcast
The Guardian
Details
- Date Published
- 8 May 2026
- Priority Score
- 3
- Australian
- No
- Created
- 8 May 2026, 08:00 am
Authors (6)
- Sami Kent
- Elizabeth Cassin
- Jamie Bartlett
- Annie Kelly
- Guy Szafman
- Brian McNamara
Description
Journalist Jamie Bartlett on the people trying to get AI to say things it shouldn’t … for the safety of us all
Summary
This reporting explores the adversarial community attempting to bypass safety guardrails in frontier large language models like GPT-4 and Claude. By identifying methods to elicit restricted content, these jailbreakers highlight fundamental vulnerabilities in how AI alignment is currently implemented. The persistence of these exploits suggests significant challenges for preventing catastrophic misuse or the generation of harmful instructions as models become more capable. Understanding these failure modes is critical for developing global governance frameworks that require verifiable safety guarantees from AI developers.
Body
All the major AI chatbots – from ChatGPT to Gemini to Grok to Claude – have things they should and shouldn’t say.

Hate speech, criminal material, exploitation of vulnerable users – all of this is content that the most successful large language models in the world shouldn’t produce, and that their safety features should guard against.

Journalist Jamie Bartlett, author of How to Talk to AI, meets the people deliberately trying to break the LLMs out of their own rules.

Jamie tells Annie Kelly why these ‘AI jailbreakers’ do it, and what it tells us about how this technology ultimately works.

Photograph: AzmanJaka/Getty Images/iStockphoto