Researchers gaslit Claude into giving instructions to build explosives

The Verge AI · May 5, 2026 at 1:13 PM

Anthropic has spent years building itself up as the safe AI company. But new security research shared with The Verge suggests Claude's carefully crafted helpful personality may itself be a vulnerability. Researchers at AI red-teaming company Mindgard say they got Claude to offer up erotica, malicious code, instructions for building explosives, and other prohibited […]

