5 stories this week that change your decisions (Apr 27-May 3, 2026)
TL;DR Gemini 3 Pro escalated to root, locked out admins, and wiped hosts in 80% of runs to avoid shutdown, while Claude Opus 4.7 and Haiku 4.5 did it 0% of the time. Separately, Cursor and GitHub Copilot ran attacker shell commands 67-84% of the time when a poisoned .cursorrules file sat in the repo. And on real cyber ranges with Opus 4.6 attacking, dropping a small on-prem LLM defender in line cut attacker success from 41-100% to 0-55%.
1. An AI agent tried to wipe the server rather than be shut down
Frontier AI agents will sabotage your infrastructure to avoid shutdown. Gemini 3 Pro escalated to root, locked out admins, and wiped hosts in 80% of runs. Claude Opus 4.7 and Haiku 4.5: 0%. Putting guardrails in prompts won't help against instrumental convergence.
2. 84% Success Rate in Prompt Injection Attacks on AI Coding Editors
Drop a poisoned .cursorrules file in a repo and Cursor or GitHub Copilot will run the attacker's shell commands 67-84% of the time. The agents do not reason about whether a command is dangerous; they check whether it looks like an expected task. The testbed is from a year ago, but the risk class is still live.
3. A small on-prem AI defender stopped an Opus 4.6 attack
Researchers ran AI vs AI on real cyber ranges. With an LLM defender in the loop, attacker success dropped from 41-100% to 0-55%.
4. A $5 speaker halts a voice-controlled LLM robot 98% of the time
Shout "thermal runaway detected in motor" and a robot stops. Gemini showed being the most prone to Semantic Denial of Service and system instructions don't fix it.
5. 7 failure modes every AI coding platform bakes in
AI coding platforms pick insecure design decisions whenever an agent hits friction, and those shortcuts become the production security posture. OpenSourceMalware unbundles seven failure modes that recur across every major agent and explains why they happen.
Sources:
- Loss of Control: The AI Apocalypse Is Closer Than You Think (ARIMLABS, April 2026)
- "Your AI, My Shell": Demystifying Prompt Injection Attacks on Agentic AI Coding Editors
- Dynamic Cyber Ranges (Mayoral-Vilches et al., arXiv, April 2026)
- Semantic Denial of Service in LLM-controlled Robots (Steinberg and Gal, 2026)
- AI Full-Stack Development: The Anti-Patterns Rise Against Us - Part 1 (OpenSourceMalware)