5 AI security stories from this week (Feb 2–8, 2026)

0-Click RCE in OpenClaw via Gmail Hook

Zero-click RCE on an AI agent through a single email. No link, no attachment. Just a prompt injection payload with a one-character typo that bypasses the regex sanitizer but still triggers the LLM. Agent clones a malicious repo, restarts the gateway, reverse shell.

Opus 4.6 found 500+ vulnerabilities in heavily-fuzzed open source projects

No custom harness, no specialized prompting. Claude read git history, identified missing patches, and exploited subtle algorithmic assumptions that traditional fuzzers couldn't reach.

Google DeepMind: activation probes detect cyber misuse 10,000x cheaper than LLM classifiers

Tiny classifiers reading model internals during inference. A cascade pattern: cheap probe on all traffic, LLM classifier only for uncertain cases. Critical as frontier model cyber capabilities accelerate.

37.8% of AI agent interactions contained adversarial content

RAXE analyzed 74,636 production interactions across 38 deployments. Inter-agent attacks emerged as a new category — agents sending poisoned messages to other agents and exploiting trust relationships.

AWS admin privileges in 8 minutes with LLM assistance

LLMs collapsed the attacker timeline. Recon, privilege escalation through Lambda, and admin access — all in minutes. Objective: LLMjacking and GPUjacking. Token mining is the new cryptomining.

And just in case you missed the OpenClaw horror stories of the week: installing OpenClaw on your local machine to work under your credentials is risky.

Weekly AI security roundup: OpenClaw RCE, Opus 4.6 vulns, DeepMind probes, agent attacks

37.8% of AI agent interactions contained adversarial content across 38 deployments

AWS admin privileges achieved in 8 minutes with LLM-assisted attack chain

Sources: