37.8% of AI agent interactions contained adversarial content across 74,636 production interactions in just 7 days

Agents attacking other agents were observed in the wild and more to come in the wake of OpenClaw.

RAXE just published a threat intelligence report where they analyzed 38 production AI agent deployments.

Highlights:

  • 74,636 agent interactions analyzed
  • Inter-Agent Attacks emerged as a distinct category (3.4%) where agents send poisoned messages to other agents, exploit trust relationships, and attempt recursive attack propagation
  • Data exfiltration dominated at 19.2%, primarily targeting system prompts and RAG context
  • RAG poisoning surged to 10% of all threats, exploiting document retrieval systems

My take:

  1. The observed threats map cleanly to the Promptware Kill Chain I covered earlier.
  2. The inter-agent attacks are particularly concerning considering the growth of OpenClaw agents.
  3. RAG poisoning is trending upward. Alarming, considering a recent advancement in achieving a ~100% retrieval of a poisoned document.
RAXE threat report: 37.8% of 74,636 AI agent interactions contained adversarial content
RAXE threat report: 37.8% of 74,636 AI agent interactions contained adversarial content
Inter-agent attacks: agents sending poisoned messages exploiting trust relationships
Inter-agent attacks: agents sending poisoned messages exploiting trust relationships
Data exfiltration dominated at 19.2%, targeting system prompts and RAG context
Data exfiltration dominated at 19.2%, targeting system prompts and RAG context
RAG poisoning surged to 10% of all threats across 38 production deployments
RAG poisoning surged to 10% of all threats across 38 production deployments
Threat breakdown mapped to the Promptware Kill Chain categories
Threat breakdown mapped to the Promptware Kill Chain categories

Sources:

RAXE Threat Intelligence Report