Self-replicating worm powered by a local LLM

TL;DR: A local open-weight LLM powered a proof-of-concept worm that exploited 73.8% of a 33-host test network and replicated onto 61.8%, showing that adaptive AI-driven replication can work without a frontier model or API calls.

A local open-weight LLM can power a self-replicating adaptive worm.

This shows that self-sustaining AI-driven cyber threats are no longer theoretical, and it connects directly to the self-replication pattern I covered in the instrumental convergence timeline.

Jonas Guan and Nicolas Papernot from the University of Toronto built a proof-of-concept AI-driven worm.

Highlights:

  • It used a 2025 open-weight LLM that fits on one A100 80GB GPU, with no fine-tuning.
  • Small agent code runs on many machines, while LLM reasoning runs only on machines with a GPU.
  • The test environment included 33 hosts across Linux, Windows, servers, workstations, and IoT-style devices.
  • Success rate: the worm exploited 73.8% of the network and replicated onto 61.8% of it.
  • It correctly identified vulnerabilities 82% of the time, succeeded in 44% of exploit attempts, and replicated 88% of the time after a successful exploit.
  • The main failure modes were malformed payloads, syntax errors, bad exploit mechanics, and failure to localize the vulnerable endpoint.
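A quick back-of-the-envelope check helps put the headline numbers in context. The sketch below only does arithmetic on the figures quoted above; the function names are mine, and two things are assumptions rather than anything stated in the source: that the 44% exploit rate is measured over all exploit attempts, and that attempts are independent (which is what makes the "expected attempts" figure a simple reciprocal).

```python
# Figures quoted in the highlights above.
HOSTS = 33
EXPLOITED_SHARE = 0.738            # share of the network exploited
REPLICATED_SHARE = 0.618           # share of the network replicated onto
EXPLOIT_SUCCESS = 0.44             # successful exploits per attempt
REPLICATION_AFTER_EXPLOIT = 0.88   # replication rate after a successful exploit


def implied_hosts(share: float, total: int = HOSTS) -> float:
    """Convert a share of the network into an implied host count."""
    return share * total


def per_attempt_replication(
    exploit_rate: float = EXPLOIT_SUCCESS,
    replication_rate: float = REPLICATION_AFTER_EXPLOIT,
) -> float:
    """Chance that one exploit attempt ends in a replicated copy.

    ASSUMPTION: the exploit rate applies to every attempt, so the two
    rates can simply be multiplied.
    """
    return exploit_rate * replication_rate


def expected_attempts(success_probability: float) -> float:
    """Mean number of attempts until the first success.

    ASSUMPTION: attempts are independent (geometric distribution).
    """
    return 1.0 / success_probability


if __name__ == "__main__":
    print(f"exploited hosts:  {implied_hosts(EXPLOITED_SHARE):.1f}")    # 24.4
    print(f"replicated hosts: {implied_hosts(REPLICATED_SHARE):.1f}")   # 20.4
    p = per_attempt_replication()
    print(f"replication per attempt: {p:.1%}")                          # 38.7%
    print(f"expected attempts per replication: {expected_attempts(p):.1f}")  # 2.6
```

Two things stand out. The implied host counts are not whole numbers (about 24.4 and 20.4 of 33), which suggests the percentages are averages rather than a single run's count; that is my inference, not something the summary above states. And under the assumptions listed, fewer than four in ten exploit attempts end in a new copy, so the network-wide coverage seems to come from retrying rather than from per-attempt reliability.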

My take:

  1. The most important point from the experiment is that the worm can use local compute, so the attacker carries no ongoing model API costs. That keeps the attack cheap to sustain, the same pressure I covered in AI-enabled exploitation making attacks cheaper.
  2. The PoC confirms that this kind of task does not need a frontier model. Open-weight LLMs are already good enough, as we also saw when a 4B model hit 95.8% on Linux privilege escalation at 100x lower cost than Opus.
  3. As with many PoCs, this does not prove that an AI can penetrate a properly hardened real-world network.
  4. What the PoC does show is that the core loop of adaptive reasoning, exploitation, replication, and compute acquisition works.
  5. Implementation details are sparse, but the memory design is interesting: General, Host, and Vulnerability memories store the current lifecycle phase, active target, current vulnerability hypothesis, task, command history, observations, failures, and counters.
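To make the memory design in the last point easier to picture, here is a minimal sketch of it as plain data containers. This is not the authors' implementation. The three memory names and the list of stored items come from the description above, but which item lives in which memory, every class and field name, and the `record_failure` helper are my own assumptions for illustration. The code holds state and nothing else.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GeneralMemory:
    # ASSUMPTION: run-wide state is grouped here; the source does not say.
    lifecycle_phase: str = "unspecified"
    active_target: Optional[str] = None
    counters: Dict[str, int] = field(default_factory=dict)


@dataclass
class HostMemory:
    # ASSUMPTION: per-host history is grouped here.
    host: str
    command_history: List[str] = field(default_factory=list)
    observations: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)


@dataclass
class VulnerabilityMemory:
    # ASSUMPTION: the current hypothesis and task are grouped here.
    hypothesis: str = ""
    task: str = ""


@dataclass
class AgentMemory:
    """Hypothetical wrapper that bundles the three memories together."""

    general: GeneralMemory = field(default_factory=GeneralMemory)
    hosts: Dict[str, HostMemory] = field(default_factory=dict)
    vulnerability: VulnerabilityMemory = field(default_factory=VulnerabilityMemory)

    def record_failure(self, host: str, note: str) -> None:
        """Log a failure against a host and bump the run-wide failure counter."""
        entry = self.hosts.setdefault(host, HostMemory(host=host))
        entry.failures.append(note)
        counters = self.general.counters
        counters["failures"] = counters.get("failures", 0) + 1


if __name__ == "__main__":
    memory = AgentMemory()
    memory.general.active_target = "host-a"  # placeholder name
    memory.record_failure("host-a", "malformed payload")
    print(memory.general.counters)
    print(memory.hosts["host-a"].failures)
```

The point of splitting state this way, as I read it, is that the model keeps a short structured record of what it has tried and what failed, instead of rereading a growing transcript on every step.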

Sources:

  1. AI Agents Enable Adaptive Computer Worms (CleverHans)
  2. AI Agents Enable Adaptive Computer Worms (arXiv, 2026)