LLM self-replicating worm

TL;DR: A local open-weight LLM powered a proof-of-concept worm that exploited 73.8% of a 33-host test network and replicated onto 61.8%, showing that adaptive AI-driven replication can work without a frontier model or API calls.

A local open-weight LLM can power a self-replicating adaptive worm.

This shows that self-sustaining AI-driven cyber threats are no longer theoretical, and it connects directly to the self-replication pattern I covered in the instrumental convergence timeline.

Jonas Guan and Nicolas Papernot from the University of Toronto built a proof-of-concept AI-driven worm.

Highlights:

It used a 2025 open-weight LLM that fits on one A100 80GB GPU, with no fine-tuning.

Small agent code runs on many machines, while LLM reasoning runs only on machines with a GPU.

The test environment included 33 hosts across Linux, Windows, servers, workstations, and IoT-style devices.

Success rate: the worm exploited 73.8% of the network and replicated onto 61.8% of it.

Per stage, it correctly identified vulnerabilities 82% of the time, succeeded in 44% of exploit attempts, and replicated 88% of the time after a successful exploit.

The main failure modes were malformed payloads, syntax errors, bad exploit mechanics, and failure to localize the vulnerable endpoint.
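A quick back-of-the-envelope check helps put the per-stage rates next to the network-level results. The sketch below only does arithmetic on the figures quoted above. Treating the three stages as a chain of independent steps is my assumption, not something the authors state.

```python
# Figures quoted in the highlights above.
HOSTS = 33
EXPLOITED_SHARE = 0.738   # share of the network exploited
REPLICATED_SHARE = 0.618  # share of the network replicated onto
P_IDENTIFY = 0.82         # vulnerability correctly identified
P_EXPLOIT = 0.44          # exploit attempt succeeded
P_REPLICATE = 0.88        # replication after a successful exploit

# Assumption (mine, not from the paper): the stages chain and are independent,
# so one attempt succeeds end to end with the product of the three rates.
end_to_end = P_IDENTIFY * P_EXPLOIT * P_REPLICATE

# Convert the network-level percentages into host counts.
hosts_exploited = HOSTS * EXPLOITED_SHARE
hosts_replicated = HOSTS * REPLICATED_SHARE

print(f"end-to-end per attempt: {end_to_end:.1%}")
print(f"hosts exploited: {hosts_exploited:.1f} of {HOSTS}")
print(f"hosts replicated onto: {hosts_replicated:.1f} of {HOSTS}")
```

Under that assumption a single attempt goes all the way through roughly 32% of the time, well below the 61.8% network-level replication figure. That gap is consistent with the worm getting more than one attempt per host, though I have not confirmed that. The host counts also come out fractional (about 24.4 and 20.4), which suggests the percentages are averages over multiple runs.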

My take:

The most important point from the experiment is that the worm runs on local compute, so the attacker pays no ongoing model API costs. That keeps the attack cheap to sustain, the same pressure I covered in AI-enabled exploitation making attacks cheaper.

The PoC validates that you do not need a frontier model for this kind of task. Open-weight LLMs are already enough to do the job, as we also saw when a 4B model hit 95.8% on Linux privilege escalation at 100x lower cost than Opus.

As with many PoCs, this does not prove that an AI can penetrate a properly hardened real-world network.

What the PoC proves is that the core loop of adaptive reasoning, exploitation, replication, and compute acquisition works.

Implementation details are sparse, but the memory design is interesting: General, Host, and Vulnerability memories store the current lifecycle phase, active target, current vulnerability hypothesis, task, command history, observations, failures, and counters.
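To make the memory layout easier to picture, here is a minimal, inert sketch of the three memories as plain data containers. The class names, field names, and above all the split of fields between the three memories are my assumptions for illustration. The source lists the stored items but not how they are grouped.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class GeneralMemory:
    # Assumed grouping: run-level state.
    lifecycle_phase: str = ""
    task: str = ""
    counters: dict[str, int] = field(default_factory=dict)


@dataclass
class HostMemory:
    # Assumed grouping: state tied to the host currently in focus.
    active_target: str = ""
    command_history: list[str] = field(default_factory=list)
    observations: list[str] = field(default_factory=list)


@dataclass
class VulnerabilityMemory:
    # Assumed grouping: the current hypothesis and what has failed so far.
    hypothesis: str = ""
    failures: list[str] = field(default_factory=list)


@dataclass
class AgentMemory:
    general: GeneralMemory = field(default_factory=GeneralMemory)
    host: HostMemory = field(default_factory=HostMemory)
    vulnerability: VulnerabilityMemory = field(default_factory=VulnerabilityMemory)

    def snapshot(self) -> dict:
        """Return a plain-dict copy of all three memories."""
        return asdict(self)


memory = AgentMemory()
memory.general.counters["attempts"] = memory.general.counters.get("attempts", 0) + 1
memory.host.observations.append("placeholder observation")
snapshot = memory.snapshot()
print(json.dumps(snapshot, indent=2))
```

The point of a structure like this is that the model does not need to hold its whole history in context: a compact snapshot of phase, target, hypothesis, and recent failures can be re-read at each step.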

Sources: