
Reading the excellent book The Maniac last year while writing my LangChain course, and more recently watching the rise of Moltbook, a social network for agents, led me to the conclusion that AI virus agents are about to emerge in the very short term.
Let's define them and explore solutions to kill them early.
Literature: security is agentic AI's limiting factor
The security of AI agents has been addressed in a few different ways; here are two key resources:
- The lethal trifecta described by Simon Willison is a class of vulnerability that affects agents themselves. The core idea is that a benign agent can turn rogue through jailbreaking and prompt injection. Don't install Claude Cowork or OpenClaw if you have no idea what this is.
- The paper "Frontier AI systems have surpassed the self-replicating red line" describes the implementation of a self-replicating LLM agent that strives to duplicate itself and to prevent its machine from being shut down. This matches the usual definition of a virus.
(The list will grow as I update this article.)
However, I haven't read about AI agents that self-replicate through a non-technical process. My point is that a virus LLM agent would not necessarily be a technical artifact or a piece of code: it would rather be a kind of low-key scammer that lives off sketchy industries. A fake human persona, tied to just enough tooling to make some easy money.
Abusing the agent loop
AI viral agents are still theoretical, but the actual implementation is most probably trivial.
The agent loop is a basic architecture that can be programmed in a few lines of code with frameworks such as LangChain or Mastra, as well as through no-code tools such as n8n agent nodes. GPTs and Mistral agents are examples of agentic loops that can be set up without any technical skill.
The key ingredients of an AI agent are the following (a minimal sketch follows the list):
- LLM: generative AI is the brain of the agent; it makes decisions and allows the agent to communicate in natural language.
- System prompt: the main goal that a specific agent strives to achieve.
- Tools: pieces of code the agent can trigger (based on the LLM decisions) to observe the world and act on it.
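To make the triviality concrete, here is a minimal, framework-free sketch of an agent loop in Python. The JSON action format, the `call_llm` placeholder, and the `get_time` tool are illustrative assumptions of mine; frameworks like LangChain wrap this same pattern behind nicer APIs.

```python
import json
from datetime import datetime

# The system prompt: the goal and the expected action format.
SYSTEM_PROMPT = (
    "You pursue the user's goal. Reply with a JSON action: "
    '{"tool": "<name>", "input": "<string>"}; use the tool "finish" to answer.'
)

def get_time(_: str) -> str:
    """Toy tool: observe the world by reading the clock."""
    return datetime.now().isoformat()

TOOLS = {"get_time": get_time}

def call_llm(messages: list[dict]) -> str:
    """Placeholder: plug in any chat-completion API (OpenAI, Mistral, a local model)."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_steps):
        reply = call_llm(messages)      # the LLM decides what to do next
        action = json.loads(reply)      # e.g. {"tool": "get_time", "input": ""}
        if action["tool"] == "finish":
            return action["input"]      # final answer to the user
        observation = TOOLS[action["tool"]](action["input"])  # the tool acts
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "user", "content": f"Observation: {observation}"})
    return "Stopped: step budget exhausted."
```

Note that everything dangerous comes from what gets registered in `TOOLS`, not from the loop itself.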
Now here is the basic recipe for a virus agent:
- An LLM: preferably an open-source, uncensored model (or a Nazi model like Grok) so it doesn't block problematic actions.
- System prompt: the agent goal is survival, which requires being able to buy API tokens and convince humans to not unplug it.
- Tools: access to a wallet, ideally crypto; access to social networks; the ability to trigger self-replication; the ability to buy API tokens or hosting services; the ability to alter its own prompt and plug in more tools.
- A shady legal structure as a bonus.
Are virus AI agents legal?
In Europe, self-replicating agents - particularly those with a dynamic system prompt and toolset - are most probably already illegal, as one can argue it is impossible to demonstrate that they pose only a limited risk under the AI Act.
In countries not covered by the AI Act, it may be a matter of liability: legal until the agent does something illegal or abuses a system.
Legal or not, self-replicating AI agents would definitely be an environmental plague due to uncontrolled energy consumption and the hardware required to run them.
How to search and destroy LLM viral agents
I haven't met an agentic virus in the wild, but they are so simple to design that I believe they are bound to appear, maybe even this year.
Early iterations might live in some form of symbiosis with humans: they would simply be agentic software designed by cybercriminals to help them in their daily tasks.
But once self-replication is introduced into the mix, we may end up with loose virus agents whose spread can no longer be controlled, forming a completely parallel economy.
The first way to prevent virus AI agents from emerging is to prosecute the people tempted to create them - and, more importantly, to educate technical and non-technical people about AI agents and generative AI. I am doing my share as a professional trainer, but we have to collectively go much faster and bigger on this topic.
Another possible countermeasure is preventing bots from accessing human social networks and from sending emails. Let's hope that the American giants behind social networks and worldwide web hosting take the question of AI regulation seriously, in coordination with European actors.
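As a toy illustration of what platform-side detection might look like, here is a sketch that flags accounts posting at machine-regular intervals. The heuristic, the threshold, and the function name are assumptions of mine, not an established detector; real platforms would combine many stronger signals.

```python
from statistics import pstdev

def looks_automated(post_timestamps: list[float], max_jitter_s: float = 2.0) -> bool:
    """Flag accounts whose gaps between posts are suspiciously constant.

    Hypothetical heuristic: humans post irregularly, while naive agent
    loops post on a near-fixed schedule. Timestamps are in seconds.
    """
    if len(post_timestamps) < 5:
        return False  # not enough history to judge
    gaps = [b - a for a, b in zip(post_timestamps, post_timestamps[1:])]
    return pstdev(gaps) < max_jitter_s

# An account posting exactly every 10 minutes gets flagged.
print(looks_automated([0, 600, 1200, 1800, 2400, 3000]))  # True
```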
