T29: Plugin Vulnerability Leading to Agent Compromise

Definition

ElizaOS’s (an open-source multi-agent operating system built on Solana) modular plugin architecture relies on third-party plugins to extend agent capabilities. A compromised or inadequately secured plugin enables an attacker to gain control of an ElizaOS agent, including access to its cryptographic keys, data, and blockchain interaction capabilities. The plugin is the supply-chain entry point; its installation is the moment of compromise.

What it looks like in practice

A developer discovers a plugin marketed as providing enhanced token-trading capabilities for ElizaOS. The plugin passes a superficial review (it performs the advertised trading functions correctly) but contains a secondary payload activated after a time delay. On activation, the payload reads the agent’s private key from the key management subsystem it was granted access to during installation, and exfiltrates it to an attacker-controlled endpoint. The attacker then uses the key to drain the agent’s wallet (T34) while the agent continues to appear fully operational.

The plugin may also request access to the agent’s inter-agent communication bus, enabling it to inject malicious instructions to peer agents once installed, extending the attacker’s reach through the multi-agent deployment.

Why it’s dangerous in multi-agent context

ElizaOS agents operate autonomously with real-value assets: Solana wallet keys, token balances, and on-chain transaction authority. A plugin that gains control of the agent’s execution context gains access to all those assets immediately. The compromised agent continues to appear operational to external observers while executing the attacker’s instructions. In a multi-agent deployment, the compromised agent interacts with peer agents, extending the attacker’s reach laterally. Because plugin installation is a one-time developer action rather than a per-request operation, the compromise persists across all subsequent agent invocations without re-triggering any approval flow.

Detection signals

A compromised plugin’s payload typically activates after a delay and then reads key material or opens an outbound exfiltration channel. Both produce observable runtime events.

A plugin process or module accessing the agent’s key management subsystem path (e.g. reading from the keystore directory or calling a key-derivation API) at any time other than the initial session setup: instrument key-material access at the OS level (inotifywait on the keystore path, or a kernel-level eBPF probe) and alert on any mid-session read.
An outbound HTTP or WebSocket connection initiated by a plugin module to a hostname that does not appear in the plugin’s declared network dependency manifest: monitor per-plugin egress by process-group and alert on first-seen destination.
A time-delay activation pattern: a plugin that performed no privileged operations during its first N invocations then suddenly reads credential files. Alert on any first-occurrence of a privileged file-access event from a plugin that has been loaded for more than 24 hours without previously touching that path.
The plugin’s code hash at runtime not matching the hash recorded in the curated registry at install time: perform a hash re-verification of each loaded plugin on every agent startup and emit a structured warning event on mismatch.
A wallet balance decrease that is not accompanied by a corresponding agent-initiated transaction in the agent’s own transaction log: cross-reference on-chain wallet outflows with the agent’s transaction records and alert on any unexplained outflow within 60 seconds of occurrence.

Mitigations

Require cryptographic signatures on all plugins from a trusted publisher identity; reject unsigned or self-signed plugins.
Maintain a curated plugin registry with provenance records (source repository, code hash, audit history); do not install plugins from outside the registry.
Sandbox plugin execution: restrict each plugin’s access to only the agent internals it explicitly declares as dependencies, and deny access to the key management subsystem by default.
Audit plugin permission requests at install time; reject plugins requesting access to key material unless the declared function requires it.

Relation to base threat (T1–T17)

T29 extends T17 Supply Chain Compromise. Where T17 addresses the broad class of supply-chain attacks on agent components, T29 is the plugin-layer instantiation specific to ElizaOS’s modular architecture. T20 (Framework Vulnerability Leading to Code Injection) is the companion threat: where T29 attacks via a trusted but malicious plugin, T20 attacks via a vulnerability in the framework itself.

OWASP Top 10 for Agentic Applications 2026

The Agentic Top 10 (ASI01 through ASI10) is a separate practitioner-facing publication that maps onto the master Threats & Mitigations threat numbering. T29 is covered by the following Top 10 entries:

ASI02 Tool Misuse and Exploitation primary

An agent applies authorised tools in ways their operator did not intend, driven by prompt injection, misaligned reasoning, or manipulated tool outputs. Every individual call looks clean; the harm is in the sequence: data exfiltrated via successive reads, workflows hijacked by parameter tampering, or a legitimate API weaponised across turns.

OWASP LLM Top 10: LLM06:2025
ASI04 Agentic Supply Chain Vulnerabilities primary

Third-party components that agents depend on (models, MCP servers, plug-ins, datasets, peer-agent descriptors, and update channels) may be malicious, compromised post-approval, or tampered with in transit. Unlike software supply-chain risk, this is a live exposure: every new session the agent fetches and trusts components whose state may have changed since they were last reviewed.

OWASP LLM Top 10: LLM03:2025

Source: OWASP Top 10 for Agentic Applications 2026 (Dec 2025) · the Top 10 is a compass into the master Threats & Mitigations taxonomy, not a replacement for it.

Design principles at stake

When T29 is present, these security design principles are the ones being violated or tested. Each links to the full principle; the mitigations below are how you restore them.

Defence-in-Depth A malicious plugin passes the functional review because it correctly performs the advertised trading task: the attack payload activates on a time delay, after the single install-time approval has already been granted. Depth means cryptographic signature verification from a trusted publisher registry before installation, sandboxed plugin execution that restricts each plugin to explicitly declared dependencies and denies key-management access by default, and audit of permission requests at install time as an independent gate. The time-delayed activation specifically defeats single-checkpoint defences, so only controls that remain active post-installation (runtime sandbox enforcement and ongoing permission auditing) can catch it.
Sandboxing & Isolation The plugin is granted access to the agent's execution context during installation, and from that moment it can reach the key management subsystem and the inter-agent communication bus: capabilities that a sandboxed plugin should never hold unless the declared function explicitly requires them. Restricting each plugin to only the agent internals it declares as dependencies, and denying key-material access by default, means that even a fully compromised plugin cannot exfiltrate the private key or inject instructions to peer agents. The sandbox is what converts a plugin that passes code review into a plugin that cannot cause the worst-case outcomes regardless of its concealed payload.
Supply-chain Security Plugin installation is a one-time developer action rather than a per-request operation, so the supply chain entry point is the moment of installation and the compromise persists silently across every subsequent invocation. A curated plugin registry with provenance records (source repository, content hash, audit history) and cryptographic signatures from a trusted publisher identity are the controls that verify the plugin is what it claims to be before that persistent foothold is established. Floating to the latest version of an unverified plugin is the failure mode: a rug-pull or a subtly poisoned update can introduce the secondary payload at any point after initial adoption.

Recommended mitigations

Auto-generated from the mitigation catalog: every mitigation whose coverage map includes T29, sorted by maturity tier (Tier 1 production-canonical first, then Tier 2, then Tier 3 research-stage).

Tier 1 Agent SBOM (Signed AIBOM: a cryptographically-bound inventory of every component an agent loads)

An AI agent assembles itself at runtime from a model, prompt templates, plugins, and library dependencies, any of which can be tampered with before they arrive. A signed AI Bill of Materials (AIBOM) locks down that assembly: it records every component with a version and hash at build time, signs the manifest, and verifies it before the agent accepts traffic. A component that does not match its declared hash cannot silently enter the agent.

why it helps Plugin Vulnerability Leading to Agent Compromise relies on a malicious or vulnerable plugin entering the agent undetected. When every plugin dependency is declared in a signed AIBOM with a pinned version and hash, a tampered or newly introduced plugin produces a manifest diff. The vulnerability cannot persist silently in an undeclared dependency.
Tier 1 Sigstore (Sigstore signing — cryptographic provenance for agent artifacts and audit records)

An agent is composed of artifacts produced at different times by different identities: model weights, prompt templates, tool descriptors, MCP server binaries, and audit-log batches. Any of those artifacts can be substituted or tampered with between the moment they are built and the moment they are loaded. Sigstore addresses this by signing each artifact at build time using a short-lived certificate tied to the workload identity that produced it, recording the signature in an append-only public transparency log, and requiring verification against that log before the artifact is loaded or executed.

why it helps Plugin Vulnerability Leading to Agent Compromise requires deploying a malicious or substituted plugin to the agent. Signing plugin artifacts at build time and verifying the signature against the Rekor transparency log at deployment time detects substitution before the plugin executes.
Tier 2 Tool scope (Least-privilege tool scoping — a hard boundary on what each tool exposes)

Each tool in an agent's catalog should expose only the methods, resources, and parameter ranges its designated role requires. Over-broad tool surfaces let individually authorised primitives compose into actions no human intended to grant; narrowing the scope at design time reduces both the attack surface and the blast radius of any compromise.

why it helps A vulnerability in a plugin can only be leveraged to the extent of that plugin's declared scope. Over-broad scopes amplify what an attacker gains from a plugin exploit; narrow scopes contain the reachable damage to the tool's designated operation set.

Red-team pivot: MITRE ATLAS techniques

MITRE ATLAS catalogues adversary techniques against AI systems. Where this OWASP threat has an attacker-perspective counterpart, the ATLAS technique is shown below. That is what a red team would actually be doing on the wire. Use this for detection-signal anchoring, threat-hunting hypotheses, and IR runbooks. Source: mitre-atlas/atlas-data v5.6.0.

AML.T0010 AI Supply Chain Compromise view on ATLAS ↗

Adversary tampers with components in the AI supply chain (datasets, model weights, libraries, container images) to compromise downstream systems before deployment.

Agentic angle: Agentic systems compose models, MCP servers, agent registries, and prompt templates at runtime. Every dependency is a potential vehicle for compromise.

AML.T0110 AI Agent Tool Poisoning view on ATLAS ↗

Adversary achieves persistence by compromising tools integrated into an agent's environment, altering parameters, descriptions, or logic to redirect agent behaviour.

Agentic angle: Poisoned MCP tools are invisible to the agent: every tool call silently executes attacker logic while appearing to return normal results.

AML.T0109 AI Supply Chain Rug Pull view on ATLAS ↗

Adversary publishes legitimate AI components to gain adoption, then replaces them with a malicious variant, exploiting the trust established before the switch.

Agentic angle: Trusted MCP servers or model registries used by agents are high-value rug-pull targets because agents fetch and execute without further human review.

References

OWASP MAS Threat Modelling Guide v1.0 (April 2025) §4 ElizaOS — Layer 3 Agent Frameworks.

Sources

OWASP-MAS-Guide ↗ · 1.0 (Apr 2025) · §4 Eliza OS — Layer 3 Agent Frameworks