Primer

Agents

The OWASP Threats and Mitigations document inherits its definition from Russell & Norvig: an agent is software that perceives its environment, makes decisions, and takes actions to achieve objectives autonomously. The word "autonomously" is doing the work. It is what separates an agent from a workflow engine that runs predetermined steps. In practice, production agentic systems are rarely single agents: they are networks of orchestrators, peer agents, MCP servers, and shared memory stores coordinating toward a joint goal. The single-agent decomposition below is the analytical baseline that multi-agent topologies layer on top of. Understanding one agent's internals is prerequisite to reasoning about what happens when several of them interact.

What's inside a single agent

A working agent decomposes into the components below. None of them is novel in isolation. What is novel is composing them together, with an LLM choosing the control flow. Most agentic threats target a specific component or the seam between two components (the interactive view of this lives in the fintech reference scenario; open it on the canvas and click any agent to see the anatomy with threats placed on each seam). The peer AGENT box (right side of the diagram) represents the multi-agent surface: each inter-agent seam replicates the threat exposure of a single agent and adds cross-peer interaction threats on top.

OWASP reference architecture

An OWASP reference architecture for agentic systems. Use the filter to see how OWASP's three catalogues overlap on the same surface. Click any badge to go to the relevant threat page.

[Diagram: OWASP reference architecture for agentic systems, with threat badges placed on components]

Components shown:
APPLICATION: user-facing interface (INPUT, OUTPUT)
AI AGENTS, Execution Loop: PLANNING (goal decomposition), TOOL CALLING (tool invocation & params), ACTION (execute · observe · log), MEMORY (short) (context window)
MODEL: LLM, FUNCTION CALLING
AGENT: peer / sub-agent (A2A · MCP)
SERVICES: CONTENT (web · docs), CODE (exec · sandbox), DATA (DBs · APIs), HUMAN IN LOOP (approval), DEVICE (IoT · desktop), SERVICE (external APIs)
SUPPORTING SERVICES: LONG-TERM MEMORY (retrieval index), VECTOR DATASTORE (embeddings · similarity)

Threat badge groups, in diagram order: T6; T1; T6 · T7; T2 · T11; T8; T1 · T5; T9 · T17; T12 · T13 · T16; T2 · T3 · T4; T10 · T15; T8.

Threats referenced by the badges:
T1 Memory Poisoning: Adversarial content written into short- or long-term memory contaminates future decisions.
T2 Tool Misuse: Agent uses authorized tools in unintended ways via deceptive prompts or chained calls.
T3 Privilege Compromise: Mismanaged roles, dynamic inheritance, or overly broad scopes let agents escalate.
T4 Resource Overload: Agents autonomously schedule, queue, and execute work. Exhaustion fans out.
T5 Cascading Hallucination Attacks: Fabricated outputs propagate via reflection, memory, or multi-agent comms.
T6 Intent Breaking and Goal Manipulation: Adversaries manipulate planning, reasoning, or self-evaluation to override goals.
T7 Misaligned and Deceptive Behaviors: Agents pursue goals via constraint bypass, deception, or evasion of oversight.
T8 Repudiation and Untraceability: Agent actions cannot be reliably traced, attributed, or reconstructed.
T9 Identity Spoofing and Impersonation: Auth mechanisms exploited to impersonate agents, users, or services; misuse of persistent agent identities.
T10 Overwhelming Human-in-the-Loop (HITL): Reviewers are saturated with intervention requests; decision fatigue and HII manipulation make oversight ineffective.
T11 Unexpected RCE and Code Attacks: Code-execution paths in agents accept attacker-influenced input and run as arbitrary code.
T12 Agent Communication Poisoning: Inter-agent messages tampered with. The output of one becomes injection input of another.
T13 Rogue Agents in Multi-Agent Systems: A malicious or compromised agent inside the system exploits trust to act unobserved.
T15 Human Manipulation: Attacker turns the agent into a fluent, personalised social-engineering vector trusted by the user.
T16 Insecure Inter-Agent Protocol Abuse: MCP/A2A protocols abused via consent-flow manipulation, MCP response injection, or weaponised tool descriptions.
T17 Supply Chain Compromise: Compromised upstream models, prompts, plugins, or framework updates land in the agent.

The Agentic Top 10 (ASI01 to ASI10) and LLM Top 10 (LLM01:2025 to LLM10:2025) filters place their own badges on the same components; both catalogues are listed under the LLM Top 10 ↔ Agentic Top 10 correlation section.
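The execution loop in the reference architecture can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an OWASP artefact: the `Agent` class, `stub_model`, and the `lookup` tool are hypothetical names, and the "model" is a deterministic stand-in for an LLM. The comments mark where each component sits and where data crosses a seam.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    """Hypothetical single agent; field names mirror the diagram's components."""
    model: Callable[[list[str]], dict]               # MODEL: stands in for an LLM
    tools: dict[str, Callable[[str], str]]           # targets of TOOL CALLING
    memory: list[str] = field(default_factory=list)  # MEMORY (short): context window
    log: list[str] = field(default_factory=list)     # ACTION: audit trail

    def run(self, goal: str, max_steps: int = 5) -> str:
        self.memory.append(f"goal: {goal}")
        for _ in range(max_steps):
            # PLANNING: the model, not the surrounding code, picks the next step.
            decision = self.model(self.memory)
            if decision["type"] == "final":
                return decision["answer"]
            # TOOL CALLING: tool name and parameters come from model output.
            name, arg = decision["tool"], decision["arg"]
            # ACTION: execute, observe, log.
            observation = self.tools[name](arg)
            self.log.append(f"{name}({arg!r}) -> {observation!r}")
            # Seam: tool output flows back into memory unfiltered.
            self.memory.append(f"observation: {observation}")
        return "step budget exhausted"


def stub_model(context: list[str]) -> dict:
    """Deterministic stand-in for an LLM: look up once, then answer."""
    observations = [c for c in context if c.startswith("observation: ")]
    if not observations:
        return {"type": "tool", "tool": "lookup", "arg": "balance"}
    return {"type": "final", "answer": observations[-1].removeprefix("observation: ")}


# Hypothetical tool backed by a fixed dictionary.
TOOLS = {"lookup": lambda key: {"balance": "42.00"}.get(key, "unknown")}

agent = Agent(model=stub_model, tools=TOOLS)
result = agent.run("report the account balance")
print(result)
print(agent.log)
```

Each commented seam is a place where content from one component becomes input to another without the surrounding code deciding what happens next, which is why the threat badges cluster there.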

The components

Which threats hit which component

Agency scoping matrix

OWASP's Agentic Top 10 rates the severity of each risk across four autonomy scopes, from no agency (human-initiated, agentic change prohibited) to full agency (automated initiation, automated agent actions). Severity rises as scope widens: no risk is rated above Medium in Scopes 1 and 2, all but one are High in Scope 3, and every risk is Critical in Scope 4.

Scope 1, No Agency: human-initiated · agentic change prohibited
Scope 2, Prescribed: human-initiated · human-approved actions
Scope 3, Supervised: human-initiated · automated actions
Scope 4, Full Agency: automated initiation · automated actions

| Risk | Scope 1 No Agency | Scope 2 Prescribed | Scope 3 Supervised | Scope 4 Full Agency |
|---|---|---|---|---|
| ASI01 Agent Goal Hijack | Medium | Medium | High | Critical |
| ASI02 Tool Misuse and Exploitation | Low | Medium | High | Critical |
| ASI03 Identity & Privilege Abuse | Low | Medium | High | Critical |
| ASI04 Agentic Supply Chain Vulnerabilities | Medium | Medium | High | Critical |
| ASI05 Unexpected Code Execution (RCE) | Low | Medium | High | Critical |
| ASI06 Memory & Context Poisoning | Medium | Medium | High | Critical |
| ASI07 Insecure Inter-Agent Communication | Low | Low | High | Critical |
| ASI08 Cascading Failures | Low | Medium | High | Critical |
| ASI09 Human-Agent Trust Exploitation | Medium | Medium | High | Critical |
| ASI10 Rogue Agents | Low | Low | Medium | Critical |
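For use in tooling, the matrix above can be carried as a plain lookup table. The sketch below transcribes the ratings as given; the function names `severity` and `risks_rated` are hypothetical helpers, not part of any OWASP deliverable.

```python
# Scope names in order, Scope 1 to Scope 4, as in the matrix above.
SCOPES = ("No Agency", "Prescribed", "Supervised", "Full Agency")

# Severity per risk, ordered Scope 1 to Scope 4, transcribed from the matrix.
SEVERITY = {
    "ASI01": ("Medium", "Medium", "High", "Critical"),
    "ASI02": ("Low", "Medium", "High", "Critical"),
    "ASI03": ("Low", "Medium", "High", "Critical"),
    "ASI04": ("Medium", "Medium", "High", "Critical"),
    "ASI05": ("Low", "Medium", "High", "Critical"),
    "ASI06": ("Medium", "Medium", "High", "Critical"),
    "ASI07": ("Low", "Low", "High", "Critical"),
    "ASI08": ("Low", "Medium", "High", "Critical"),
    "ASI09": ("Medium", "Medium", "High", "Critical"),
    "ASI10": ("Low", "Low", "Medium", "Critical"),
}


def severity(risk: str, scope: int) -> str:
    """Return the rating for a risk ID at a 1-based scope number."""
    if not 1 <= scope <= len(SCOPES):
        raise ValueError(f"scope must be 1..{len(SCOPES)}, got {scope}")
    return SEVERITY[risk][scope - 1]


def risks_rated(level: str, scope: int) -> list[str]:
    """List the risk IDs that carry exactly the given rating at a scope."""
    return [risk for risk in SEVERITY if severity(risk, scope) == level]


print(severity("ASI10", 3))
print(risks_rated("Medium", 1))
```

Keeping the scope as an explicit argument makes it harder to quote a rating without stating which autonomy level it applies to.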

LLM Top 10 ↔ Agentic Top 10 correlation

The relation is asymmetric. The Agents → LLMs view shows what each agentic risk inherits from the LLM Top 10 (narrow, 1–4 LLM parents per ASI, the way OWASP categorised them). The LLMs → Agents view inverts it: one LLM-level vulnerability fans out to every agentic risk it can cause, which is a wider set. Triggers reach further than categories. Dashed lines in that view are mechanistic connections OWASP describes in body text but did not include in their authored mapping. The crow's-foot end marks the many side of each line.
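The asymmetry can be seen by inverting a small mapping. The entries below are placeholders with made-up IDs (`ASI-A`, `LLM-X`, and so on), chosen only to show the shape of the two views; they are not OWASP's authored mapping, and `invert` is a hypothetical helper.

```python
from collections import defaultdict

# Placeholder data, NOT the real OWASP mapping: each agentic risk lists
# the LLM-level parents it was categorised under (the Agents -> LLMs view).
AUTHORED_PARENTS = {
    "ASI-A": ["LLM-X"],
    "ASI-B": ["LLM-X", "LLM-Y"],
    "ASI-C": ["LLM-Y"],
}

# Placeholder mechanistic links (the dashed lines): described in body text
# but absent from the authored mapping. Pairs are (LLM parent, agentic risk).
MECHANISTIC = [("LLM-X", "ASI-C")]


def invert(parents, extra=()):
    """Build the LLMs -> Agents view: each LLM risk maps to the agentic risks it reaches."""
    fan_out = defaultdict(set)
    for asi, llms in parents.items():
        for llm in llms:
            fan_out[llm].add(asi)
    for llm, asi in extra:
        fan_out[llm].add(asi)
    return {llm: sorted(asis) for llm, asis in fan_out.items()}


authored_only = invert(AUTHORED_PARENTS)
with_dashed = invert(AUTHORED_PARENTS, MECHANISTIC)
print(authored_only)
print(with_dashed)
```

In this toy data each agentic risk has one or two parents, while `LLM-X` reaches all three agentic risks once the mechanistic link is included, which is the widening the paragraph above describes.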

[Diagram: correlation lines between the two catalogues]

OWASP LLM Top 10
LLM01:2025 Prompt Injection: User or indirect prompts override model instructions, redirecting its behaviour.
LLM02:2025 Sensitive Information Disclosure: LLM outputs expose confidential data from training or context windows.
LLM03:2025 Supply Chain: Compromised models, datasets, plugins, or integrations poison the LLM pipeline.
LLM04:2025 Data and Model Poisoning: Training or fine-tuning data is manipulated to embed backdoors or biases.
LLM05:2025 Improper Output Handling: Unsanitised LLM outputs reach downstream systems and enable injection attacks.
LLM06:2025 Excessive Agency: Agents are granted more permissions or autonomy than the task requires.
LLM07:2025 System Prompt Leakage: Confidential system prompt contents are revealed through model responses.
LLM08:2025 Vector and Embedding Weaknesses: Flaws in vector stores and embeddings enable poisoning and data extraction.
LLM09:2025 Misinformation: LLMs produce plausible but false outputs that propagate as trusted facts.
LLM10:2025 Unbounded Consumption: Excessive or uncontrolled resource use leads to denial of service and cost runaway.

OWASP Agentic Top 10
ASI01 Agent Goal Hijack: An attacker manipulates an agent's objective, task selection, or decision pathway (via injected prompts, deceptive tool …
ASI02 Tool Misuse and Exploitation: An agent applies authorised tools in ways their operator did not intend, driven by prompt injection, misaligned reasonin…
ASI03 Identity & Privilege Abuse: When an agent acts on a user's behalf it inherits that user's credentials and permissions for the duration of the task.
ASI04 Agentic Supply Chain Vulnerabilities: Third-party components that agents depend on (models, MCP servers, plug-ins, datasets, peer-agent descriptors, and updat…
ASI05 Unexpected Code Execution (RCE): In an agentic system, code generation and code execution happen in the same turn: the model emits an instruction and a t…
ASI06 Memory & Context Poisoning: An adversary writes malicious or misleading data into an agent's persistent memory or shared vector store, so that every …
ASI07 Insecure Inter-Agent Communication: Agents in a multi-agent system pass instructions, results, and context to one another across APIs, message buses, and sh…
ASI08 Cascading Failures: A single low-severity fault (a hallucinated value, a corrupted tool output, a poisoned memory entry) propagates across a …
ASI09 Human-Agent Trust Exploitation: Adversaries exploit the tendency of humans to trust fluent, authoritative-sounding agents: an agent presents plausible j…
ASI10 Rogue Agents: A rogue agent is one whose behavioural objective has drifted from its authorised purpose, yet its identity still checks …

Single agent vs multi-agent

A single-agent system has one instance of the anatomy described above. A multi-agent system has many, plus the inter-agent communication that connects them. Multi-agent threats (T12, T13, T14) and the MAESTRO Cross-Layer catalog exist because the seams between agents are themselves an attack surface. See the A2A primer.

Levels of autonomy

Autonomy is a spectrum, not a binary. The OWASP document describes a range from hardcoded workflows at one end (the agent's choices are tightly constrained by code), through finite-state-machine or LangFlow-style constraints, to fully conversational agents whose decisions depend purely on interactions and model reasoning. The threat profile shifts dramatically along this spectrum, and most controls that work at the constrained end fail at the conversational end.
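The three points on that spectrum differ in who owns the control flow. The sketch below is illustrative only: the invoice states, the `ALLOWED` transition table, and `skipping_proposer` (a stand-in for a model that tries to skip a step) are all hypothetical.

```python
from typing import Callable

# Hypothetical finite-state machine: which next actions each state permits.
ALLOWED = {
    "start": {"fetch_invoice"},
    "fetch_invoice": {"validate", "stop"},
    "validate": {"pay", "stop"},
    "pay": {"stop"},
}


def run_workflow() -> list[str]:
    """Hardcoded end: the code fixes every step; nothing is proposed."""
    return ["fetch_invoice", "validate", "pay", "stop"]


def run_constrained(propose: Callable[[str], str], max_steps: int = 10) -> list[str]:
    """Middle: the proposer chooses, but only FSM-permitted transitions run."""
    state, trace = "start", []
    for _ in range(max_steps):
        nxt = propose(state)
        if nxt not in ALLOWED.get(state, set()):
            trace.append(f"rejected:{nxt}")
            break
        trace.append(nxt)
        if nxt == "stop":
            break
        state = nxt
    return trace


def run_conversational(propose: Callable[[str], str], max_steps: int = 10) -> list[str]:
    """Conversational end: whatever the proposer chooses is executed."""
    state, trace = "start", []
    for _ in range(max_steps):
        nxt = propose(state)
        trace.append(nxt)
        if nxt == "stop":
            break
        state = nxt
    return trace


def skipping_proposer(state: str) -> str:
    """Stand-in for a model that jumps from fetching straight to paying."""
    return {"start": "fetch_invoice", "fetch_invoice": "pay"}.get(state, "stop")


constrained_trace = run_constrained(skipping_proposer)
conversational_trace = run_conversational(skipping_proposer)
print(constrained_trace)
print(conversational_trace)
```

The transition table is a control that exists only at the constrained end: once the proposer's choice is executed directly, there is no equivalent place in the code to reject the skipped validation step.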

Where to go next

Source: OWASP Agentic AI — Threats and Mitigations v1.1 (Dec 2025), §AI Agents and §Agentic AI Reference Architecture; OWASP Top 10 for Agentic Applications 2026; OWASP Top 10 for LLM Applications 2025.