Attack Surface Minimization · Principles

Why it matters for agentic AI

Economy of mechanism (keep the design as simple as possible, remove everything not strictly necessary) has always been about reducing the number of paths an attacker can exploit. In classical systems, attack surface was primarily about open ports, exposed APIs, and code complexity. For agents, the surface is radically larger, because it is not just the entry points into the system but everything that can write into the agent’s context window from the outside world.

Every registered tool is an injection vector: its output returns into the context as trusted content, even if the remote endpoint that produced it is not trustworthy. Every external source the agent reads from (emails, web pages, RAG results, tool responses, messages from other agents) is a surface. Persistent memory is a long-dwell surface: an attacker who can write a single poisoned document into an agent’s vector store has placed an instruction that may be retrieved weeks later. Every notch of autonomy the agent is granted is also a surface: each human checkpoint removed in the name of speed eliminates an opportunity to catch a deviation before it becomes irreversible.

What makes this qualitatively different from the classical case is non-determinism. The effective attack surface of an agent is not the set of tool-call sequences a designer intended or observed in testing. It is the set of all reachable tool-call sequences, including sequences the designer never imagined, that a motivated attacker could induce by crafting input the agent will process. An agent with fifty tools and a flexible instruction set has an enormous combinatorial space of reachable sequences. An agent with five tools scoped to a specific task has a surface an order of magnitude smaller, and the chains an injection can construct are correspondingly shorter and less dangerous.

The practical implication is that toolset size is a security property, not just a capability property. Before adding a tool, the question is not only “would this tool be useful?” but “does adding this tool create new chains that could not exist without it?” Code execution plus file read plus external HTTP is a qualitatively different surface than any of those three alone. The combination is what makes exfiltration possible; removing any leg of the chain removes the path.

Scenario: the over-tooled agent

An agent is provisioned with fifty tools “to be flexible” for a data-analysis task. During processing it ingests a malicious document. The injection chains together three tools the task never needed (email compose, file read from a shared drive, and external HTTP POST) into an exfiltration path. None of those three tools were necessary to analyse data. The five tools the task actually required (read dataset, run query, write output, call an analytics API, open a results ticket) would not have offered the injection a usable chain.

Scenario: the unbounded RAG context

An agent’s retrieval configuration loads the twenty most-similar documents from a company-wide knowledge base on every query. A single poisoned document planted in the knowledge base is retrieved in unrelated queries because its embedding is close to common terms. Every retrieval is a new opportunity for the injected instruction to land in context. Task-scoped RAG, using a separate curated index per workflow refreshed only when the workflow requires it, reduces the injection opportunity to the specific documents that task actually needs and removes the standing connection to the broader knowledge base entirely.

How it fails

Agents are over-tooled during development for convenience and the broad toolset is never reviewed before production deployment.
Entire knowledge bases are loaded as standing RAG connections; no one audits which documents are reachable, and poisoned entries persist indefinitely.
Dynamic tool discovery is left enabled in production, so the toolset can expand at runtime without any security review of what was added.
Persistent memory accumulates unboundedly across sessions, turning the agent’s memory into a large, poorly-audited surface for durable injections.
Human checkpoints are removed iteratively (“the agent is reliable enough”) without re-evaluating what the new autonomous surface enables.

Why the mapped controls work

Tool-necessity review with a per-agent ceiling forces the question “is this tool actually required?” before deployment, rather than after an incident. The ceiling enforces that the answer is reviewed periodically, not just once. Task-scoped RAG replaces standing knowledge-base connections with purpose-built indices: the retrieval surface is as small as the task requires, and there is no path from a poisoned document in a different workflow to the current agent’s context. Session-scoped memory by default eliminates long-dwell persistence: an injection that succeeds in writing to memory expires with the session, rather than incubating until an unknown future retrieval. Static toolsets in production with no dynamic discovery mean the attack surface is a fixed, auditable set that does not change between deployments, removing the runtime-expansion path entirely.

First steps

Run a tool-necessity audit this week: list every tool registered to each agent, mark each as “required” or “not required for current tasks”, and remove the unmarked ones before your next deployment.
Replace any shared or global knowledge-base connection with a task-specific RAG index (e.g. a separate Pinecone namespace or Qdrant collection per workflow) so that documents from one workflow cannot be retrieved in another.
Disable dynamic MCP tool discovery in your agent framework’s production configuration: set the tool registry to a static, hash-pinned manifest reviewed at deployment time, and alert if the manifest changes between deployments.

Threats it governs

When this principle is absent, these threats become reachable.

T2
Tool Misuse Agent uses authorized tools in unintended ways via deceptive prompts or chained calls.
T4
Resource Overload Agents autonomously schedule, queue, and execute work. Exhaustion fans out.
T11
Unexpected RCE and Code Attacks Code-execution paths in agents accept attacker-influenced input and run as arbitrary code.

Controls that advance it

Catalogue mitigations that strengthen this principle, grouped by the defence-in-depth stage they sit in.

Prevent

Tool scope Each tool in an agent's catalog should expose only the methods, resources, and parameter ranges its designated role requires. Over-broad tool surfaces let individually authorised primitives compose into actions no human intended to grant; narrowing the scope at design time reduces both the attack surface and the blast radius of any compromise.
JIT tool grants An agent that holds a persistent catalog of invokable tools can reach any of them at any point in its session. If its reasoning is manipulated or its identity is compromised, that persistent surface is fully available to an attacker. Just-in-time tool grants remove the standing surface: a policy broker issues a time-bound, task-scoped grant immediately before the tool is needed and revokes it automatically when the task completes or the window expires.
Session isolation An agent that serves multiple users stores conversation history, retrieved facts, and intermediate state in a memory layer. If that layer is not scoped to the originating session, one user's writes can reach another user's retrieval path. Session-scoped memory isolation prevents that by enforcing a hard boundary at the storage layer, so each session can only read and write its own state.
Render restriction An agent can include links and rich HTML in its output. When that output is attacker-influenced, a clickable link, embedded image, or rich preview card becomes the delivery mechanism for phishing or data exfiltration via markdown image injection. Rendering restriction removes that delivery vector by allowing clickable content only from an explicit allow-list of trusted domains and reducing everything else to plain text before the output reaches the user.

Detect

No catalogued control.

Respond

No catalogued control.

In Helmwart

Not scored directly; visible on the canvas as the count of tools/edges per agent.