T39 · Helmwart ID · OWASP MAS Guide source

Unintended Resource Consumption via MCP

Extends T4: Resource Overload · base threat in OWASP v1.1 catalog

Last reviewed 2026-05-14 · Severity heuristic: high

Definition

An agent acting autonomously uses the Model Context Protocol (MCP) to repeatedly access resources or invoke tools, exhausting the MCP server or its connected systems: CPU, memory, network bandwidth, and API call quotas all become depleted. The denial of service is driven by the agent’s own autonomous behaviour rather than by an external flood.

What it looks like in practice

An agent is designed to monitor a website for content changes by fetching it via an MCP tool every five minutes. A misconfiguration removes the polling interval guard, causing the agent to re-enter the fetch loop immediately after each response. The agent now issues hundreds of fetch requests per minute through the MCP server. Three failures follow in sequence: the target website’s rate limiter blocks the MCP server’s IP once the site’s request ceiling is exceeded; the MCP server begins queuing requests until its own memory is exhausted; and other clients connected to the same MCP server start receiving timeouts and errors.

A second variant: the agent is processing a large batch of items and issues a separate MCP tool call for each item without batching. The tool calls accumulate on the MCP server faster than they can be processed; the server’s request queue grows without bound.
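
The batching that the second variant lacks can be sketched on the client side as follows. This is a minimal illustration, not part of any MCP SDK: the `batched` helper and the batch size of 100 are assumptions, and whether a given MCP tool accepts several items per call depends on that tool.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield lists of at most `size` items, preserving input order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final partial batch, if any
        yield batch


# Hypothetical workload: 1,000 items grouped into batches of 100,
# so the agent would issue 10 tool calls instead of 1,000.
calls = list(batched(range(1000), 100))
```

Grouping items reduces the number of requests the server has to queue; it does not replace a server-side bound on queue length.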

Why it’s dangerous in multi-agent context

MCP standardises and simplifies tool and resource access for agents. An agent with access to an MCP server can invoke tools at machine speed, without the natural throttle of human interaction. The MCP server and the resources behind it have no inherent knowledge of whether a request rate is intentional or pathological. T42 (Cross-Client Interference via Shared Server) compounds the risk: a single runaway agent that exhausts the MCP server’s capacity denies service to all other clients connected to the same server instance simultaneously.

Detection signals

Runaway MCP consumption is measurable at both ends of the connection: the server accumulates a request backlog while the target resource starts returning rate-limit errors.

  • The MCP server’s inbound request queue depth rising above a sustained threshold (e.g., more than 50 queued requests from a single client): queue depth is typically exposed as a server metric; set an alert at 2× the expected per-client peak.
  • A single agent client’s MCP request rate exceeding its 7-day rolling average by more than a configured multiple (e.g., 10×) within any 60-second window: compare per-client request counters against historical baseline in the MCP server’s access log.
  • HTTP 429 (Too Many Requests) responses from the downstream resource increasing from zero to a sustained rate within a short window: track 429 frequency per MCP tool target and alert on any upward step change.
  • The agent’s own iteration counter in its trace log incrementing without any terminal condition being evaluated between iterations. A loop depth metric with no corresponding exit-path event is a direct loop-escape signal.
  • MCP server memory utilisation climbing steadily without levelling off over a 10-minute window. A monotonic memory growth curve on the server host is consistent with an unbounded request queue.
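
The per-client rate signal above could be approximated with a sliding-window counter. The sketch below is hedged: `RateSpikeDetector` is a hypothetical name, the baseline is passed in as a plain number rather than computed from a 7-day access log, and timestamps are supplied by the caller so the logic stays deterministic.

```python
from collections import deque
from typing import Deque


class RateSpikeDetector:
    """Flags a client whose request count inside a sliding window exceeds
    `multiple` times its expected per-window baseline."""

    def __init__(self, baseline_per_window: float, multiple: float = 10.0,
                 window_seconds: float = 60.0) -> None:
        self.threshold = baseline_per_window * multiple
        self.window_seconds = window_seconds
        self._timestamps: Deque[float] = deque()

    def record(self, now: float) -> bool:
        """Record one request at time `now` (in seconds).

        Returns True when the number of requests in the current window
        exceeds the threshold."""
        self._timestamps.append(now)
        cutoff = now - self.window_seconds
        # Drop requests that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) > self.threshold
```

A server would keep one detector per client identity and feed it from the same counters that back its access log; the alert fires on the first request that pushes the window count past the threshold.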

Mitigations

  • Enforce explicit rate limits and maximum-iteration guards on all agent loops that issue MCP resource accesses; treat missing rate limits as a blocking defect.
  • Configure the MCP server to enforce per-client rate limiting independent of client-side controls.
  • Implement idempotency checks before re-issuing an MCP resource request: verify that the previous request did not already succeed before sending a new one.
  • Alert when an agent’s MCP request rate exceeds its baseline by more than a defined factor; automatically pause the agent and require human confirmation to resume.
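
The first mitigation, a client-side loop with both a minimum interval and a maximum-iteration guard, might look like the sketch below. The names `guarded_poll` and `LoopBudgetExceeded` are illustrative assumptions, `fetch` stands in for whatever MCP tool call the agent makes, and the clock and sleep functions are injectable only so the loop can be exercised without waiting.

```python
import time
from typing import Callable


class LoopBudgetExceeded(RuntimeError):
    """Raised when the loop hits its maximum-iteration guard."""


def guarded_poll(fetch: Callable[[], bool], *, min_interval: float,
                 max_iterations: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `fetch` until it returns True.

    Calls are spaced at least `min_interval` seconds apart and capped at
    `max_iterations`. Returns the number of calls made."""
    if min_interval <= 0 or max_iterations < 1:
        # A missing or zero guard is rejected outright rather than defaulted.
        raise ValueError("min_interval and max_iterations must be positive")
    for iteration in range(1, max_iterations + 1):
        started = clock()
        if fetch():
            return iteration
        if iteration < max_iterations:
            remaining = min_interval - (clock() - started)
            if remaining > 0:
                sleep(remaining)
    raise LoopBudgetExceeded(f"stopped after {max_iterations} iterations")
```

Rejecting a non-positive interval at call time mirrors the "blocking defect" stance: the loop refuses to run without its guard instead of silently polling as fast as responses arrive.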

Relation to base threat (T1–T17)

T39 extends T4 Resource Overload. Where T4 addresses resource exhaustion driven by external attack, T39 is self-inflicted: the agent’s own loop logic is the source of the overload, mediated through the MCP tool invocation surface. T32 (Runaway Agent on Solana) is the blockchain-transaction analogue: the same runaway pattern expressed through on-chain transaction submissions rather than MCP tool calls.

OWASP Top 10 for Agentic Applications 2026

The Agentic Top 10 (ASI01 through ASI10) is a separate practitioner-facing publication that maps onto the master Threats & Mitigations threat numbering. T39 is covered by the following Top 10 entries:

  • ASI02 Tool Misuse and Exploitation (contributing)

    An agent applies authorised tools in ways their operator did not intend, driven by prompt injection, misaligned reasoning, or manipulated tool outputs. Every individual call looks clean; the harm is in the sequence: data exfiltrated via successive reads, workflows hijacked by parameter tampering, or a legitimate API weaponised across turns.

    OWASP LLM Top 10: LLM06:2025

Source: OWASP Top 10 for Agentic Applications 2026 (Dec 2025) · the Top 10 is a compass into the master Threats & Mitigations taxonomy, not a replacement for it.

Design principles at stake

When T39 is present, these security design principles are the ones being violated or tested. Each links to the full principle; the mitigations below are how you restore them.

  • Defence-in-Depth: In the scenario above, the agent issues hundreds of fetch requests per minute because a misconfiguration removed the polling interval guard, a single client-side control that, once absent, leaves the server with no backstop. Client-side loop guards alone are insufficient because they can be misconfigured, bypassed, or absent entirely. Depth here means four layers, each stopping the runaway at a different point in the call chain: (1) explicit rate limits and maximum-iteration guards on all agent loops; (2) server-side per-client rate limiting that is independent of client-side controls, so that a misconfigured client cannot exhaust server capacity; (3) idempotency checks before re-issuing an MCP resource request, so that a retry loop does not re-request a resource whose fetch already succeeded; and (4) automated alerting and agent pause when the request rate exceeds baseline by more than a defined factor.
  • Rate-limiting / Budgets / Loop prevention: MCP removes the natural throttle of human interaction. The agent invokes tools at machine speed, and the server has no inherent knowledge of whether a request rate is intentional or pathological. When the client-side polling guard is the only control, it is a single point of failure, and its removal by misconfiguration produces the unbounded loop. The mitigations therefore call for two independent rate controls: a client-side maximum-iteration guard, whose absence is treated as a blocking defect, and a server-side per-client rate limit that fires independently of client state, so that either layer alone is sufficient to contain the runaway. An alerting threshold that automatically pauses the agent when its MCP request rate exceeds baseline adds the detection and response capability that neither rate limit alone guarantees.
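
The server-side layer described in these principles is commonly built as a token bucket per client. The following is a sketch under stated assumptions, not an MCP server API: `PerClientRateLimiter`, the client identifier string, and the rate and burst values are all hypothetical, and the caller passes in the current time.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class _Bucket:
    tokens: float   # tokens currently available
    updated: float  # time of the last refill, in seconds


class PerClientRateLimiter:
    """Token bucket per client: each request costs one token, and tokens
    refill at `rate_per_second` up to a ceiling of `burst`."""

    def __init__(self, rate_per_second: float, burst: int) -> None:
        self.rate = rate_per_second
        self.burst = float(burst)
        self._buckets: Dict[str, _Bucket] = {}

    def allow(self, client_id: str, now: float) -> bool:
        bucket = self._buckets.get(client_id)
        if bucket is None:
            # New clients start with a full bucket.
            bucket = _Bucket(tokens=self.burst, updated=now)
            self._buckets[client_id] = bucket
        elapsed = max(0.0, now - bucket.updated)
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate)
        bucket.updated = now
        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False  # the server would reject this request instead of queuing it
```

Because each client has its own bucket, a runaway client drains only its own tokens. Rejecting over-limit requests, rather than queuing them, is what keeps the server's memory bounded.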

Recommended mitigations

Auto-generated from the mitigation catalog: every mitigation whose coverage map includes T39, sorted by maturity tier (Tier 1 production-canonical first, then Tier 2, then Tier 3 research-stage).

  • Tier 2 Kill switch (Kill switch: human authority to halt one agent, a class, or the entire deployment)

    Agentic systems can act faster than a human can intervene through normal channels. A kill switch is the operational guarantee that a named human role can stop agent activity at any scope (single instance, class, or global) through a documented runbook, without requiring a code change or redeployment, and with every invocation written to an audit trail.

    Why it helps: Unintended MCP resource consumption can exhaust per-agent or per-tenant tool-call quotas before any automatic limit fires. A fleet-level kill-switch invocation halts all tool invocations from the consuming agent fleet, bounding the consumption while the root cause is investigated.

  • Tier 2 Rate limits and quotas (Per-agent rate limits and quotas — bound compute, tokens, and external-API spend)

    An agent operates without direct human oversight, autonomously scheduling tool calls, external API requests, and reflection loops. Without a budget, a single triggering event can fan out into hundreds of downstream calls. Per-agent rate limits and quotas assign each agent identity its own ceiling on call rate, token consumption, and cost spend, so a misbehaving or compromised agent cannot exhaust shared resources and its overconsumption becomes a visible, actionable signal.

    Why it helps: Unintended MCP resource consumption occurs when an agent or a manipulated tool call exhausts compute, API quota, or cost budgets that were never intended to be reachable in normal operation. Per-agent, per-tool, and per-session quota enforcement places a hard ceiling on that consumption at each layer.

  • Tier 2 Tool scope (Least-privilege tool scoping — a hard boundary on what each tool exposes)

    Each tool in an agent's catalog should expose only the methods, resources, and parameter ranges its designated role requires. Over-broad tool surfaces let individually authorised primitives compose into actions no human intended to grant; narrowing the scope at design time reduces both the attack surface and the blast radius of any compromise.

    Why it helps: An agent that lacks scope to invoke a resource-intensive tool cannot trigger it, regardless of whether it is prompted or manipulated to attempt the call. Unintended MCP resource consumption requires the agent to have scope first; this control removes that prerequisite.
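
The kill switch and tool-scope controls can be combined into one pre-invocation check, sketched below. Every name here (`KillSwitch`, `AgentIdentity`, `may_invoke`, the tool names in the usage note) is a hypothetical illustration, and the audit trail that the kill-switch description calls for is deliberately left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class KillSwitch:
    """Halt state at three scopes: single agent, agent class, or global."""
    halted_agents: Set[str] = field(default_factory=set)
    halted_classes: Set[str] = field(default_factory=set)
    global_halt: bool = False

    def is_halted(self, agent_id: str, agent_class: str) -> bool:
        return (self.global_halt
                or agent_class in self.halted_classes
                or agent_id in self.halted_agents)


@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    agent_class: str
    allowed_tools: FrozenSet[str]  # least-privilege tool scope


def may_invoke(agent: AgentIdentity, tool: str, switch: KillSwitch) -> bool:
    """Return True only if the agent is not halted and the tool is in scope."""
    if switch.is_halted(agent.agent_id, agent.agent_class):
        return False
    return tool in agent.allowed_tools
```

For example, an agent scoped to `frozenset({"fetch_page"})` would be refused a call to a tool named `bulk_export` at any time, and refused `fetch_page` as well once its class is added to `halted_classes`. Checking the switch on every invocation is what lets an operator stop a runaway without a code change or redeployment.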

Red-team pivot: MITRE ATLAS techniques

MITRE ATLAS catalogues adversary techniques against AI systems. Where this OWASP threat has an attacker-perspective counterpart, the ATLAS technique is shown below. That is what a red team would actually be doing on the wire. Use this for detection-signal anchoring, threat-hunting hypotheses, and IR runbooks. Source: mitre-atlas/atlas-data v5.6.0.

AML.T0034.002 Agentic Resource Consumption

Adversary coerces an agent into performing expensive tool calls (excessive API queries, fan-outs, or recursive self-delegation loops) to waste compute and API budgets.

Agentic angle: Prompt injection directives like "summarize 1000 times" or recursive sub-agent spawning can burn budgets in a single task.

AML.T0029 Denial of AI Service

Adversary exhausts compute, memory, or rate-limit budgets so the AI system stops responding or stops processing legitimate requests.

AML.T0053 AI Agent Tool Invocation

Adversary causes an agent to invoke a legitimate tool with attacker-controlled parameters, turning a sanctioned capability into an attack vector.

Agentic angle: Maps directly to OWASP T2 Tool Misuse: the agent's tools are operating within their declared scope, but the chosen invocation is unsafe.
