ASI08: Cascading Failures

Definition

A single low-severity fault (a hallucinated value, a corrupted tool output, a poisoned memory entry) propagates across a network of agents that each build on the last agent's output, compounding into system-wide harm that is disproportionate to the original defect. ASI08 is about propagation and amplification, not the fault's origin; the initial trigger may itself be innocuous.

What it means in practice

A financial-reporting pipeline uses four agents: a data-collection agent, a normalisation agent, a computation agent, and a report-generation agent. The data-collection agent hallucinates a revenue figure, not maliciously but incorrectly. The normalisation agent accepts it as input and builds its transformation on top. The computation agent calculates margins and ratios from the normalised data. The report-generation agent assembles a board-level document. By the time a human reads the report, the original wrong number has propagated through three layers of compounding operations; tracing it back requires re-running every agent in the chain.

The defining property of ASI08 is that the severity of the initial fault is a poor predictor of the eventual harm. A small error compounds at each agent boundary. The architectural response is circuit-breakers, not better entry-point filtering: output validation at every agent boundary (does this agent's output conform to an expected schema and value range?), source attribution (can every figure in the final output be traced to its origin?), anomaly detection on inter-agent traffic (does this value deviate significantly from historical distribution?), and the ability to quarantine a single agent and re-run it without taking the whole workflow offline.

Threat catalogue links

Base-catalog T-numbers follow OWASP source material; normalized MAS scenario entries are Helmwart editorial cross-references. Role colour-codes Helmwart's display weight: chips in the hero use the same scheme.

Primary: strongest pivot. Removing this T-number would gut the entry. Contributing: co-equal mechanism that combines with others to produce the ASI risk. Related: touches the entry but isn't its core; useful cross-reference.

T5 Cascading Hallucination Attacks primary

Fabricated outputs propagate via reflection, memory, or multi-agent comms.
Open threat detail →
T8 Repudiation and Untraceability primary

Agent actions cannot be reliably traced, attributed, or reconstructed.
Open threat detail →
T21 Inconsistent Workflow State primary

Sync delays in shared state across agents cause conflicting actions or partial-workflow completion.
Open threat detail →
T23 Selective Log Manipulation related

Attacker selectively deletes log entries for fraudulent actions while leaving the rest intact.
Open threat detail →
T25 Workflow Disruption via Dependency Exploitation primary

Attacker disrupts agent workflow by attacking a dependent system rather than the agent itself.
Open threat detail →
T26 Model Instability Leading to Inconsistent Blockchain Interactions contributing

Non-deterministic LLM outputs produce unpredictable blockchain transactions, causing fairness and consistency drift.
Open threat detail →
T31 Insufficient Isolation Between Agent Actions contributing

Lack of isolation lets one vulnerability cascade across multiple agent actions.
Open threat detail →
T33 Blockchain Reorganisation Attack (Indirect) primary

Blockchain reorg invalidates previously confirmed agent transactions, breaking downstream state assumptions.
Open threat detail →
T35 Manipulation of Proof of Sampling (PoSP) contributing

Attacker falsifies PoSP verification data, undermining cryptographic sampling-based observability.
Open threat detail →
T41 Schema Mismatch Leading to Errors related

MCP request and server response schemas drift, causing parsing errors or wrong-tool invocation.
Open threat detail →
T44 Insufficient Logging in MCP Server / Client contributing

MCP request and tool-invocation logs are incomplete; forensic reconstruction not possible.
Open threat detail →
T46 Data Residency / Compliance Violation via MCP Server related

MCP server processes data in a jurisdiction or context the data is not authorised to traverse.
Open threat detail →

MITRE ATLAS technique

OWASP has not published a 1:1 MITRE ATLAS mapping for this entry. The closest red-team techniques are referenced on the individual threat detail pages linked in the section above.

OWASP LLM Top 10 cross-references

From OWASP Appendix A (canonical inheritance)

LLM01:2025 Prompt Injection LLM04:2025 Data and Model Poisoning LLM06:2025 Excessive Agency

Helmwart mechanistic crossover (named in OWASP body text, not in Appendix A)

LLM09:2025 Misinformation LLM10:2025 Unbounded Consumption

Recommended mitigations

No single control answers an ASI; it is met by a layered stack. The cards below are ranked by how directly each control counters ASI08: the chips on each card name the threat of this ASI it actually covers, colour-coded by that threat's role.

Counters the core

Cover one or more of this ASI's primary threats — the strongest direct response.

Output provenance tracking — record the source of every claim an agent makes Tier 2

T5T8

When an agent produces a claim derived from retrieved data, that claim needs a record of where it came from: the source document, version, and retrieval time. Without that record, a downstream verifier cannot distinguish a well-grounded output from a fabricated one, a tampered one, or a poisoned one. Provenance tracking attaches source attribution to every claim, carries it through each transformation in the pipeline, and surfaces it in audit logs and user-facing interfaces.

Output egress DLP — inspection gate for PII, secrets, and IP at the agent boundary Tier 2

An agent produces output continuously across multiple channels: user-facing responses, tool-call parameter envelopes, log records, and outbound HTTP requests. Any of those channels can carry sensitive content the agent has retrieved, been fed, or been tricked into including. Output egress DLP places an inspection gate at the boundary so that PII, credentials, and proprietary content are classified and either redacted or quarantined before they leave the trust boundary, regardless of how they got into the output.

Separation of actor and recorder — different identities for action and audit Tier 2

T8T35T44

An agent that writes its own audit log can omit, alter, or suppress any record of its own actions. This is not a theoretical risk: an attacker who controls the acting identity controls the evidence. Actor/recorder separation is the structural fix. The identity that performs an action and the identity that records it are different principals, with non-overlapping permissions, so no single compromise can both execute and erase.

Behavioural anomaly isolation — automatic quarantine on observable drift Tier 2

T33T35

An agent that has been compromised, poisoned, or gone rogue will, in most cases, behave differently from its established baseline. Anomaly isolation acts on that difference: when an agent's behaviour score crosses a configured threshold, it is quarantined automatically, credentials revoked, message-queue access cut, in-flight actions aborted. Manual revocation cannot match the speed that cascading multi-agent failures demand.

Blockchain transaction guard — pre-commit safety checks for every agent-initiated transaction Tier 2

T33T26

A blockchain transaction, once committed, cannot be undone. An agent that signs and broadcasts a transaction without an enforcement layer before it can exceed its authorised value, call a contract it was never provisioned to reach, or drain a wallet in a runaway loop, and by then the funds are gone. A transaction guard intercepts each proposed transaction before signing, checks it against value bounds, a contract allowlist, a gas or compute-unit limit, and a replay-protection nonce, and refuses to sign anything that falls outside declared policy.

Data classification with tool-access allow-lists — a sensitivity label on every dataset, enforced at every access seam Tier 2

Every dataset, document, and external system an agent can reach carries a classification label. The agent's permitted-class set and the tool's permitted-class set are intersected at the moment of every read or write. When the requested data's class falls outside that intersection, access is denied at the seam. This is the data-side complement to least-privilege: it adds a data-sensitivity constraint that role scoping alone does not provide.

Fail-closed gate — refuse rather than act on uncertain output Tier 2

An agent that is uncertain about what to do next faces a choice: refuse and ask for clarification, or proceed on its best guess. In low-stakes situations that tradeoff is tolerable. In agentic systems that write, delete, or send, a confident-sounding but wrong output can commit an irreversible action. A fail-closed gate resolves that choice structurally: below a configured confidence threshold, the agent stops and escalates rather than guessing.

Workflow state consistency — distributed-state integrity checks for multi-agent workflows Tier 3

T21T31

When multiple agents read and write shared workflow state concurrently, a network partition, a delayed message, or an adversarially timed race condition can produce divergent views. An agent acting on stale or conflicting state may authorise an action it would reject given correct current state. Hash-chained state snapshots, merge-point conflict detection, and optimistic concurrency control close that window.

Graceful degradation — fail closed where it matters, fail open where it's safe Tier 2

T25

An agent that encounters a quota trip, a dependency failure, or a timeout faces a choice: continue at reduced quality, or refuse. Getting that choice wrong is the core operational failure. Graceful degradation requires the answer to be declared before the incident, not improvised during it: write-authority paths fail closed and return a refusal; read-only paths fail open and disclose the degraded state explicitly.

Insider threat program — personnel security for operators of high-privilege agentic systems Tier 2

Privileged-access personnel are the human layer behind every agentic system. A person with legitimate administrative credentials can tamper with logs, manipulate approval gates, or extract training data through authorised channels, and no technical control prevents it when the access itself is valid. An insider threat program addresses that gap: it governs who holds operator access, what they agree to, how quickly credentials are revoked on departure, and whether anomalous behaviour is surfaced before damage accumulates.

Legal hold and WORM retention — immutable audit storage that survives a compromised recorder Tier 2

An audit trail is only useful if its records cannot be altered after the fact. Without a storage-layer enforcement mechanism, a sufficiently privileged attacker (or a compromised recorder identity) can overwrite or delete the records that document what happened. Legal hold and WORM retention solve this by placing audit records in storage that the provider itself enforces as immutable: no user, including account root, can modify or delete a locked object within the retention window. Legal hold extends that protection indefinitely for active incidents, lifted only through an out-of-band authority outside the normal operations team.

Multi-agent consensus — N-of-M independent agreement before high-impact actions Tier 2

A single agent's judgment on a high-impact action can be wrong, manipulated, or compromised. Requiring N of M independent peer agents to agree before the action executes means an attacker or a systematic error must affect the quorum majority, not just one agent, before harm results.

Multi-source verification — cross-check factual claims against an independent source before commit Tier 2

An agent that writes a false claim to memory, passes it to a downstream agent, or returns it to a user has introduced an error that each subsequent step may treat as established fact. The cascade depends on one condition: the false claim goes unchallenged. Multi-source verification breaks that condition by requiring every novel factual assertion to be corroborated by a structurally independent source before it is committed. If the second source cannot corroborate the claim, the assertion is refused or down-weighted before it enters any downstream step.

Output moderation gates — independent moderation pass before emission Tier 2

An AI agent can produce output that is harmful, deceptive, or factually wrong while still sounding fluent and confident. Output moderation places an independent classifier or moderation model between the agent and its destination, checking every output before it reaches a user or a downstream system. The generating model does not evaluate its own answer; a separate gate does.

Reflection-loop depth limit — a ceiling on how often an agent reworks its own answer Tier 2

An AI agent can review and rewrite its own answer to improve it. If that review runs too long it ties up resources and stops the agent responding in time, and an attacker can deliberately trigger those endless cycles to stall the system. A reflection-loop depth limit prevents that: it sets how many review rounds an agent may run before it has to stop.

Sigstore signing — cryptographic provenance for agent artifacts and audit records Tier 1

An agent is composed of artifacts produced at different times by different identities: model weights, prompt templates, tool descriptors, MCP server binaries, and audit-log batches. Any of those artifacts can be substituted or tampered with between the moment they are built and the moment they are loaded. Sigstore addresses this by signing each artifact at build time using a short-lived certificate tied to the workload identity that produced it, recording the signature in an append-only public transparency log, and requiring verification against that log before the artifact is loaded or executed.

Broader coverage — 5 controls that address contributing or related threats

Cross-system scope auditing — continuous permission reconciliation Tier 2

T44

An agent that operates across HR, Finance, cloud, and SaaS systems accumulates permissions at each boundary, often without any single team seeing the combined picture. Privilege accumulates silently across those boundaries until a quarterly review finds it, by which point a compromised or misconfigured agent has had weeks of unchecked reach. Cross-system scope auditing prevents that by continuously reconciling the agent's actual entitlements against a declared baseline across every system it touches and raising a ticket the moment drift is detected.

gVisor sandbox — a user-space kernel that intercepts every syscall a container makes Tier 1

T31

When an agent executes generated or retrieved code, that code runs as a process with access to the host kernel. A vulnerability in the generated code, or a deliberate exploit injected through the agent's prompt, can reach the kernel and affect other workloads or the host itself. gVisor prevents this by inserting a user-space kernel implementation between the container and the host: the container's syscalls go to the Sentry process, not to the host kernel, so the reachable attack surface from inside the container is structurally smaller.

MCP response sanitisation — validate and normalise tool outputs before they re-enter the LLM context Tier 2

An MCP server response is content the LLM will reason over next. The model cannot distinguish tool output from instruction: that boundary must be enforced at the client, before the payload enters the context window. MCP response sanitisation applies schema validation, Unicode normalisation, control-token stripping, and structural wrapping to every tool result at the response boundary, so adversarial content embedded in a server response cannot redirect the agent's planner.

Out-of-band verification — independent-channel confirmation for irreversible agent actions Tier 2

T26

An agent that can propose payments, update banking details, or modify production configuration is, by construction, a manipulation surface. If the only thing standing between a proposed change and its execution is the agent's own UI, a successful prompt injection or RAG poisoning attack requires no additional steps. Out-of-band verification breaks that dependency by routing a one-use confirmation code through a channel that is structurally separate from the agent's primary interaction channel, so an attacker who controls the agent's context cannot complete the approval without also compromising the user's registered secondary device.

RBAC and ABAC: role-based and attribute-based access control for agents Tier 2

Role-Based Access Control (RBAC) assigns every agent identity a named role that sets the outer limit on what it can reach. Attribute-Based Access Control (ABAC) narrows individual decisions inside that role by evaluating contextual attributes at request time. Used together, they enforce least privilege for non-human identities: the agent can only do what its role permits, and only when the request attributes satisfy the policy.

OWASP Top 10 for Agentic Applications 2026 (canonical source) ↗ · OWASP Gen AI Security Project · Dec 2025 · CC BY-SA 4.0
Agentic Top 10 side-by-side explainer ↗ · trydeepteam.com · secondary reference