T41: Schema Mismatch Leading to Errors

Definition

The MCP schema (which defines the structure of resources, tool parameters, and interactions) is ambiguous or inconsistently implemented between client and server. This leads to misinterpretation of data, incorrect tool invocations, and silent data corruption. The non-determinism arises from protocol ambiguity rather than from the LLM’s inherent stochasticity.

What it looks like in practice

An MCP server defines a date field as a string without specifying the expected format. One MCP client sends dates as YYYY-MM-DD; a second client sends them as MM/DD/YYYY. The server accepts both as strings and passes them to a downstream date-comparison function without format normalisation. Because the server’s locale expects day-first dates, the function reads a value such as 05/06/2026 (intended as 6 May) as 5 June, an off-by-one-month error that affects every query from the second client whose day value is 12 or lower (a value such as 05/14/2026, by contrast, yields an impossible month of 14). The first client’s queries succeed; the second client’s queries silently return incorrect results, and neither the clients nor the server report an error.
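The ambiguity in this scenario can be reproduced in a few lines. The following is a minimal sketch using only the Python standard library; the helper names parse_month_first and parse_day_first and the sample values are illustrative assumptions, not part of any MCP implementation.

```python
from datetime import datetime


def parse_month_first(value: str) -> datetime:
    """Interpretation used by the second client (MM/DD/YYYY)."""
    return datetime.strptime(value, "%m/%d/%Y")


def parse_day_first(value: str) -> datetime:
    """Interpretation assumed for a day-first server locale (DD/MM/YYYY)."""
    return datetime.strptime(value, "%d/%m/%Y")


# A value where both readings are valid calendar dates: no error is raised,
# but the two sides disagree by one month.
ambiguous = "05/06/2026"
intended = parse_month_first(ambiguous)
misread = parse_day_first(ambiguous)
print(intended.date(), misread.date())  # 2026-05-06 2026-06-05

# A value where the day-first reading would need month 14: strptime rejects it.
try:
    parse_day_first("05/14/2026")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

The silent case is the dangerous one: any date whose day is 12 or lower parses cleanly under both conventions, so only values with a day above 12 give a strict parser the chance to raise an error.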

The cross-layer scenario “Schema Ambiguity Cascading Through Multi-Client Server” from the MAS Guide shows how T41 feeds into T42: the incorrect data written to the shared server state by the second client then corrupts the data read by the first client.

Why it’s dangerous in multi-agent context

MCP schemas are consumed by multiple independent client implementations connecting to multiple servers. An ambiguous schema field is interpreted differently by different clients, producing divergent behaviour that is not traceable to any single error. Each component behaves consistently with its own interpretation of the schema. When agents act autonomously on server responses, parsing errors and misinterpretations are not surfaced to a human; they manifest directly as incorrect downstream actions. In a multi-agent deployment where agents feed each other’s outputs as inputs, a schema mismatch in one agent’s interaction with an MCP server propagates through the agent chain.

Detection signals

Schema mismatch is typically silent (no party raises an error), so detection depends on surfacing the divergence at the data layer before it reaches a downstream decision.

A date or numeric field value parsed from a client message that falls outside the plausible range for the field’s stated purpose (e.g., a date field producing a month value greater than 12 after format normalisation): add a semantic range validator on the server before the value is passed to any downstream function.
Two clients submitting semantically identical queries but receiving results that differ by more than a defined variance threshold: periodically issue cross-client consistency probes and compare outputs.
A downstream date-comparison, currency-conversion, or arithmetic function returning results that are inconsistent with the raw field values logged at request time: reconcile processed output against the raw request log to identify where the divergence was introduced.
An increase in schema validation rejection rate after a schema update, concentrated on a specific client or client version: a spike in per-client validation failures in the MCP server’s request log is a direct indicator of a mismatched schema version.
A shared server state field being written with values from two different lexical formats in the same time window (e.g., both 2026-05-14 and 05/14/2026 appearing in the same database column): a format-diversity check on append-only fields surfaces client-side divergence.
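As one illustration of the signals above, the format-diversity check on a shared field could be sketched as follows. This is an assumption-laden example rather than a prescribed detector: the pattern names, the classify and has_mixed_formats helpers, and the sample window are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical lexical classes for a date column; extend per field as needed.
FORMAT_PATTERNS = {
    "iso_ymd": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "slash_nn_nn_yyyy": re.compile(r"\d{2}/\d{2}/\d{4}"),
}


def classify(value: str) -> str:
    """Return the name of the first lexical pattern the whole value matches."""
    for name, pattern in FORMAT_PATTERNS.items():
        if pattern.fullmatch(value):
            return name
    return "unknown"


def format_diversity(values) -> Counter:
    """Count how many values in a time window fall into each lexical class."""
    return Counter(classify(v) for v in values)


def has_mixed_formats(values) -> bool:
    """True when more than one lexical class appears in the same window."""
    return len(format_diversity(values)) > 1


# Values written to the same column in one time window (illustrative data).
window = ["2026-05-14", "2026-05-15", "05/14/2026"]
counts = format_diversity(window)
mixed = has_mixed_formats(window)
print(dict(counts), mixed)
```

The check looks only at lexical shape, not meaning, so it flags the divergence even when every individual value would parse successfully.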

Mitigations

Specify all schema fields with explicit types, formats, and enumeration lists; prohibit under-specified fields (strings without format, optional fields without documented default behaviour) in production schemas.
Enforce schema validation on both request and response payloads at the client and server; reject non-conforming messages rather than attempting to normalise them.
Adopt a schema-first development discipline: generate client and server stubs from a canonical schema definition (for example, using JSON Schema or OpenAPI) to eliminate divergent implementations.
Run schema conformance tests across all client-server combinations before deploying schema changes to production.
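The first two mitigations can be sketched as a single canonical definition plus a validator that rejects rather than normalises. The example below is a minimal sketch using only the standard library; the field names start_date and currency, the SchemaViolation exception, and validate_request are hypothetical. It uses a pattern constraint because JSON Schema validators are not required to enforce the format keyword, and a production system would more likely use an established JSON Schema validator than this hand-rolled subset.

```python
import re
from datetime import date

# Hypothetical canonical definition shared by client and server.
CANONICAL_SCHEMA = {
    "type": "object",
    "required": ["start_date", "currency"],
    "additionalProperties": False,
    "properties": {
        "start_date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
        "currency": {"type": "string", "enum": ["EUR", "GBP", "USD"]},
    },
}


class SchemaViolation(ValueError):
    """Raised when a payload does not conform to the canonical schema."""


def validate_request(payload: dict, schema: dict = CANONICAL_SCHEMA) -> dict:
    """Return the payload unchanged if it conforms; raise otherwise."""
    props = schema["properties"]
    missing = [k for k in schema["required"] if k not in payload]
    if missing:
        raise SchemaViolation(f"missing required fields: {missing}")
    unknown = [k for k in payload if k not in props]
    if unknown:
        raise SchemaViolation(f"unexpected fields: {unknown}")
    for name, value in payload.items():
        rule = props[name]
        if not isinstance(value, str):
            raise SchemaViolation(f"{name}: expected a string")
        if "pattern" in rule and not re.search(rule["pattern"], value):
            raise SchemaViolation(f"{name}: {value!r} does not match the pattern")
        if "enum" in rule and value not in rule["enum"]:
            raise SchemaViolation(f"{name}: {value!r} is not an allowed value")
    # The pattern only checks shape; also confirm it is a real calendar date.
    try:
        date.fromisoformat(payload["start_date"])
    except ValueError as exc:
        raise SchemaViolation(f"start_date: {exc}") from exc
    return payload


accepted = validate_request({"start_date": "2026-05-14", "currency": "EUR"})
```

Note that the validator never attempts to reinterpret 05/14/2026 as an ISO date: a non-conforming message is refused, which turns the silent divergence into an explicit error that the sending client can see.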

Relation to base threat (T1–T17)

T41 extends T5 Cascading Hallucination Attacks. Where T5 addresses the propagation of fabricated LLM outputs through agent-to-agent communication, T41 addresses the propagation of protocol-ambiguity-driven data corruption through the MCP client-server interaction. The source of non-determinism is schema ambiguity rather than model stochasticity. T42 (Cross-Client Interference via Shared Server) is the direct downstream consequence when T41’s corrupt data reaches a shared server state.

OWASP Top 10 for Agentic Applications 2026

The Agentic Top 10 (ASI01 through ASI10) is a separate practitioner-facing publication that maps onto the master Threats & Mitigations threat numbering. T41 is covered by the following Top 10 entries:

ASI08 Cascading Failures (related)

A single low-severity fault (a hallucinated value, a corrupted tool output, a poisoned memory entry) propagates across a network of agents that each build on the last agent's output, compounding into system-wide harm that is disproportionate to the original defect. ASI08 is about propagation and amplification, not the fault's origin; the initial trigger may itself be innocuous.

OWASP LLM Top 10: LLM01:2025, LLM04:2025, LLM06:2025

Source: OWASP Top 10 for Agentic Applications 2026 (Dec 2025) · the Top 10 is a compass into the master Threats & Mitigations taxonomy, not a replacement for it.

Design principles at stake

When T41 is present, these security design principles are the ones being violated or tested. Each links to the full principle; the mitigations below are how you restore them.

Defence-in-Depth: Schema ambiguity produces silent corruption rather than an error: each component behaves correctly by its own interpretation, so there is no single failure to catch. Depth means the schema itself is the first layer: explicit types, enumeration lists, and format constraints specified in a canonical JSON Schema or OpenAPI definition leave no ambiguity for clients to interpret differently. The second layer is enforcement: both client and server validate every payload on receipt and reject non-conforming messages rather than attempting to normalise them. Only when both layers are present does a mismatch surface as a hard error before incorrect data reaches downstream agents.
Constrained Generation & Deterministic Guardrails: The vulnerability here is not in what the model generates but in what the protocol accepts as valid: an under-specified string field silently accommodates two incompatible date formats and passes both onward without complaint. Constrained generation's answer is to treat the schema as the deterministic gate: typed fields with explicit format constraints and enumeration lists mean there is exactly one valid form for any value, and a schema-first development workflow that generates client and server stubs from a canonical definition ensures both sides are bound to the same gate. A non-conforming payload is rejected at the boundary rather than silently reinterpreted downstream.

Recommended mitigations

Auto-generated from the mitigation catalog: every mitigation whose coverage map includes T41, sorted by maturity tier (Tier 1 production-canonical first, then Tier 2, then Tier 3 research-stage).

Tier 2 Egress DLP (Output egress DLP: inspection gate for PII, secrets, and IP at the agent boundary)

An agent produces output continuously across multiple channels: user-facing responses, tool-call parameter envelopes, log records, and outbound HTTP requests. Any of those channels can carry sensitive content the agent has retrieved, been fed, or been tricked into including. Output egress DLP places an inspection gate at the boundary so that PII, credentials, and proprietary content are classified and either redacted or quarantined before they leave the trust boundary, regardless of how they got into the output.

Why it helps: Anomalously-shaped outbound payloads can carry malformed or out-of-schema data into downstream systems. DLP inspection at the egress seam validates the structural shape of outbound payloads and flags outputs that deviate from expected schema before they reach the target system.

Tier 2 Fail-closed (Fail-closed gate: refuse rather than act on uncertain output)

An agent that is uncertain about what to do next faces a choice: refuse and ask for clarification, or proceed on its best guess. In low-stakes situations that tradeoff is tolerable. In agentic systems that write, delete, or send, a confident-sounding but wrong output can commit an irreversible action. A fail-closed gate resolves that choice structurally: below a configured confidence threshold, the agent stops and escalates rather than guessing.

Why it helps: Schema mismatch between an MCP server and an agent produces action proposals the agent cannot validate. A fail-closed gate requires the agent to refuse any response it cannot parse unambiguously, rather than proceeding with an uncertain interpretation of a malformed payload.

Tier 2 MCP sanitisation (MCP response sanitisation: validate and normalise tool outputs before they re-enter the LLM context)

An MCP server response is content the LLM will reason over next. The model cannot distinguish tool output from instruction: that boundary must be enforced at the client, before the payload enters the context window. MCP response sanitisation applies schema validation, Unicode normalisation, control-token stripping, and structural wrapping to every tool result at the response boundary, so adversarial content embedded in a server response cannot redirect the agent's planner.

Why it helps: Schema Mismatch Leading to Errors arises when a server returns a response whose types diverge from the declared tool contract. Enforcing the MCP response shape and any declared outputSchema at the sanitisation boundary catches ambiguously-typed payloads before they reach the planner, preventing downstream actions driven by parsing divergence.
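A client-side gate that combines the fail-closed and response-sanitisation ideas might look like the sketch below. It assumes the tool declares an outputSchema and that the result carries its typed payload in a structuredContent field with an optional isError flag; those field names should be checked against the MCP specification version in use. The conforms and gate_tool_result helpers are hypothetical and cover only a small subset of JSON Schema (type, enum, required, properties).

```python
# Mapping from JSON Schema type names to Python types (subset only).
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(value, schema: dict) -> bool:
    """Check a value against a small subset of JSON Schema keywords."""
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) and expected in ("integer", "number"):
            return False
        if not isinstance(value, TYPE_MAP[expected]):
            return False
    if "enum" in schema and value not in schema["enum"]:
        return False
    if expected == "object":
        if any(key not in value for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        return all(conforms(value[k], sub) for k, sub in props.items() if k in value)
    return True


def gate_tool_result(result: dict, output_schema: dict):
    """Return the structured payload, or None (fail closed) if it cannot be trusted."""
    structured = result.get("structuredContent")
    if result.get("isError") or structured is None:
        return None
    if not conforms(structured, output_schema):
        return None
    return structured


# Hypothetical declared contract for an invoice-lookup tool.
OUTPUT_SCHEMA = {
    "type": "object",
    "required": ["due_date", "amount"],
    "properties": {
        "due_date": {"type": "string"},
        "amount": {"type": "number"},
    },
}

good = gate_tool_result(
    {"structuredContent": {"due_date": "2026-05-14", "amount": 12.5}}, OUTPUT_SCHEMA
)
bad = gate_tool_result(
    {"structuredContent": {"due_date": "2026-05-14", "amount": "12.50"}}, OUTPUT_SCHEMA
)
```

Returning None instead of a best-effort coercion is the design choice that matters here: the caller has to handle the refusal explicitly, so a mistyped amount never reaches the planner as if it were valid.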

Red-team pivot: MITRE ATLAS techniques

MITRE ATLAS catalogues adversary techniques against AI systems. Where this OWASP threat has an attacker-perspective counterpart, the ATLAS technique is shown below. That is what a red team would actually be doing on the wire. Use this for detection-signal anchoring, threat-hunting hypotheses, and IR runbooks. Source: mitre-atlas/atlas-data v5.6.0.

AML.T0067 LLM Trusted Output Components Manipulation

Adversary manipulates the structured parts of an LLM response (citations, tool-call arguments, approved-action markup) that downstream systems treat as trusted.

Agentic angle: Structured outputs are exactly what agent frameworks parse to decide what to execute. Undermining the structure undermines every safety check downstream.

AML.T0031 Erode AI Model Integrity

Adversary degrades model output quality over time so users lose confidence or downstream consumers act on incorrect predictions.

References

OWASP MAS Threat Modelling Guide v1.0 (April 2025) §5 Anthropic MCP — Layer 3 Agent Frameworks.
