05 · CASE STUDIES

Case studies: three worked threat models

The OWASP MAS Threat Modelling Guide walks three real-world agentic systems end-to-end through MAESTRO. Helmwart surfaces them here with their per-layer mapping, 34 extended threats beyond T1–T17, and cross-layer scenarios. Each study links to a matching canvas template you can open and edit.

A threat catalog tells you what can go wrong in theory; a case study shows you how it goes wrong, and what is actually at risk, in a concrete system. Each of the three studies here follows the same structure: a layer-by-layer MAESTRO mapping of the real architecture, the baseline OWASP threats (T1–T17) that apply directly, and the extended multi-agent threats (T18–T49) that emerge from how the components interact. Reading a case study in full takes around twenty minutes; it is the fastest way to develop intuition for how threats cluster rather than appear in isolation.

How to use them: treat each study as a calibration exercise. Before opening the detail page, sketch the system yourself. What data does each component touch, which agents can call which tools, where are the trust boundaries? Then compare your analysis with Helmwart's layer mapping. The canvas template linked on each card lets you open the pre-populated threat model and edit it against your own variant of the architecture.
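The sketching step can also be done in a few lines of code before you open a canvas. The snippet below is a minimal illustration, not part of Helmwart or the OWASP guide: the component names, trust zones, and data flows are all hypothetical. It lists every flow whose source and target sit in different trust zones, which is where a first pass at trust boundaries usually starts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    trust_zone: str  # hypothetical labels, e.g. "external", "internal", "finance"


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    data: str


def boundary_crossings(components, flows):
    """Return the flows whose endpoints live in different trust zones."""
    zone = {c.name: c.trust_zone for c in components}
    return [f for f in flows if zone[f.source] != zone[f.target]]


# Hypothetical sketch of an expense-processing system.
components = [
    Component("employee_inbox", "external"),
    Component("expense_agent", "internal"),
    Component("payments_api", "finance"),
    Component("audit_log", "internal"),
]
flows = [
    Flow("employee_inbox", "expense_agent", "receipt attachments"),
    Flow("expense_agent", "payments_api", "payment instruction"),
    Flow("expense_agent", "audit_log", "decision record"),
]

crossings = boundary_crossings(components, flows)
for f in crossings:
    print(f"{f.source} -> {f.target}: {f.data}")
```

Each flow it prints is a candidate trust boundary to compare against the layer mapping in the study.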

§3 · 10 extended threats

RPA Expense Reimbursement

Robotic-process-automation agent that extracts, validates, and routes employee expense claims.

A single-agent Robotic Process Automation (RPA) system that automates the full employee expense-reimbursement lifecycle: an LLM reads submitted receipts and forms, decides whether each claim satisfies company policy, and either routes it for payment or flags it for a human reviewer. "RPA" here means software that mimics what a back-office clerk would do (open emails, read attachments, fill in fields, call financial APIs) but driven by an LLM rather than hard-coded scripts. That shift from deterministic scripts to probabilistic reasoning is what makes the threat landscape fundamentally different. The agent holds live service-account credentials to financial systems, writes to an audit log, and can send emails, giving it a wide blast radius if it is manipulated or behaves unexpectedly.
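The route-or-flag decision described above can be sketched as follows. This is an illustration under stated assumptions, not the design from the OWASP guide: the `Claim` fields, the policy limit, and the category list are hypothetical. The point it makes is that the LLM's verdict can be treated as untrusted input, with deterministic checks deciding whether a payment is ever issued.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Claim:
    claim_id: str
    amount: Decimal
    category: str


# Hypothetical policy values, chosen only for illustration.
AUTO_APPROVE_LIMIT = Decimal("250.00")
ALLOWED_CATEGORIES = {"travel", "meals", "lodging"}


def route_claim(claim: Claim, llm_verdict: str) -> str:
    """Return "pay" or "human_review".

    llm_verdict is the model's opinion ("approve" or anything else).
    It can only narrow the outcome: the deterministic checks below
    still apply even when the model says "approve".
    """
    if llm_verdict != "approve":
        return "human_review"
    if claim.category not in ALLOWED_CATEGORIES:
        return "human_review"
    if claim.amount <= 0 or claim.amount > AUTO_APPROVE_LIMIT:
        return "human_review"
    return "pay"


small = Claim("c-001", Decimal("42.50"), "meals")
large = Claim("c-002", Decimal("9000.00"), "meals")

print(route_claim(small, "approve"))  # pay
print(route_claim(large, "approve"))  # human_review
```

With a gate like this, a manipulated model can at worst approve a small claim in an allowed category; anything larger still reaches a human reviewer.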

Baseline threat numbers: T1, T2, T3, T6, T7, T8, T10 (+2 more)

Open case study → Open on canvas →

§4 · 13 extended threats

ElizaOS: Web3 agent operating system

TypeScript-based agent OS for autonomous AI agents on Solana and other blockchains.

ElizaOS is an open-source TypeScript framework for building and running autonomous AI agents: think of it as a runtime that lets you give an LLM a persistent identity, memory, social-media accounts, and a live cryptocurrency wallet, then leave it to operate on its own. Agents are defined by "character files" (JSON configurations specifying personality, knowledge, and platform connections) and can post to Twitter/X, reply in Discord, answer Telegram messages, and execute Solana transactions, all without human approval for each action. The blockchain integration is the sharpest edge: each ElizaOS agent can hold real assets, sign transactions, and interact with DeFi smart contracts autonomously. Proof of Sampling (PoSP) is the framework's mechanism for cryptographically attesting that a given LLM inference actually ran, which is important for trust in a decentralised ecosystem where agents may never meet their operators. The attack surface is correspondingly wide: a compromised agent is not just a chatbot but a funded, autonomous actor with live accounts on multiple platforms.
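To make the "character file" idea concrete, here is a small sketch that reads a JSON configuration and lists the external surfaces it switches on. The field names and values (`clients`, `plugins`, `"solana"`) are simplified assumptions for illustration; the real ElizaOS schema may name or structure them differently, so check the project's documentation before relying on them.

```python
import json

# Hypothetical, simplified character file; not the exact ElizaOS schema.
CHARACTER_JSON = """
{
  "name": "example-agent",
  "bio": ["Illustrative persona text."],
  "clients": ["twitter", "discord", "telegram"],
  "plugins": ["solana"]
}
"""


def attack_surface(character: dict) -> list[str]:
    """List the external surfaces a character configuration enables."""
    surface = [f"social:{client}" for client in character.get("clients", [])]
    if "solana" in character.get("plugins", []):
        surface.append("wallet:solana")
    return surface


character = json.loads(CHARACTER_JSON)
print(attack_surface(character))
# ['social:twitter', 'social:discord', 'social:telegram', 'wallet:solana']
```

Even in this toy form, one short configuration file turns on three social channels and a funded wallet, which is why the character file is a natural starting point for the threat model.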

Baseline threat numbers: T1, T2, T5, T11, T13

Open case study → Open on canvas →

§5 · 11 extended threats

Anthropic Model Context Protocol (MCP)

Open client/server protocol connecting AI applications to data sources and tools.

The Model Context Protocol (MCP) is an open standard, originally developed by Anthropic, for connecting AI models to external data sources and executable tools in a uniform way. Before MCP, every AI application wired up integrations (web search, file access, database queries, code execution) through bespoke, incompatible connectors. MCP defines a single JSON-RPC 2.0 message format so that any MCP-aware model host (Claude Desktop, GitHub Copilot in VS Code, or a custom agent runtime) can speak to any MCP server without custom glue code. The architecture has three roles: the Host (the application the user interacts with, e.g. Claude Desktop), the Client (the in-process component that manages connections and routes calls), and the Server (a lightweight process, run locally over stdio or remotely over HTTP, that exposes three primitive types: Tools for callable functions, Resources for data the model can read, and Prompts for reusable instruction templates). The security significance: MCP servers frequently hold access to filesystems, databases, internal APIs, and cloud services, and the model decides autonomously when to call them. A malicious or misconfigured server is therefore not just a data-leak risk; it is a direct execution path into whatever systems the server is authorised to touch.
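For a sense of what travels between Client and Server, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message MCP uses to invoke a tool. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the MCP specification; the tool names and the host-side allowlist are hypothetical and stand in for whatever policy a real host applies before a model-chosen call leaves the process.

```python
import json

# Hypothetical host-side policy: tools the model may invoke without review.
ALLOWED_TOOLS = {"search_docs"}


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def dispatch(request: dict) -> str:
    """Serialise the request for sending, refusing tools off the allowlist."""
    name = request["params"]["name"]
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not allowlisted")
    return json.dumps(request)


wire = dispatch(build_tool_call(1, "search_docs", {"query": "expense policy"}))
print(wire)

try:
    dispatch(build_tool_call(2, "delete_files", {"path": "/tmp/example"}))
except PermissionError as err:
    print(f"blocked: {err}")
```

The allowlist is deliberately crude. What matters for the threat model is that every call the model decides to make passes through a point like `dispatch`, where the host can still say no.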

Baseline threat numbers: T2, T11, T12, T13, T17

Open case study → Open on canvas →

Source: OWASP MAS Threat Modelling Guide v1.0 (Apr 2025), §3 RPA · §4 ElizaOS · §5 Anthropic MCP. Layer mappings, threat names, and example scenarios are taken from the canonical document; the Helmwart addition is the per-extended-threat link back to the closest base threat number and the matching canvas template.