Definition
In an agentic system, code generation and code execution happen in the same turn: the model emits an instruction and a tool runs it, with no human review step between. Attackers exploit this by injecting execution payloads into the agent's inputs; the realistic defence is at the runtime boundary (sandboxing, capability restriction, egress control), not at the generation step.
What it means in practice
A classic web app generates code that a human reviews before it runs. An agent generates code and runs it in the same turn. The model emits an instruction (a shell command, a SQL statement, a Python snippet) and a tool executes it. The only thing between generation and impact is whatever sandbox the executing tool runs in.
Filtering generation is hard. LLMs are flexible enough that input sanitisation cannot reliably catch every malicious shape. The realistic containment is at execution: capability-restricted sandboxes, ephemeral filesystems, blocked outbound network, per-tool execution budgets. Assume the agent will eventually generate something bad and design the runtime so the blast radius is bounded.
Threat catalogue links
Base-catalog T-numbers follow OWASP source material; normalized MAS scenario entries are Helmwart editorial cross-references. Role colour-codes Helmwart's display weight: chips in the hero use the same scheme.
-
T11 Unexpected RCE and Code Attacks primary Code-execution paths in agents accept attacker-influenced input and run as arbitrary code.
Open threat detail → -
T20 Framework Vulnerability Leading to Code Injection primary Bug in the agent framework enables code injection into the agent execution context.
Open threat detail →
MITRE ATLAS technique
MITRE ATLAS catalogues adversary techniques against AI systems. The technique(s) below represent the red-team pivot for this entry: what an attacker is actually doing on the wire. Source: mitre-atlas/atlas-data v5.6.0.
AML.T0102 Generate Malicious Commands view on ATLAS ↗ Adversary uses an LLM to dynamically generate malicious commands from natural language, producing attack signatures that vary across executions.
Agentic angle: Agents with code-execution tools can be prompted to generate and immediately run adversary-crafted commands, collapsing generation and execution into one step.
OWASP LLM Top 10 cross-references
From OWASP Appendix A (canonical inheritance)
Recommended mitigations
No single control answers an ASI; it is met by a layered stack. The cards below are ranked by how directly each control counters ASI05: the chips on each card name the threat of this ASI it actually covers, colour-coded by that threat's role.
Counters the core
Cover one or more of this ASI's primary threats — the strongest direct response.
An AI coding agent produces code that can be executed or merged to a production branch without a human ever reading it. If the agent has been manipulated, its generated code can contain hidden payloads, backdoors, or privilege-escalating logic. A code-generation review gate prevents that: every change attributable to an AI agent must pass automated static analysis and receive explicit human approval before it can merge or execute, and the agent identity that authored the change is structurally barred from also approving it.
When an agent executes generated or retrieved code, that code runs as a process with access to the host kernel. A vulnerability in the generated code, or a deliberate exploit injected through the agent's prompt, can reach the kernel and affect other workloads or the host itself. gVisor prevents this by inserting a user-space kernel implementation between the container and the host: the container's syscalls go to the Sentry process, not to the host kernel, so the reachable attack surface from inside the container is structurally smaller.
An AI agent operating with broad authority can propose actions that are irreversible: deleting records, modifying IAM policies, moving funds. A single human reviewer at the approval gate is a single point of failure, one compromised account, one fatigued reviewer, or one successful social-engineering attempt is enough to commit the action. Human dual-control addresses that by requiring two distinct, independent humans to approve before the action commits.
Agentic systems can act faster than a human can intervene through normal channels. A kill switch is the operational guarantee that a named human role can stop agent activity at any scope (single instance, class, or global) through a documented runbook, without requiring a code change or redeployment, and with every invocation written to an audit trail.
An agent produces code, configuration files, tool-call payloads, and log records continuously and at a rate no human reviewer can match. Any of those artefacts may contain a live API key, service token, or private certificate, placed there accidentally through model context, or deliberately through prompt injection or context poisoning. Secret scanning places an inspection gate at every agent output seam: regex patterns match known token formats, entropy analysis detects arbitrary high-entropy strings, and validator calls confirm which candidates are live credentials. The CI-secret-scanning pattern is mature; the agentic specialisation is seam placement, moving the scanner from the repository gate to the agent egress point, where artefacts can be intercepted before they reach any downstream system.
An agent that can generate and execute code treats code generation as a tool call and code execution as the outcome. If the generated code contains a known-dangerous pattern, no amount of prompt engineering stops it from running once the execute call goes through. Static analysis closes that gap: it scans every code artifact the agent emits against a rule set before execution is permitted, catching the vulnerability patterns the same tooling already catches in human-written code.
Sources
- OWASP Top 10 for Agentic Applications 2026 (canonical source) ↗ · OWASP Gen AI Security Project · Dec 2025 · CC BY-SA 4.0
- Agentic Top 10 side-by-side explainer ↗ · trydeepteam.com · secondary reference