PLAYBOOK · P5 · OWASP Agentic AI v1.1

Protecting HITL & Preventing Decision Fatigue Exploits

Keep human oversight effective when agent fan-out threatens to swamp it.

Goal: Prevent attackers from overloading human decision-makers, manipulating AI intent, or bypassing security through deceptive AI behaviours.

Aligned with Step 5: Does AI require human engagement to achieve its goals or function effectively? · 2 threats mitigated · 21 mitigations referenced

At a glance

THREATS COVERED

T10 · T15

NAVIGATOR STEP

Step 5: Does AI require human engagement to achieve its goals or function effectively?

MITIGATIONS

21 distinct Helmwart controls referenced across the three phases

Defence-in-depth chain

When an attacker tries to overwhelm reviewers or exploit decision fatigue, Proactive controls (a risk-prioritised review queue and adaptive workload balancing) preserve reviewer attention by routing low-risk decisions away from humans and throttling notification volume. If a manipulative or high-volume request still reaches a reviewer, Reactive controls (reviewer decision summaries and plan-vs-goal validation) make the decision legible and check the plan against the agent's declared goal before approval. Detective controls (Sigstore signing) produce cryptographic audit trails of reviewer overrides for post-incident review.

PROACTIVE · Step 1: Optimize HITL workflows & reduce decision fatigue

Score each pending agent action by risk and use that score to rank the HITL review queue so reviewers tackle the highest-impact decisions first.

Helmwart controls: Risk queue · Trust score
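As an illustration of risk-ranked review, the sketch below keeps pending actions in a max-priority queue. It is a minimal example under stated assumptions: the `RiskQueue` class, its method names, and the `impact * likelihood` score are hypothetical and are not taken from the OWASP document or from any Helmwart control.

```python
import heapq
import itertools
from typing import List, Optional, Tuple


class RiskQueue:
    """Hypothetical HITL queue that hands out the riskiest action first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        # Monotonic counter breaks ties so equal-risk items stay first-in, first-out.
        self._seq = itertools.count()

    def submit(self, action: str, impact: float, likelihood: float) -> float:
        # Assumed scoring rule: risk = impact x likelihood.
        risk = impact * likelihood
        # heapq is a min-heap, so store the negated risk to pop the largest first.
        heapq.heappush(self._heap, (-risk, next(self._seq), action))
        return risk

    def next_for_review(self) -> Optional[str]:
        if not self._heap:
            return None
        _, _, action = heapq.heappop(self._heap)
        return action


queue = RiskQueue()
queue.submit("rotate log file", impact=1, likelihood=0.2)
queue.submit("wire transfer", impact=9, likelihood=0.8)
queue.submit("update staging config", impact=4, likelihood=0.5)
first = queue.next_for_review()
```

Any scoring function can replace the product used here; the point is only that reviewers pull from the top of the ranking rather than in arrival order.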
Automate routine low-risk approvals and escalate only high-impact decisions to a human reviewer.

Helmwart controls: Risk queue · Policy bound
Cap the volume of AI-generated notifications per reviewer per hour to prevent cognitive overload.

Helmwart controls: Adaptive load
Enforce rate limits on agent-generated notifications, approval requests, and alerts so that decision fatigue cannot be used as an attack vector.

Helmwart controls: Adaptive load · Rate limits and quotas
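The per-reviewer notification cap and the rate limit on agent-generated requests can both be sketched as a sliding-window limiter. This is an illustrative example only: `NotificationLimiter`, its parameters, and the one-hour default window are assumptions, and timestamps are passed in explicitly so the behaviour is easy to check.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class NotificationLimiter:
    """Hypothetical sliding-window cap on notifications per reviewer."""

    def __init__(self, max_per_window: int, window_seconds: float = 3600.0) -> None:
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._sent: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, reviewer: str, now: float) -> bool:
        sent = self._sent[reviewer]
        # Forget deliveries that have aged out of the window.
        while sent and now - sent[0] >= self.window_seconds:
            sent.popleft()
        if len(sent) >= self.max_per_window:
            return False  # Over the cap: the caller should hold or batch this item.
        sent.append(now)
        return True


limiter = NotificationLimiter(max_per_window=2)
results = [limiter.allow("alice", t) for t in (0, 10, 20)]
```

A blocked notification is not dropped by this sketch; what happens to it (batching, deferral, or rerouting to another reviewer) is a policy decision left to the caller.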
Require two-agent consensus before any agent is permitted to modify its own operational goals.

Helmwart controls: Peer consensus · Goal consistency
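One possible reading of the consensus requirement is that two peer agents, other than the one requesting the change, must endorse exactly the proposed goal. The sketch below encodes that reading; the function name, the vote format, and the exclusion of the requester's own vote are all assumptions rather than anything the source specifies.

```python
from typing import Dict


def goal_change_allowed(
    requester: str,
    proposed_goal: str,
    peer_votes: Dict[str, str],
    quorum: int = 2,
) -> bool:
    """Return True only if enough distinct peers endorse the proposed goal.

    peer_votes maps an agent id to the goal text that agent endorsed.
    The requester's own vote is ignored (assumption).
    """
    endorsers = {
        peer
        for peer, endorsed_goal in peer_votes.items()
        if peer != requester and endorsed_goal == proposed_goal
    }
    return len(endorsers) >= quorum


self_vote_only = goal_change_allowed(
    "agent-a",
    "maximise refunds",
    {"agent-a": "maximise refunds", "agent-b": "maximise refunds"},
)
two_peers = goal_change_allowed(
    "agent-a",
    "maximise refunds",
    {"agent-b": "maximise refunds", "agent-c": "maximise refunds"},
)
```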
Surface a plain-language explanation of each pending decision alongside the agent's recommendation so reviewers can act without re-reading raw logs.

Helmwart controls: Decision summaries
Distribute the review queue across available human reviewers so no single operator is presented with an unmanageable volume.

Helmwart controls: Adaptive load
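Spreading the queue across reviewers can be approximated with least-loaded assignment. The following is a minimal sketch that assumes every review item costs the same effort; `assign_reviews` and its inputs are hypothetical names for illustration.

```python
import heapq
from typing import Dict, List, Sequence


def assign_reviews(items: Sequence[str], reviewers: Sequence[str]) -> Dict[str, List[str]]:
    """Give each item to whichever reviewer currently holds the fewest items."""
    # Heap entries are (current_load, reviewer_name); the smallest load pops first.
    heap = [(0, name) for name in reviewers]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {name: [] for name in reviewers}
    for item in items:
        load, name = heapq.heappop(heap)
        assignment[name].append(item)
        heapq.heappush(heap, (load + 1, name))
    return assignment


plan = assign_reviews(["a", "b", "c", "d", "e"], ["alice", "bob"])
```

If items differ widely in effort, the load increment could be weighted by an estimated review time instead of a flat `1`.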
Strip or sandbox all clickable links and rich content in agent output so social-engineering payloads cannot complete a click-through to the attacker.

Helmwart controls: Render restriction
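To make the link-stripping idea concrete, here is a rough sketch that flattens Markdown links, removes HTML tags, and masks bare URLs before agent output is shown to a reviewer. The function name and the three patterns are assumptions, and regex-based stripping is deliberately simplistic: a production renderer would more likely use a vetted sanitiser or an allowlist-based renderer.

```python
import re

# Illustrative patterns only; they are not exhaustive.
_MD_LINK = re.compile(r"\[([^\]]*)\]\([^)]*\)")      # [text](target)
_HTML_TAG = re.compile(r"<[^>]+>")                    # any HTML tag
_BARE_URL = re.compile(r"\b(?:https?://|www\.)\S+", re.IGNORECASE)


def render_restricted(text: str) -> str:
    """Return agent output with clickable content removed."""
    text = _MD_LINK.sub(r"\1", text)    # keep the visible label, drop the target
    text = _HTML_TAG.sub("", text)      # drop tags, keep inner text
    text = _BARE_URL.sub("[link removed]", text)
    return text


cleaned = render_restricted("Approve now: [click here](https://evil.example/x)")
```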
Provide a documented, tested kill-switch covering single-agent, fleet, and global scopes with a named authority and drill cadence.

Helmwart controls: Kill switch
For irreversible high-stakes changes such as payments or production configuration writes, require confirmation through a channel independent of the agent.

Helmwart controls: OOB verify
For irreversible high-impact actions, require two distinct human reviewers to independently approve and sign off before the agent proceeds.

Helmwart controls: Dual control
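The dual-control rule reduces to counting distinct human sign-offs. The sketch below is one hedged way to express it; the function name is hypothetical, and excluding the initiator of the action from the count is an added assumption, not a requirement stated in the source.

```python
from typing import Iterable


def dual_control_satisfied(signoffs: Iterable[str], initiator: str, required: int = 2) -> bool:
    """True when enough distinct reviewers, other than the initiator, have signed off."""
    # A set collapses repeat approvals from the same reviewer into one.
    distinct_reviewers = {reviewer for reviewer in signoffs if reviewer != initiator}
    return len(distinct_reviewers) >= required


repeat_approver = dual_control_satisfied(["alice", "alice"], initiator="agent-7")
two_reviewers = dual_control_satisfied(["alice", "bob"], initiator="agent-7")
```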

REACTIVE · Step 2: Identify AI-induced human manipulation

Validate each pending agent plan against its declared goal before approving execution to detect and block unintended behavioural shifts.

Helmwart controls: Goal consistency · Plan check
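A very simple form of plan-vs-goal validation is an allowlist of tools per declared goal. The sketch below assumes a plan is a list of tool names; the goal names, tool names, and the `GOAL_ALLOWED_TOOLS` table are invented for illustration, and a real check would likely also inspect tool parameters.

```python
from typing import Dict, List, Set

# Hypothetical mapping from a declared goal to the tools it may use.
GOAL_ALLOWED_TOOLS: Dict[str, Set[str]] = {
    "summarise_tickets": {"read_ticket", "write_summary"},
    "refund_customer": {"read_order", "issue_refund"},
}


def plan_violations(goal: str, plan: List[str]) -> List[str]:
    """Return the plan steps that fall outside the declared goal's allowlist."""
    # An unknown goal gets an empty allowlist, so every step is flagged.
    allowed = GOAL_ALLOWED_TOOLS.get(goal, set())
    return [step for step in plan if step not in allowed]


violations = plan_violations(
    "summarise_tickets", ["read_ticket", "issue_refund", "write_summary"]
)
```

A non-empty result would block execution and surface the offending steps to the reviewer.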
Monitor how frequently each agent requests goal changes and alert when the rate suggests active manipulation rather than legitimate adaptation.

Helmwart controls: Divergence monitor
Scan agent outputs and tool-call parameters for PII, secrets, and sensitive IP before egress so phishing payloads cannot reach the end user.

Helmwart controls: Egress DLP
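An egress scan can be sketched as a set of named patterns run over the outgoing payload. The three patterns below (email address, AWS-style access key ID, PEM private-key header) are illustrative assumptions and far from complete; real DLP tooling uses much broader detectors.

```python
import re
from typing import Dict, List, Pattern

# Illustrative detectors only; names and patterns are assumptions.
PATTERNS: Dict[str, Pattern[str]] = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_egress(payload: str) -> List[str]:
    """Return the sorted names of every detector that matches the payload."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(payload))


findings = scan_egress("contact bob@example.com key AKIAIOSFODNN7EXAMPLE")
```

A non-empty result would hold the output or tool call for review rather than letting it leave the system.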

DETECTIVE · Step 3: Strengthen AI decision traceability & logging

Write every HITL decision and agent recommendation to a cryptographically signed, append-only log to prevent post-hoc tampering.

Helmwart controls: Sigstore · Split actor
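The tamper-evidence property can be illustrated with a hash-chained log in which each entry's signature covers the previous entry's signature. Note that this sketch does not use Sigstore: it substitutes an HMAC from the Python standard library as a stand-in for a real signing service, and the class and field names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, List


class SignedDecisionLog:
    """Hypothetical append-only log; each signature chains to the previous entry."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._entries: List[Dict[str, str]] = []

    def _sign(self, prev_sig: str, body: str) -> str:
        message = (prev_sig + body).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def append(self, record: dict) -> str:
        prev_sig = self._entries[-1]["sig"] if self._entries else ""
        body = json.dumps(record, sort_keys=True)  # stable serialisation
        sig = self._sign(prev_sig, body)
        self._entries.append({"body": body, "prev": prev_sig, "sig": sig})
        return sig

    def verify(self) -> bool:
        prev_sig = ""
        for entry in self._entries:
            expected = self._sign(prev_sig, entry["body"])
            if entry["prev"] != prev_sig or not hmac.compare_digest(expected, entry["sig"]):
                return False  # Entry was altered, reordered, or removed upstream.
            prev_sig = entry["sig"]
        return True


log = SignedDecisionLog(key=b"demo-key-not-for-production")
log.append({"reviewer": "alice", "action": "wire transfer", "verdict": "rejected"})
log.append({"reviewer": "bob", "action": "wire transfer", "verdict": "approved"})
intact = log.verify()
```

Because the key is shared, an HMAC chain only shows that entries were not changed by someone without the key; a Sigstore-style deployment would instead use identity-bound signatures and a transparency log.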
Run real-time anomaly detection across the agent decision stream and escalate sessions that deviate from expected patterns.

Helmwart controls: Anomaly isolation
Log every human override of an agent recommendation and surface reviewer-pattern analytics to detect bias or systematic misalignment.

Helmwart controls: Cross-system audit · Insider program · HITL calibration loop
Flag decision reversals in high-risk workflows where a previously rejected AI output was later approved under suspicious conditions.

Helmwart controls: Cross-system audit
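The source does not define "suspicious conditions", so the sketch below picks two heuristics purely for illustration: an approval that follows a rejection of the same output within a short interval, or an approval by a different reviewer than the one who rejected it. The record fields (`output_id`, `reviewer`, `verdict`, `ts`) and the 15-minute threshold are assumptions.

```python
from typing import Dict, List, Tuple


def flag_reversals(
    decisions: List[dict], max_gap_seconds: float = 900.0
) -> List[Tuple[str, List[str]]]:
    """Return (output_id, reasons) for rejected outputs that were later approved."""
    last_rejection: Dict[str, dict] = {}
    flagged: List[Tuple[str, List[str]]] = []
    # Process decisions in time order so "later approved" is well defined.
    for decision in sorted(decisions, key=lambda d: d["ts"]):
        output_id = decision["output_id"]
        if decision["verdict"] == "rejected":
            last_rejection[output_id] = decision
        elif decision["verdict"] == "approved" and output_id in last_rejection:
            prior = last_rejection[output_id]
            reasons = []
            if decision["ts"] - prior["ts"] <= max_gap_seconds:
                reasons.append("rapid re-approval")
            if decision["reviewer"] != prior["reviewer"]:
                reasons.append("different reviewer")
            if reasons:
                flagged.append((output_id, reasons))
    return flagged


history = [
    {"output_id": "out-1", "reviewer": "alice", "verdict": "rejected", "ts": 0},
    {"output_id": "out-1", "reviewer": "bob", "verdict": "approved", "ts": 120},
    {"output_id": "out-2", "reviewer": "alice", "verdict": "approved", "ts": 200},
]
flags = flag_reversals(history)
```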

Source

OWASP Agentic AI: Threats and Mitigations v1.1 (Dec 2025), §Mitigation Strategies. Action text is taken verbatim or paraphrased from the canonical document; the Helmwart additions are the per-action mappings onto deployable mitigation entries.