06 · HITL PROGRAM

Human oversight and escalation where review is required, and where it is not

A human-oversight program can include bounded autonomous actions, sampled audit, single-review approval, and dual-control approval. Human-in-the-loop (HITL) applies only when a human must approve an action before it commits. This page shows how to choose that route and which controls implement it.

When must a human approve?

Route by consequence and reversibility, not by confidence alone. Confidence, moderation, and novelty signals may influence the route; they do not prove an action is safe.

01
No HITL

Autonomous action with sampled oversight

Bounded and reversible. Draft a reply, classify a ticket, or issue a refund below an approved low-value limit.

02
1 approver

Single-review HITL

Material but reversible. Send a customer response, change workflow state, or approve a moderate-value refund.

03
2 approvers

Dual-control HITL

High-impact or irreversible. Release funds, delete records, grant privileges, or transfer regulated data.

04
Stop

Quarantine or refuse

Unsafe or unevaluable. Flagged output, missing evidence, policy lookup failure, or a saturated review queue. No action commits.
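
The four routes above can be sketched as a small routing function. This is a minimal illustration, not Helmwart's implementation: the `Proposal` fields, the `Impact` scale, and the `LOW_CONFIDENCE` threshold are hypothetical names and values chosen for the example. It also assumes that a saturated queue stops only actions that need a reviewer.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTONOMOUS = "autonomous action with sampled oversight"
    SINGLE_REVIEW = "single-review HITL"
    DUAL_CONTROL = "dual-control HITL"
    STOP = "quarantine or refuse"


class Impact(Enum):
    LOW = 1        # bounded, e.g. a refund below the approved limit
    MATERIAL = 2   # e.g. sending a customer response
    HIGH = 3       # e.g. releasing funds or deleting records


LOW_CONFIDENCE = 0.8  # hypothetical threshold, for illustration only


@dataclass(frozen=True)
class Proposal:
    impact: Impact
    reversible: bool
    evidence_complete: bool
    policy_lookup_ok: bool
    flagged: bool           # moderation flagged the output
    queue_saturated: bool   # no reviewer capacity available
    confidence: float       # may raise the route, never lower it


def choose_route(p: Proposal) -> Route:
    # Unsafe or unevaluable proposals stop before any other rule applies.
    if p.flagged or not p.evidence_complete or not p.policy_lookup_ok:
        return Route.STOP

    # Consequence and reversibility decide the base route.
    if p.impact is Impact.HIGH or not p.reversible:
        route = Route.DUAL_CONTROL
    elif p.impact is Impact.MATERIAL:
        route = Route.SINGLE_REVIEW
    else:
        route = Route.AUTONOMOUS

    # Low confidence can force review; high confidence changes nothing.
    if route is Route.AUTONOMOUS and p.confidence < LOW_CONFIDENCE:
        route = Route.SINGLE_REVIEW

    # Assumption: a required review is never downgraded when the queue is full.
    if route is not Route.AUTONOMOUS and p.queue_saturated:
        return Route.STOP
    return route
```

Confidence appears only in the escalation branch, so a very confident agent still cannot move a high-impact action out of dual control.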

Flow diagram

Each proposed action is screened and assigned a route. Low-risk bounded actions may proceed under sampled oversight; medium- and high-impact actions enter true HITL before commit. Unsafe actions exit as refusals or quarantine.

[Flow diagram. Boxes, in order, with the control each one links to:]

  • ACTION REQUESTED (nothing has happened yet). The agent has proposed an action but has not performed it yet.
  • 1 · CHECK EVIDENCE (stop if the basis is missing). If required information is missing or unclear, stop instead of guessing. Control: fail-closed refusal.
    • refuse → DO NOT ACT (missing evidence). Required information is missing, so the agent refuses to act. Control: fail-closed refusal.
    • pass → step 2.
  • 2 · CHECK RULES + SAFETY (block prohibited or unsafe output). Check whether the proposed output breaks safety or business rules. Control: output moderation.
    • flag → HOLD FOR REVIEW (possible unsafe output). Hold flagged content for review instead of allowing it to proceed. Control: output moderation.
    • clean → step 3.
  • 3 · CHOOSE APPROVAL (possible harm · can it be undone? · access needed). Decide whether no person, one person, or two people must approve, based on possible harm and whether the action can be undone. Control: m-risk-prioritized-queue.
    • NO APPROVAL NEEDED. Low-impact actions that can be undone may proceed without prior human approval, with sample checks later. Control: m-risk-prioritized-queue.
  • 4 · EXPLAIN TO REVIEWER. Show the reviewer the proposed action, evidence, and rule-check result before they decide. Control: m-decision-summary.
    • ONE PERSON APPROVES. One person reviews the summary and must approve before the action executes. Control: m-decision-summary.
    • TWO PEOPLE APPROVE. High-impact actions require two people to approve before execution. Control: m-human-dual-control.
  • RECORD WHAT HAPPENED (recorded outcome). Log the route and any approval so the decision can be checked later. Control: m-sigstore.
  • TELL USER AI WAS INVOLVED (conditional). For user-facing actions, tell the user that AI played a role. Control: m-ai-disclosure-ui.
  • ACTION GOES AHEAD. The action is now allowed to go ahead and its route has been recorded.

The escalation flow — step detail

Every proposed action is routed before execution. Only routes labelled HITL require a human approval before the action can commit.

  1. Agent proposes an action without executing it yet. The proposal includes the action, target resource, requested authority, and supporting evidence.
  2. Routing signals are collected: confidence estimates, evidence completeness, policy lookup status, novelty, and action consequence. Low confidence or missing evidence can force refusal or review, but high confidence never authorises a high-impact action by itself. Implementation: fail-closed refusal.
  3. Policy and moderation checks run: block prohibited output, policy violations, or unsafe content before routing an executable action. Implementation: output moderation gates.
  4. Choose an execution route: use consequence, reversibility, authority, data sensitivity, and novelty to route the proposal to bounded autonomous action, single-review HITL, dual-control HITL, or refusal. Implementation: risk-prioritised review queue.
  5. For HITL routes, create the review package before review: present the action, evidence chain, relevant policy result, confidence signal, and risk reason in a structured card. Implementation: reviewer decision summaries.
  6. The assigned route determines human involvement:
    • Bounded autonomous action: no HITL event; the action is logged and sampled for oversight.
    • Single-review HITL: one human approves a material but reversible action.
    • Dual-control HITL: two humans approve a high-impact or irreversible action. Implementation: dual-control approval.
  7. Assign HITL work to an available reviewer: track queue depth and fatigue indicators; do not silently downgrade a required review when human capacity is unavailable. Implementation: adaptive workload balancing.
  8. Allowed actions are signed and logged: record the autonomous route or human approval, actor, policy version, and time. Implementation: Sigstore keyless signing + separation of actor and recorder.
  9. User-facing actions disclose the AI role where applicable: labelling is relevant for communications or interactions, not every internal workflow transition. Implementation: AI-source disclosure UI.
  10. Action commits or refuses: reviewer decisions, sampled autonomous outcomes, and reversals feed calibration without bypassing required approvals. Implementation: HITL feedback-loop calibration.
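
The approval requirement in step 6 can be expressed as a commit gate that counts distinct approvers per route. This is a sketch under stated assumptions: the route strings, the `Approval` record, and `may_commit` are hypothetical names. Two rules are assumptions added for the example rather than statements from this page: the proposer cannot approve its own action, and any rejection blocks the commit.

```python
from dataclasses import dataclass

# Approvals required per route; any route not listed here fails closed.
REQUIRED_APPROVALS = {"autonomous": 0, "single_review": 1, "dual_control": 2}


@dataclass(frozen=True)
class Approval:
    reviewer_id: str
    approved: bool


def may_commit(route: str, proposer_id: str, approvals: list[Approval]) -> bool:
    """Return True only if the route's approval requirement is met."""
    if route not in REQUIRED_APPROVALS:
        return False  # "stop" or unknown route: never commit
    if any(not a.approved for a in approvals):
        return False  # assumption: a single rejection blocks the action
    # Count distinct reviewers so one person cannot approve twice,
    # and exclude the proposer (assumption: no self-approval).
    distinct = {a.reviewer_id for a in approvals if a.reviewer_id != proposer_id}
    return len(distinct) >= REQUIRED_APPROVALS[route]
```

For example, `may_commit("dual_control", "agent-1", [Approval("alice", True), Approval("alice", True)])` returns `False`, because both approvals come from the same reviewer.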

HITL flow controls at a glance

Design principles

When HITL is unavailable

Off-hours, surges, and vendor outages mean HITL queues do not always have a reviewer. The program must declare in advance what happens when the queue saturates:

  • Fail closed by default. The action does not commit; the agent returns a refusal and records the saturation reason in the audit trail. This follows the fail-closed refusal pattern.
  • Queue-depth alarms feed back into the agent's rate limits. If reviewers are overloaded, the agent should slow its own rate of proposals. See rate limits and quotas.
  • Capacity limits are explicit. Declare reviewer-hours per shift, max queue depth per tier, and the SLA at which the program degrades. These belong in the runbook, not the source.
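
The three rules above can be combined into one admission check, sketched below. The names `TierCapacity`, `Admission`, and `admit_for_review` are hypothetical, and the linear slowdown between the alarm depth and the maximum depth is an assumption made for illustration; the real limits would come from the runbook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierCapacity:
    max_queue_depth: int  # declared in the runbook, per tier
    alarm_depth: int      # depth at which the agent starts to slow down


@dataclass(frozen=True)
class Admission:
    accepted: bool
    reason: str             # written to the audit trail
    rate_multiplier: float  # factor applied to the agent's rate limit


def admit_for_review(depth: int, cap: TierCapacity) -> Admission:
    # Saturated: fail closed. The action does not commit.
    if depth >= cap.max_queue_depth:
        return Admission(False, "review queue saturated", 0.0)
    # Alarm zone: accept, but slow the agent in proportion to remaining room.
    if depth >= cap.alarm_depth:
        span = cap.max_queue_depth - cap.alarm_depth
        multiplier = (cap.max_queue_depth - depth) / span
        return Admission(True, "queue depth alarm", multiplier)
    return Admission(True, "capacity available", 1.0)
```

With `TierCapacity(max_queue_depth=10, alarm_depth=6)`, a queue depth of 8 is accepted with the agent's rate halved, and a depth of 10 is refused.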

Feedback loop into agent calibration

Reviewer overrides and decision reversals are captured as signals in reviewer decision summaries and risk-prioritised queues. HITL feedback-loop calibration closes the loop: override events are batched, analysed for systematic patterns, and fed back into agent calibration as prompt updates, tool-scope policy changes, and divergence-monitor threshold tuning. Each calibration cycle requires human sign-off on the pattern report before any agent change is deployed.
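
One way to picture the batching step is a per-action-type override rate, with deployment gated on a named sign-off. This is a sketch only: `ReviewEvent`, `override_report`, `may_deploy_change`, and the `min_events` and `threshold` defaults are hypothetical, and a real pattern analysis would likely look at more than a single rate.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReviewEvent:
    action_type: str
    overridden: bool  # reviewer overrode or later reversed the decision


def override_report(events: Iterable[ReviewEvent],
                    min_events: int = 20,
                    threshold: float = 0.25) -> dict[str, float]:
    """Return action types whose override rate suggests a systematic pattern."""
    totals: dict[str, int] = defaultdict(int)
    overrides: dict[str, int] = defaultdict(int)
    for e in events:
        totals[e.action_type] += 1
        overrides[e.action_type] += int(e.overridden)
    report = {}
    for action_type, n in totals.items():
        rate = overrides[action_type] / n
        # Skip small batches: a few overrides are not yet a pattern.
        if n >= min_events and rate >= threshold:
            report[action_type] = rate
    return report


def may_deploy_change(signed_off_by: Optional[str]) -> bool:
    # No agent change ships until a human has signed off on the report.
    return bool(signed_off_by)
```

The report is only an input to the human reviewer; nothing in the sketch changes the agent by itself.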

Policy framing

This HITL program is an engineering interpretation related to the ACM Europe Technology Policy Committee's May 2025 policy brief (see Governance primer) and its proposal for alignment oversight. It is not text of Article 14 or of the brief. Helmwart applies the concept by making agent actions legible to reviewers; the linked flow controls, logging, and calibration measures provide that evidence.