Agent Engineering

Human-in-the-Loop Checkpoints

Ravinder · 6 min read
Agents · AI · LLM · Human-in-the-Loop · UX · Safety

Full autonomy is a product decision, not a technical default. The question is not whether to add human checkpoints but where — and how to make them feel like safety nets rather than friction. A poorly designed checkpoint that triggers on every trivial action trains users to click "approve" without reading. A poorly placed checkpoint that triggers too late fails to prevent the damage it was meant to catch.

Anatomy of a Checkpoint

A checkpoint is a pause in the agent loop where a human reviews and approves (or rejects) a pending action before it executes. Every checkpoint needs four things: a trigger condition, a human-readable summary of the pending action, a response interface, and a timeout policy.

flowchart TD
    A[Agent proposes action] --> B{Checkpoint trigger?}
    B -- no --> C[Execute action]
    B -- yes --> D[Notify human\nwith summary]
    D --> E{Human response}
    E -- approve --> C
    E -- reject --> F[Agent replans]
    E -- timeout --> G{Timeout policy}
    G -- auto-approve --> C
    G -- auto-reject --> F
    G -- escalate --> H[Senior reviewer]
    C --> I[Log to audit trail]
    F --> I

The timeout policy is where most implementations make a mistake. Auto-approve on timeout is dangerous (the agent proceeds without consent). Auto-reject on timeout is safe but creates deadlocks for long-running tasks. The right answer depends on the action's reversibility: auto-approve reversible actions, auto-reject irreversible ones.
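This rule can be made explicit in code. A minimal sketch, using the same risk tiers defined in the next section (the `timeout_policy_for` helper is illustrative, not part of any library):

```python
from enum import Enum

class ActionRisk(Enum):
    LOW = "low"       # read-only, reversible
    MEDIUM = "medium" # writes, but recoverable
    HIGH = "high"     # irreversible, external effects

def timeout_policy_for(risk: ActionRisk) -> str:
    """Map reversibility to a timeout policy: reversible actions may
    proceed on timeout, irreversible ones must not."""
    # Treating only LOW as safely reversible is the conservative choice;
    # MEDIUM writes are recoverable but still worth a human decision.
    return "approved" if risk is ActionRisk.LOW else "rejected"
```

Putting the decision in one function also gives you a single place to audit and tune it later.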

Trigger Conditions

Not all actions need approval. The trigger condition is the key design decision.

from enum import Enum
from dataclasses import dataclass
 
class ActionRisk(Enum):
    LOW = "low"       # read-only, reversible
    MEDIUM = "medium" # writes, but recoverable
    HIGH = "high"     # irreversible, external effects
 
@dataclass
class AgentAction:
    tool: str
    args: dict
    risk: ActionRisk
    estimated_cost_usd: float = 0.0
    affects_external_systems: bool = False
 
CHECKPOINT_RULES = [
    # Always checkpoint high-risk
    lambda a: a.risk == ActionRisk.HIGH,
    # Checkpoint if external system affected
    lambda a: a.affects_external_systems,
    # Checkpoint if estimated cost above threshold
    lambda a: a.estimated_cost_usd > 5.0,
    # Checkpoint file deletions
    lambda a: a.tool == "delete_file",
    # Checkpoint emails/messages
    lambda a: a.tool in ("send_email", "post_message", "send_sms"),
]
 
def needs_checkpoint(action: AgentAction) -> bool:
    return any(rule(action) for rule in CHECKPOINT_RULES)

This rule-based approach is explicit and auditable. Avoid fuzzy heuristics like "ask the LLM if this action is risky" — you will get inconsistent results and no audit trail for the trigger decision itself.

The Human-Readable Summary

A checkpoint is only useful if the human can understand what they are approving. Raw tool args ({"path": "/var/data/users.csv", "mode": "delete"}) are not enough.

def generate_checkpoint_summary(action: AgentAction, context: dict) -> str:
    """Generate a plain-English summary for human review."""
    templates = {
        "delete_file": (
            "The agent wants to permanently delete the file at `{path}`. "
            "This cannot be undone. Current task: {task_summary}."
        ),
        "send_email": (
            "The agent wants to send an email to {to} with subject '{subject}'. "
            "Preview of body: {body_preview}. Current task: {task_summary}."
        ),
        "write_file": (
            "The agent wants to write to `{path}` ({size} bytes). "
            "Existing file will be overwritten. Current task: {task_summary}."
        ),
    }
    template = templates.get(
        action.tool,
        "The agent wants to call `{tool}` with args: {args}. Current task: {task_summary}."
    )
    # Merge everything into one mapping first: passing **action.args
    # alongside keyword arguments would raise a duplicate-keyword
    # TypeError if args ever contained a "tool" or "args" key.
    fields = {
        **action.args,
        "tool": action.tool,
        "args": action.args,
        "task_summary": context.get("task_summary", "unknown"),
        "body_preview": str(action.args.get("body", ""))[:200],
    }
    return template.format(**fields)

Include the current task summary in every checkpoint message. Without it, reviewers have no context for why the action is being proposed.

Checkpoint UX Patterns

Inline approval: The agent pauses and the user responds in the same conversation thread. Works well for chat-based agents. Friction is low when done right.

Async queue: The agent queues the action and continues other work. A human reviews a batch of pending approvals. Works for non-blocking workflows where the human is not actively watching.

Email/SMS escalation: For high-stakes irreversible actions, send a notification with a one-click approve/reject link. Use when the stakes warrant the latency.

import asyncio
from typing import Awaitable, Callable, Literal
 
CheckpointResponse = Literal["approved", "rejected", "timeout"]
 
async def await_human_approval(
    action: AgentAction,
    summary: str,
    notify_fn: Callable[[str, str], Awaitable[None]],  # e.g., send_slack, send_email
    timeout_sec: float = 300.0,
    timeout_policy: CheckpointResponse = "rejected",
) -> CheckpointResponse:
    """Pause agent, notify human, wait for response."""
    approval_event = asyncio.Event()
    response: list[CheckpointResponse] = []
 
    def on_response(decision: CheckpointResponse):
        response.append(decision)
        approval_event.set()
 
    # App-specific: register_checkpoint wires on_response into your
    # delivery channel (webhook, websocket, or polling endpoint).
    checkpoint_id = register_checkpoint(action, on_response)
    await notify_fn(summary, checkpoint_id)
 
    try:
        await asyncio.wait_for(approval_event.wait(), timeout=timeout_sec)
        return response[0]
    except asyncio.TimeoutError:
        return timeout_policy

The timeout_policy parameter makes the reversibility decision explicit at the call site, not buried in a config file.
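The async-queue pattern described earlier can be sketched with futures: the agent enqueues checkpoints and keeps working, and a human resolves a whole batch in one review session. A minimal, self-contained version (names like `ApprovalQueue` and `review_batch` are illustrative):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class PendingApproval:
    summary: str
    future: asyncio.Future

class ApprovalQueue:
    """Agent enqueues checkpoints and continues other work; a human
    later resolves a whole batch in one review session."""

    def __init__(self) -> None:
        self.pending: list[PendingApproval] = []

    def submit(self, summary: str) -> asyncio.Future:
        fut = asyncio.get_running_loop().create_future()
        self.pending.append(PendingApproval(summary, fut))
        return fut  # the agent awaits this only when it needs the result

    def review_batch(self, decisions: list[str]) -> None:
        # Called from the review UI with one decision per pending item.
        batch, self.pending = self.pending, []
        for item, decision in zip(batch, decisions):
            item.future.set_result(decision)

async def demo() -> list[str]:
    q = ApprovalQueue()
    futs = [q.submit(f"action {i}") for i in range(3)]
    q.review_batch(["approved", "rejected", "approved"])
    return [await f for f in futs]

print(asyncio.run(demo()))  # ['approved', 'rejected', 'approved']
```

The future-per-action design means the agent blocks only at the point where it actually needs the decision, not at the point where it proposed the action.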

Audit Trail

Every checkpoint decision — approve, reject, timeout — must be logged with who made the decision, when, and what summary they saw. This is non-negotiable for any agent operating on production systems.

import json
from datetime import datetime, timezone
from pathlib import Path
 
class AuditLog:
    def __init__(self, log_path: str):
        self.path = Path(log_path)
        self.path.parent.mkdir(parents=True, exist_ok=True)
 
    def record_checkpoint(
        self,
        task_id: str,
        action: AgentAction,
        summary: str,
        decision: str,
        reviewer: str,
        review_latency_sec: float,
    ):
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "task_id": task_id,
            "tool": action.tool,
            "risk": action.risk.value,
            "summary_shown": summary,
            "decision": decision,
            "reviewer": reviewer,
            "review_latency_sec": review_latency_sec,
        }
        with self.path.open("a") as f:
            f.write(json.dumps(entry) + "\n")

The summary_shown field is critical: you need to know exactly what the human saw when they made the decision, not just what the action was. If your summary generation has a bug, you can trace back to which approvals were made on bad summaries.
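One concrete payoff: if a summary-generation bug ships, a scan over the JSONL log recovers every approval granted on a defective summary. A sketch with hypothetical log entries (here the bug is a missing task context, surfacing as "Current task: unknown"):

```python
import json
import tempfile
from pathlib import Path

# Two hypothetical audit entries; t2's summary lost its task context.
log = Path(tempfile.mkdtemp()) / "audit.jsonl"
entries = [
    {"task_id": "t1", "decision": "approved",
     "summary_shown": "delete /tmp/a. Current task: cleanup."},
    {"task_id": "t2", "decision": "approved",
     "summary_shown": "delete /tmp/b. Current task: unknown."},
]
log.write_text("".join(json.dumps(e) + "\n" for e in entries))

# Find approvals whose summary carried no task context.
suspect = [
    e["task_id"]
    for e in map(json.loads, log.read_text().splitlines())
    if e["decision"] == "approved"
    and "Current task: unknown" in e["summary_shown"]
]
print(suspect)  # ['t2']
```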

Avoiding Checkpoint Fatigue

The death of a good checkpoint system is alarm fatigue. If 70% of agent actions require approval, users will approve without reading. To prevent this:

  1. Tune triggers continuously — track approval rate. If >20% of actions hit checkpoints, tighten the trigger rules.
  2. Auto-approve patterns — actions the same user has approved 10+ times for the same tool/args pattern are candidates for a trusted-action allowlist.
  3. Batch low-risk checkpoints — instead of one notification per action, batch multiple low-risk pending actions into a single review.
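The allowlist in point 2 can be as simple as a counter keyed on the (user, tool, args) pattern. A sketch using the 10-approval threshold above (`TrustedActions` is an illustrative name, not a library class):

```python
from collections import Counter

class TrustedActions:
    """Auto-approve allowlist: patterns a user has approved enough
    times stop triggering checkpoints."""

    def __init__(self, threshold: int = 10) -> None:
        self.threshold = threshold
        self.counts: Counter = Counter()

    @staticmethod
    def _key(user: str, tool: str, args: dict) -> tuple:
        # Freeze args so identical patterns produce identical keys.
        return (user, tool, tuple(sorted(args.items())))

    def record_approval(self, user: str, tool: str, args: dict) -> None:
        self.counts[self._key(user, tool, args)] += 1

    def is_trusted(self, user: str, tool: str, args: dict) -> bool:
        return self.counts[self._key(user, tool, args)] >= self.threshold

trusted = TrustedActions()
for _ in range(10):
    trusted.record_approval("alice", "write_file", {"path": "report.md"})
print(trusted.is_trusted("alice", "write_file", {"path": "report.md"}))  # True
print(trusted.is_trusted("bob", "write_file", {"path": "report.md"}))    # False
```

Keying on the exact args pattern (not just the tool) is what keeps this safe: approving ten writes to one file says nothing about writes to a different file.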

Key Takeaways

  • Checkpoint trigger conditions should be explicit, rule-based, and auditable — not delegated to an LLM's judgment about riskiness.
  • Auto-approve reversible actions on timeout; auto-reject irreversible actions on timeout — make this decision explicit at the call site.
  • Every checkpoint notification must include the current task context, not just the raw action args.
  • Log the exact summary the human saw when approving — not just the action — so you can audit decisions made on bad summaries.
  • Monitor checkpoint approval rate: if >20% of actions require approval, trigger rules are miscalibrated and fatigue will degrade review quality.
  • A trusted-action allowlist (auto-approve repeated identical patterns) is a safe way to reduce friction without removing oversight.