Human-in-the-Loop Checkpoints
Full autonomy is a product decision, not a technical default. The question is not whether to add human checkpoints but where — and how to make them feel like safety nets rather than friction. A poorly designed checkpoint that triggers on every trivial action trains users to click "approve" without reading. A poorly placed checkpoint that triggers too late fails to prevent the damage it was meant to catch.
Anatomy of a Checkpoint
A checkpoint is a pause in the agent loop where a human reviews and approves (or rejects) a pending action before it executes. Every checkpoint needs four things: a trigger condition, a human-readable summary of the pending action, a response interface, and a timeout policy.
The timeout policy is where most implementations make a mistake. Auto-approve on timeout is dangerous (the agent proceeds without consent). Auto-reject on timeout is safe but creates deadlocks for long-running tasks. The right answer depends on the action's reversibility: auto-approve reversible actions, auto-reject irreversible ones.
Trigger Conditions
Not all actions need approval. The trigger condition is the key design decision.
```python
from enum import Enum
from dataclasses import dataclass

class ActionRisk(Enum):
    LOW = "low"        # read-only, reversible
    MEDIUM = "medium"  # writes, but recoverable
    HIGH = "high"      # irreversible, external effects

@dataclass
class AgentAction:
    tool: str
    args: dict
    risk: ActionRisk
    estimated_cost_usd: float = 0.0
    affects_external_systems: bool = False

CHECKPOINT_RULES = [
    # Always checkpoint high-risk
    lambda a: a.risk == ActionRisk.HIGH,
    # Checkpoint if external system affected
    lambda a: a.affects_external_systems,
    # Checkpoint if estimated cost above threshold
    lambda a: a.estimated_cost_usd > 5.0,
    # Checkpoint file deletions
    lambda a: a.tool == "delete_file",
    # Checkpoint emails/messages
    lambda a: a.tool in ("send_email", "post_message", "send_sms"),
]

def needs_checkpoint(action: AgentAction) -> bool:
    return any(rule(action) for rule in CHECKPOINT_RULES)
```

This rule-based approach is explicit and auditable. Avoid fuzzy heuristics like "ask the LLM if this action is risky" — you will get inconsistent results and no audit trail for the trigger decision itself.
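As a quick sanity check, the rules can be exercised directly. The definitions are repeated here (slightly condensed) so the snippet runs standalone; the example actions are hypothetical:

```python
from enum import Enum
from dataclasses import dataclass

class ActionRisk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@dataclass
class AgentAction:
    tool: str
    args: dict
    risk: ActionRisk
    estimated_cost_usd: float = 0.0
    affects_external_systems: bool = False

CHECKPOINT_RULES = [
    lambda a: a.risk == ActionRisk.HIGH,
    lambda a: a.affects_external_systems,
    lambda a: a.estimated_cost_usd > 5.0,
    lambda a: a.tool == "delete_file",
    lambda a: a.tool in ("send_email", "post_message", "send_sms"),
]

def needs_checkpoint(action: AgentAction) -> bool:
    return any(rule(action) for rule in CHECKPOINT_RULES)

# A read-only search: no rule fires, so it runs without approval.
search = AgentAction(tool="search_docs", args={"q": "quarterly report"},
                     risk=ActionRisk.LOW)
# A file deletion: the tool-name rule fires even though risk is only MEDIUM.
delete = AgentAction(tool="delete_file", args={"path": "/tmp/scratch.csv"},
                     risk=ActionRisk.MEDIUM)
# A cheap-looking tool that crosses the cost threshold.
batch = AgentAction(tool="run_batch_job", args={}, risk=ActionRisk.LOW,
                    estimated_cost_usd=9.0)

print(needs_checkpoint(search))  # False
print(needs_checkpoint(delete))  # True
print(needs_checkpoint(batch))   # True
```

Note that rules are independent: an action is checkpointed if *any* rule fires, so a low-risk label never overrides a cost or tool-name rule.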
The Human-Readable Summary
A checkpoint is only useful if the human can understand what they are approving. Raw tool args ({"path": "/var/data/users.csv", "mode": "delete"}) are not enough.
```python
def generate_checkpoint_summary(action: AgentAction, context: dict) -> str:
    """Generate a plain-English summary for human review."""
    templates = {
        "delete_file": (
            "The agent wants to permanently delete the file at `{path}`. "
            "This cannot be undone. Current task: {task_summary}."
        ),
        "send_email": (
            "The agent wants to send an email to {to} with subject '{subject}'. "
            "Preview of body: {body_preview}. Current task: {task_summary}."
        ),
        "write_file": (
            "The agent wants to write to `{path}` ({size} bytes). "
            "Existing file will be overwritten. Current task: {task_summary}."
        ),
    }
    template = templates.get(
        action.tool,
        "The agent wants to call `{tool}` with args: {args}. Current task: {task_summary}."
    )
    # Merge into one dict so a key like "tool" or "args" inside action.args
    # cannot collide with the explicit keyword arguments (which would raise
    # a TypeError if both were passed to format() directly).
    fields = {
        **action.args,
        "tool": action.tool,
        "args": action.args,
        "task_summary": context.get("task_summary", "unknown"),
        "body_preview": str(action.args.get("body", ""))[:200],
    }
    return template.format(**fields)
```

Include the current task summary in every checkpoint message. Without it, reviewers have no context for why the action is being proposed.
Checkpoint UX Patterns
Inline approval: The agent pauses and the user responds in the same conversation thread. Works well for chat-based agents. Friction is low when done right.
Async queue: The agent queues the action and continues other work. A human reviews a batch of pending approvals. Works for non-blocking workflows where the human is not actively watching.
Email/SMS escalation: For high-stakes irreversible actions, send a notification with a one-click approve/reject link. Use when the stakes warrant the latency.
```python
import asyncio
from typing import Literal

CheckpointResponse = Literal["approved", "rejected", "timeout"]

async def await_human_approval(
    action: AgentAction,
    summary: str,
    notify_fn,  # e.g., send_slack, send_email
    timeout_sec: float = 300.0,
    timeout_policy: Literal["approved", "rejected"] = "rejected",
) -> CheckpointResponse:
    """Pause agent, notify human, wait for response."""
    approval_event = asyncio.Event()
    response: list[CheckpointResponse] = []

    def on_response(decision: CheckpointResponse):
        response.append(decision)
        approval_event.set()

    # Register response handler (webhook, websocket, polling);
    # register_checkpoint is provided by your delivery layer.
    checkpoint_id = register_checkpoint(action, on_response)
    await notify_fn(summary, checkpoint_id)
    try:
        await asyncio.wait_for(approval_event.wait(), timeout=timeout_sec)
        return response[0]
    except asyncio.TimeoutError:
        return timeout_policy
```

The `timeout_policy` parameter makes the reversibility decision explicit at the call site, not buried in a config file. Note that it is typed as `Literal["approved", "rejected"]`, not the full `CheckpointResponse`: a caller should never be able to configure "timeout" as the outcome of a timeout.
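The flow can be exercised end to end without a real webhook or chat integration. Here `register_checkpoint` and `human_responds` are hypothetical in-memory stand-ins for the delivery layer, and the short timeouts are for demonstration only:

```python
import asyncio

# Hypothetical in-memory delivery layer: checkpoint_id -> response callback.
_handlers = {}

def register_checkpoint(action, on_response):
    checkpoint_id = f"chk-{len(_handlers)}"
    _handlers[checkpoint_id] = on_response
    return checkpoint_id

def human_responds(checkpoint_id, decision):
    # In production this would be the webhook handler for the approve link.
    _handlers.pop(checkpoint_id)(decision)

async def await_human_approval(action, summary, notify_fn,
                               timeout_sec=300.0, timeout_policy="rejected"):
    approval_event = asyncio.Event()
    response = []

    def on_response(decision):
        response.append(decision)
        approval_event.set()

    checkpoint_id = register_checkpoint(action, on_response)
    await notify_fn(summary, checkpoint_id)
    try:
        await asyncio.wait_for(approval_event.wait(), timeout=timeout_sec)
        return response[0]
    except asyncio.TimeoutError:
        return timeout_policy

async def main():
    async def notify(summary, checkpoint_id):
        # Simulate the reviewer approving shortly after the notification.
        asyncio.get_running_loop().call_later(
            0.01, human_responds, checkpoint_id, "approved")

    approved = await await_human_approval(
        None, "Delete /tmp/scratch.csv?", notify, timeout_sec=1.0)

    async def silent_notify(summary, checkpoint_id):
        pass  # reviewer never responds; the timeout policy decides

    timed_out = await await_human_approval(
        None, "Delete /tmp/scratch.csv?", silent_notify,
        timeout_sec=0.05, timeout_policy="rejected")

    return approved, timed_out

results = asyncio.run(main())
print(results)  # ('approved', 'rejected')
```

The same coroutine serves both the inline and async-queue UX patterns; only `notify_fn` and the timeout change.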
Audit Trail
Every checkpoint decision — approve, reject, timeout — must be logged with who made the decision, when, and what summary they saw. This is non-negotiable for any agent operating on production systems.
```python
import json
from datetime import datetime, timezone
from pathlib import Path

class AuditLog:
    def __init__(self, log_path: str):
        self.path = Path(log_path)
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def record_checkpoint(
        self,
        task_id: str,
        action: AgentAction,
        summary: str,
        decision: str,
        reviewer: str,
        review_latency_sec: float,
    ):
        entry = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "task_id": task_id,
            "tool": action.tool,
            "risk": action.risk.value,
            "summary_shown": summary,
            "decision": decision,
            "reviewer": reviewer,
            "review_latency_sec": review_latency_sec,
        }
        with self.path.open("a") as f:
            f.write(json.dumps(entry) + "\n")
```

The `summary_shown` field is critical: you need to know exactly what the human saw when they made the decision, not just what the action was. If your summary generation has a bug, you can trace back to which approvals were made on bad summaries.
Avoiding Checkpoint Fatigue
The death of a good checkpoint system is alarm fatigue. If 70% of agent actions require approval, users will approve without reading. To prevent this:
- Tune triggers continuously — track approval rate. If >20% of actions hit checkpoints, tighten the trigger rules.
- Auto-approve patterns — actions the same user has approved 10+ times for the same tool/args pattern are candidates for a trusted-action allowlist.
- Batch low-risk checkpoints — instead of one notification per action, batch multiple low-risk pending actions into a single review.
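The trusted-action allowlist can be sketched as a small counter keyed on the approved pattern. The threshold and the keying choice (exact canonical args, per user) are assumptions; keying on exact args is the conservative option, since trusting only the tool name would silently extend trust to new targets:

```python
import json
from collections import Counter

class TrustTracker:
    """Auto-approve an action after the same user has approved the
    identical (tool, args) pattern `threshold` times."""

    def __init__(self, threshold: int = 10):
        self.threshold = threshold
        self._approvals: Counter = Counter()

    def _key(self, user: str, tool: str, args: dict) -> tuple:
        # Canonical JSON so dict key ordering does not split the count.
        return (user, tool, json.dumps(args, sort_keys=True, default=str))

    def record_approval(self, user: str, tool: str, args: dict) -> None:
        self._approvals[self._key(user, tool, args)] += 1

    def is_trusted(self, user: str, tool: str, args: dict) -> bool:
        return self._approvals[self._key(user, tool, args)] >= self.threshold

tracker = TrustTracker(threshold=3)
for _ in range(3):
    tracker.record_approval("alice", "write_file", {"path": "/tmp/report.md"})

# Identical pattern, same user: trusted.
print(tracker.is_trusted("alice", "write_file", {"path": "/tmp/report.md"}))  # True
# Different args: not trusted, even though the tool and user match.
print(tracker.is_trusted("alice", "write_file", {"path": "/etc/passwd"}))     # False
# Different user: trust does not transfer.
print(tracker.is_trusted("bob", "write_file", {"path": "/tmp/report.md"}))    # False
```

Trusted actions should still be written to the audit log with a decision like "auto_approved", so oversight is reduced, not removed.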
Key Takeaways
- Checkpoint trigger conditions should be explicit, rule-based, and auditable — not delegated to an LLM's judgment about riskiness.
- Auto-approve reversible actions on timeout; auto-reject irreversible actions on timeout — make this decision explicit at the call site.
- Every checkpoint notification must include the current task context, not just the raw action args.
- Log the exact summary the human saw when approving — not just the action — so you can audit decisions made on bad summaries.
- Monitor checkpoint approval rate: if >20% of actions require approval, trigger rules are miscalibrated and fatigue will degrade review quality.
- A trusted-action allowlist (auto-approve repeated identical patterns) is a safe way to reduce friction without removing oversight.