The Agent Loop and Halting Conditions
Series: Agent Engineering, Part 2 → Tool Design for Autonomy
Every agent is, at its core, a loop. Strip away the clever prompting, the tool registries, the orchestration frameworks—what remains is an iteration: observe the environment, decide on an action, execute it, observe again. This sounds simple. It is not.
The part that kills most agent deployments in production is not the intelligence of the model inside the loop. It is the loop itself: poorly defined exit conditions, runaway iterations, or agents that confidently spin forever on a task they have no hope of completing. This post is about getting the loop right.
Anatomy of the Standard Agent Loop
The canonical agent loop follows a four-phase cycle that borrows heavily from the OODA loop (Observe, Orient, Decide, Act) and the ReAct pattern (Reason + Act):
Each phase carries distinct responsibilities:
Observe — The agent reads everything relevant: tool outputs from prior steps, user-injected context, memory retrievals, environment variables. This is where context window management starts to matter. If you stuff the full history of every tool call into every iteration, you will hit limits fast.
Think — This is the LLM call. The model reasons over the observed state and produces either a tool call or a final answer. The reasoning trace here is not just cosmetic—it is your primary debugging surface.
Decide — Before executing the action, the loop should check halting conditions. Most frameworks bury this check or skip it entirely. Do not skip it.
Act — Invoke the tool, apply the side effect, or emit the response. Record the result faithfully for the next observation phase.
What a Minimal Loop Looks Like in Code
```python
from typing import Callable

MAX_ITERATIONS = 20

def agent_loop(
    task: str,
    tools: dict[str, Callable],
    llm_call: Callable,
    halt_fn: Callable[[dict], bool],
) -> str:
    history: list[dict] = [{"role": "user", "content": task}]
    iterations = 0
    while iterations < MAX_ITERATIONS:
        iterations += 1
        response = llm_call(history)
        if halt_fn(response):
            return response["content"]
        tool_name = response.get("tool_call")
        if not tool_name:
            # Model emitted a final answer without a tool call
            return response["content"]
        # Record the assistant turn before acting, so error paths keep
        # the history consistent for the next observation phase.
        history.append({"role": "assistant", **response})
        if tool_name not in tools:
            history.append({
                "role": "tool",
                "content": f"Error: unknown tool '{tool_name}'",
            })
            continue
        try:
            result = tools[tool_name](**response.get("tool_args", {}))
        except Exception as exc:
            result = f"ToolError: {exc}"
        history.append({"role": "tool", "content": str(result)})
    return "Agent halted: max iterations reached without resolution."
```

Notice the explicit MAX_ITERATIONS guard. This is non-negotiable. Without it, a buggy tool or a confused model will loop until you hit your rate limit or your cloud bill explodes.
Halting Conditions: The Full Taxonomy
Most tutorials define exactly one halting condition: the model says it is done. This is insufficient. A production agent needs a layered set of conditions.
Goal Satisfied — The model signals it has completed the task, typically by emitting a final answer rather than a tool call. Parse this explicitly; do not rely on the model's word alone. Cross-validate against a goal function where possible.
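A sketch of that cross-validation, assuming a task-specific `goal_check` predicate (the names here are hypothetical, not part of the loop above):

```python
from typing import Callable

def goal_satisfied(response: dict, goal_check: Callable[[str], bool]) -> bool:
    """Halt only when the model emits a final answer AND it passes validation.

    `goal_check` is a task-specific predicate (illustrative): e.g. the answer
    parses as JSON, or a file the agent claims to have written actually exists.
    """
    is_final = "tool_call" not in response and bool(response.get("content"))
    if not is_final:
        return False
    return goal_check(response["content"])

# Example: the task demands a purely numeric answer.
print(goal_satisfied({"content": "42"}, str.isdigit))                       # True
print(goal_satisfied({"content": "not sure"}, str.isdigit))                 # False
print(goal_satisfied({"tool_call": "search", "content": ""}, str.isdigit))  # False
```

The point of the wrapper is that "the model stopped calling tools" and "the task is done" are separate claims, and only the second one should end the loop.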
Budget Exhausted — Hard limits on iterations, token consumption, and wall-clock time. Set all three. Treat them as separate circuit breakers, not redundant ones—a loop can hit max-tokens without hitting max-iterations if each turn is expensive.
Unrecoverable Error — Some errors should terminate the loop immediately: permission denied on a critical resource, a dependency service returning 5xx three times in a row, schema validation failure on the agent's own output. Define these up front in your error taxonomy.
Low Confidence — Harder to operationalize but valuable. If the model's reasoning trace contains phrases like "I'm not sure how to proceed" or if your tool call success rate in the last N steps is below a threshold, surface this to a supervisor rather than continuing blindly.
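One way to operationalize both signals; the phrase list, window size, and threshold are assumed values you would tune against your own traces:

```python
from collections import deque

UNCERTAINTY_PHRASES = ("not sure how to proceed", "i don't know", "unable to determine")
SUCCESS_WINDOW = 5       # look at the last N tool calls (assumed value)
MIN_SUCCESS_RATE = 0.4   # below this, escalate to a supervisor (assumed value)

recent_outcomes: deque[bool] = deque(maxlen=SUCCESS_WINDOW)

def low_confidence(reasoning_trace: str) -> bool:
    """True when the trace sounds lost, or recent tool calls mostly failed."""
    if any(p in reasoning_trace.lower() for p in UNCERTAINTY_PHRASES):
        return True
    if len(recent_outcomes) == SUCCESS_WINDOW:
        return sum(recent_outcomes) / SUCCESS_WINDOW < MIN_SUCCESS_RATE
    return False

recent_outcomes.extend([False, False, True, False, False])  # 1/5 tool calls succeeded
print(low_confidence("Retrying the lookup with a narrower query."))  # True
```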
External Signal — The user cancels, a timeout fires from the calling system, or a watchdog process decides the agent has run too long. Your loop must listen for these signals between iterations.
The Iteration Budget Is a First-Class Parameter
Treat the iteration budget the same way you treat a time budget on an HTTP request: set it deliberately, expose it in your observability, and alarm on it. A task that routinely exhausts its budget is either misconfigured, hitting an edge case your evals did not cover, or working on a problem that requires redesign.
```python
from dataclasses import dataclass

@dataclass
class AgentBudget:
    max_iterations: int = 20
    max_input_tokens: int = 50_000
    max_wall_seconds: float = 120.0

    def check(self, iterations: int, tokens_used: int, elapsed: float) -> str | None:
        if iterations >= self.max_iterations:
            return "max_iterations"
        if tokens_used >= self.max_input_tokens:
            return "max_tokens"
        if elapsed >= self.max_wall_seconds:
            return "timeout"
        return None
```

Log the halt reason every single time. After a week of production data, you will know exactly which tasks are hardest for your agent and where to invest in better tools or prompts.
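That logging could look like the sketch below; the field names and logger setup are assumptions, not a fixed schema:

```python
import json
import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")
log = logging.getLogger("agent")

def log_halt(task_id: str, reason: str, iterations: int, tokens_used: int) -> dict:
    """Emit one structured record per halt and return it (field names illustrative)."""
    record = {
        "event": "agent_halt",
        "task_id": task_id,
        "halt_reason": reason,  # "goal_satisfied", "max_iterations", "max_tokens", "timeout", ...
        "iterations": iterations,
        "tokens_used": tokens_used,
    }
    log.info(json.dumps(record))
    return record

log_halt("task-123", "max_iterations", 20, 41_532)
```

Structured records make the week-of-data analysis a simple group-by on `halt_reason` instead of a log-grepping exercise.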
Loop Invariants You Should Enforce
Beyond halting, maintain invariants that keep the loop from drifting into pathological states:
- Monotonic progress — Each iteration should produce a new observation that was not available before. If the last N tool calls returned identical results, the agent is stuck. Detect and break the cycle.
- Bounded history growth — Do not append raw tool outputs indefinitely. Summarize or truncate older history as context fills up.
- Deterministic tool identity — The same tool name should always resolve to the same implementation within a single loop run. Hot-swapping tool implementations mid-run creates debugging nightmares.
- Explicit state transitions — Every state in the loop (observing, thinking, acting, halted) should be logged. Silent transitions make post-mortems impossible.
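The monotonic-progress invariant from the first bullet can be sketched as a small detector; the three-repeat window is an assumed threshold:

```python
from collections import deque

STALL_WINDOW = 3  # N identical tool results in a row => stuck (assumed value)

class ProgressMonitor:
    """Detects a monotonic-progress violation: the last N tool results are identical."""

    def __init__(self, window: int = STALL_WINDOW):
        self.recent: deque[str] = deque(maxlen=window)

    def record(self, tool_result: str) -> bool:
        """Record a result; returns True when the loop appears stuck."""
        self.recent.append(tool_result)
        return (len(self.recent) == self.recent.maxlen
                and len(set(self.recent)) == 1)

monitor = ProgressMonitor()
print(monitor.record("404 not found"))  # False
print(monitor.record("404 not found"))  # False
print(monitor.record("404 not found"))  # True -- break the cycle here
```

In practice you might hash large tool outputs before comparing, but the shape is the same: any window of identical observations means the loop is burning budget without learning anything.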
Key Takeaways
- Every agent is a loop: observe, think, decide, act—repeat. The quality of your agent is largely the quality of your loop implementation.
- Never rely on a single halting condition. Layer goal detection, budget limits, error taxonomies, and external signals.
- MAX_ITERATIONS is mandatory and non-negotiable; a missing cap is a production incident waiting to happen.
- The iteration budget is a first-class operational parameter—expose it, monitor it, and alarm on exhaustion.
- Log every loop state transition and every halt reason; this data is your primary debugging and optimization surface.
- Maintain loop invariants: monotonic progress, bounded history, deterministic tool identity, and explicit state logging.