Agent Engineering

Planning vs Reacting

Ravinder · 5 min read
Agents · AI · LLM · Planning · ReAct

The debate is not "should agents plan?" — it is "when does planning pay for itself?" Explicit upfront planning reduces variance on complex, multi-step tasks. It also adds at least one extra LLM call, can generate stale plans the moment reality diverges, and is completely overkill for tasks that resolve in one or two tool calls. The wrong mental model is picking one strategy and applying it universally.

Two Modes, Two Tradeoff Curves

ReAct (Reason + Act) is the default: observe, think, act, repeat. No upfront plan. The agent commits to one step at a time and adapts immediately to new information. It is fast to first action, cheap when tasks are short, and naturally handles partial information.

Plan-and-Execute generates a full plan first, then executes sub-tasks — possibly in parallel. It is better for tasks with 10+ steps, hard dependencies between steps, or when you need to show users a preview of what the agent will do before it does anything.

flowchart LR
  subgraph ReAct
    R1[Observe] --> R2[Think] --> R3[Act] --> R4{Done?}
    R4 -- no --> R1
    R4 -- yes --> R5[Output]
  end
  subgraph Plan-and-Execute
    P1[Parse Goal] --> P2[Generate Plan]
    P2 --> P3[Step 1]
    P2 --> P4[Step 2]
    P2 --> P5[Step N]
    P3 & P4 & P5 --> P6[Aggregate Output]
  end

The key difference is where uncertainty resolution happens. In ReAct, uncertainty resolves step by step. In Plan-and-Execute, it resolves upfront — which is great when the plan is accurate and catastrophic when it is not.

When ReAct is Enough

If your task resolves in ≤5 tool calls and success criteria are clear, ReAct wins on every dimension: latency, cost, and debuggability. The reasoning trace is the plan.

from anthropic import Anthropic
 
client = Anthropic()
tools = [
    {
        "name": "search_web",
        "description": "Search the web for current information",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
]
 
def execute_tool(name: str, args: dict) -> str:
    # Placeholder: wire this to your real tool implementations
    raise NotImplementedError
 
def react_loop(task: str, max_steps: int = 8) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason == "end_turn":
            # The final message may contain non-text blocks; pick the text one
            return next(b.text for b in response.content if b.type == "text")
        # Handle tool use
        tool_call = next(b for b in response.content if b.type == "tool_use")
        observation = execute_tool(tool_call.name, tool_call.input)
        messages += [
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_call.id,
                 "content": observation}
            ]},
        ]
    return "Max steps reached"

The loop is the plan. No overhead. Audit the messages list and you have the full reasoning trace.

When to Commit to Explicit Planning

Explicit planning pays when:

  1. Steps are long-running — you want to know what the agent intends before it starts an hour-long job.
  2. Steps can be parallelized — a planner can identify which sub-tasks are independent and fan them out.
  3. Users need approval — show them the plan, get sign-off, then execute. This is the human-in-the-loop gate.
  4. Tasks have hard ordering constraints — the planner encodes the dependency graph, not the agent's in-context reasoning.
from pydantic import BaseModel
import json
 
class PlanStep(BaseModel):
    step_id: str
    description: str
    tool: str
    args: dict
    depends_on: list[str] = []
 
class Plan(BaseModel):
    goal: str
    steps: list[PlanStep]
 
def generate_plan(goal: str, llm_client) -> Plan:
    prompt = f"""
    Generate a step-by-step plan to: {goal}
    
    Return JSON matching this schema:
    {{
      "goal": "...",
      "steps": [
        {{"step_id": "s1", "description": "...", "tool": "...",
          "args": {{}}, "depends_on": []}}
      ]
    }}
    Only include tools: search_web, write_file, read_file, send_email.
    """
    resp = llm_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return Plan(**json.loads(resp.choices[0].message.content))
 
def execute_plan(plan: Plan) -> dict[str, str]:
    completed: dict[str, str] = {}
    # Topological execution — respect depends_on
    remaining = list(plan.steps)
    while remaining:
        ready = [s for s in remaining
                 if all(dep in completed for dep in s.depends_on)]
        if not ready:
            raise RuntimeError("Dependency cycle detected")
        for step in ready:
            completed[step.step_id] = execute_tool(step.tool, step.args)
            remaining.remove(step)
    return completed

The depends_on field is doing the work planners usually leave to implicit prompt reasoning. Make it explicit and parallelism falls out for free: any batch of steps whose dependencies are already satisfied can run concurrently.
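Concretely, a parallel executor is a small change to execute_plan: each wave of ready steps fans out to a thread pool. A sketch, where PlanStep is a plain-dataclass stand-in for the pydantic model above and execute_tool is injected so the executor stays testable:

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

@dataclass
class PlanStep:  # stand-in for the pydantic model above
    step_id: str
    description: str
    tool: str
    args: dict
    depends_on: list = field(default_factory=list)

def execute_plan_parallel(steps, execute_tool) -> dict:
    completed: dict = {}
    remaining = list(steps)
    while remaining:
        ready = [s for s in remaining
                 if all(dep in completed for dep in s.depends_on)]
        if not ready:
            raise RuntimeError("Dependency cycle detected")
        # Every step in a ready wave is independent by construction,
        # so the whole wave can run concurrently
        with ThreadPoolExecutor(max_workers=len(ready)) as pool:
            futures = {s.step_id: pool.submit(execute_tool, s.tool, s.args)
                       for s in ready}
        completed.update({sid: f.result() for sid, f in futures.items()})
        remaining = [s for s in remaining if s.step_id not in completed]
    return completed
```

Because tool calls are I/O-bound (HTTP to an LLM or an API), threads are enough here; no process pool needed.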

The Hybrid: Adaptive Planning

The practical answer for production systems is adaptive planning: start with a lightweight ReAct loop, detect when the task is getting complex (step count, tool diversity, user confirmation needed), and escalate to a planner mid-flight.

def adaptive_agent(task: str, complexity_threshold: int = 4) -> str:
    steps_taken = 0
    messages = [{"role": "user", "content": task}]
 
    # Phase 1: optimistic ReAct
    while steps_taken < complexity_threshold:
        # react_step: a single iteration of the ReAct loop above,
        # returning (result, whether_a_tool_was_called)
        result, tool_used = react_step(messages)
        steps_taken += 1
        if not tool_used:
            return result  # done early, great
 
    # Phase 2: escalate to planner. In production, fold the partial
    # ReAct trace into the goal so the planner does not redo phase-1 work.
    plan = generate_plan(task, llm_client=openai_client)
    return "\n".join(execute_plan(plan).values())

This keeps fast tasks fast and only pays the planning tax for tasks that need it.

Failure Modes to Watch

Stale plans: The environment changes after planning. The executor hits an error, but there is no feedback loop to the planner. Mitigation: checkpoint after each step and re-plan if an executor step fails.
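One minimal shape for that feedback loop, sketched with injected callables (run_step and replan are hypothetical names; replan would wrap generate_plan with the failure note appended to the goal):

```python
def execute_with_replanning(goal, plan, run_step, replan, max_replans=2):
    completed: dict = {}  # checkpoints survive across re-plans
    replans = 0
    while True:
        try:
            for step in plan.steps:
                if step.step_id in completed:
                    continue  # already checkpointed, skip
                completed[step.step_id] = run_step(step)
            return completed
        except Exception as exc:
            replans += 1
            if replans > max_replans:
                raise
            # Close the feedback loop: tell the planner what broke
            plan = replan(f"{goal}\nPrevious attempt failed on: {exc}")
```

The important detail is that completed persists across re-plans, so a revised plan only pays for the steps that have not yet succeeded.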

Plan hallucination: The planner invents tools or args that do not exist. Mitigation: validate each PlanStep against a tool registry before execution starts.
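The registry check can be as simple as an allow-list of tools and their argument names, matched against each step before anything runs. The registry contents below are assumptions mirroring the tools named in the planning prompt:

```python
# Assumed argument names for the tools listed in the planning prompt
TOOL_REGISTRY = {
    "search_web": {"query"},
    "write_file": {"path", "content"},
    "read_file": {"path"},
    "send_email": {"to", "subject", "body"},
}

def validate_plan(plan) -> list[str]:
    """Return a list of problems; an empty list means the plan is safe to run."""
    errors = []
    known_ids = {s.step_id for s in plan.steps}
    for step in plan.steps:
        if step.tool not in TOOL_REGISTRY:
            errors.append(f"{step.step_id}: unknown tool '{step.tool}'")
        elif not set(step.args) <= TOOL_REGISTRY[step.tool]:
            extra = sorted(set(step.args) - TOOL_REGISTRY[step.tool])
            errors.append(f"{step.step_id}: unexpected args {extra}")
        for dep in step.depends_on:
            if dep not in known_ids:
                errors.append(f"{step.step_id}: depends on missing step '{dep}'")
    return errors
```

Run this immediately after generate_plan and either reject the plan or feed the errors back to the planner for one retry.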

Over-planning: A 15-step plan for a task that two ReAct iterations would handle. Mitigation: complexity scoring before routing to the planner.
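Complexity scoring can be as blunt as keyword and length signals; the signal list and threshold below are illustrative, not tuned:

```python
def complexity_score(task: str) -> int:
    """Crude heuristic: count surface signals that predict a multi-step task."""
    text = task.lower()
    # Substring match, so this overcounts (e.g. "all" in "finally") — crude on purpose
    signals = ["then", "after", "each", "all", "compare", "and then", "every"]
    score = sum(word in text for word in signals)
    score += task.count(",")  # enumerations suggest sub-tasks
    if len(task.split()) > 40:
        score += 2  # long requests tend to hide dependencies
    return score

def route(task: str, threshold: int = 3) -> str:
    return "planner" if complexity_score(task) >= threshold else "react"
```

In practice you would replace this with a cheap classifier call, but even a heuristic like this keeps one-shot questions out of the planner.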

Key Takeaways

  • ReAct is the right default — it is cheaper, faster, and self-documenting through the message trace.
  • Explicit planning pays for tasks with 10+ steps, parallelizable sub-tasks, or mandatory human approval before execution.
  • Encode dependencies explicitly with a depends_on field; do not leave ordering to in-context reasoning.
  • Adaptive routing (ReAct first, escalate on complexity) gives you the best of both modes without overengineering short tasks.
  • Validate plans against a tool registry before execution to catch hallucinated tool calls at planning time, not mid-run.
  • Stale plan handling requires a feedback loop from executor back to planner — if you skip this, your Plan-and-Execute agent is fragile.