Planning vs Reacting
Series: Agent Engineering

The debate is not "should agents plan?" — it is "when does planning pay for itself?" Explicit upfront planning reduces variance on complex, multi-step tasks. It also adds at least one extra LLM call, can generate stale plans the moment reality diverges, and is completely overkill for tasks that resolve in one or two tool calls. The wrong mental model is picking one strategy and applying it universally.
Two Modes, Two Tradeoff Curves
ReAct (Reason + Act) is the default: observe, think, act, repeat. No upfront plan. The agent commits to one step at a time and adapts immediately to new information. It is fast to first action, cheap when tasks are short, and naturally handles partial information.
Plan-and-Execute generates a full plan first, then executes sub-tasks — possibly in parallel. It is better for tasks with 10+ steps, hard dependencies between steps, or when you need to show users a preview of what the agent will do before it does anything.
The key difference is where uncertainty resolution happens. In ReAct, uncertainty resolves step by step. In Plan-and-Execute, it resolves upfront — which is great when the plan is accurate and catastrophic when it is not.
When ReAct is Enough
If your task resolves in ≤5 tool calls and success criteria are clear, ReAct wins on every dimension: latency, cost, and debuggability. The reasoning trace is the plan.
```python
from anthropic import Anthropic

client = Anthropic()

tools = [
    {
        "name": "search_web",
        "description": "Search the web for current information",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    }
]

def react_loop(task: str, max_steps: int = 8) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            # Covers end_turn (done) and max_tokens (truncated) alike
            return response.content[0].text
        # Handle the tool call and feed the observation back in
        tool_call = next(b for b in response.content if b.type == "tool_use")
        observation = execute_tool(tool_call.name, tool_call.input)
        messages += [
            {"role": "assistant", "content": response.content},
            {"role": "user", "content": [
                {"type": "tool_result", "tool_use_id": tool_call.id,
                 "content": observation},
            ]},
        ]
    return "Max steps reached"
```

The loop is the plan. No overhead. Audit the messages list and you have the full reasoning trace.
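The loop above leans on an execute_tool helper it never defines. A minimal sketch of such a dispatcher, assuming a registry keyed by tool name (search_web is stubbed here; a real implementation would call an actual search API), might look like this:

```python
# Stub standing in for a real search backend.
def search_web(query: str) -> str:
    return f"Results for: {query}"

# Hypothetical registry mapping tool names to handlers.
TOOL_HANDLERS = {"search_web": lambda args: search_web(**args)}

def execute_tool(name: str, args: dict) -> str:
    """Dispatch a tool call; return errors as text so the model can react."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(args)
    except Exception as exc:
        return f"Error running {name}: {exc}"
```

Returning errors as strings rather than raising keeps the ReAct loop alive: the model sees the failure as an observation and can try a different step.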
When to Commit to Explicit Planning
Explicit planning pays when:
- Steps are long-running — you want to know what the agent intends before it starts an hour-long job.
- Steps can be parallelized — a planner can identify which sub-tasks are independent and fan them out.
- Users need approval — show them the plan, get sign-off, then execute. This is the human-in-the-loop gate.
- Tasks have hard ordering constraints — the planner encodes the dependency graph, not the agent's in-context reasoning.
```python
from pydantic import BaseModel
import json

class PlanStep(BaseModel):
    step_id: str
    description: str
    tool: str
    args: dict
    depends_on: list[str] = []

class Plan(BaseModel):
    goal: str
    steps: list[PlanStep]

def generate_plan(goal: str, llm_client) -> Plan:
    prompt = f"""
    Generate a step-by-step plan to: {goal}

    Return JSON matching this schema:
    {{
      "goal": "...",
      "steps": [
        {{"step_id": "s1", "description": "...", "tool": "...",
          "args": {{}}, "depends_on": []}}
      ]
    }}

    Only include tools: search_web, write_file, read_file, send_email.
    """
    resp = llm_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return Plan(**json.loads(resp.choices[0].message.content))

def execute_plan(plan: Plan) -> dict[str, str]:
    completed: dict[str, str] = {}
    # Topological execution — respect depends_on
    remaining = list(plan.steps)
    while remaining:
        ready = [s for s in remaining
                 if all(dep in completed for dep in s.depends_on)]
        if not ready:
            raise RuntimeError("Dependency cycle detected")
        for step in ready:
            completed[step.step_id] = execute_tool(step.tool, step.args)
            remaining.remove(step)
    return completed
```

The depends_on field is doing the work planners usually leave to implicit prompt reasoning. Make it explicit and you get parallelism for free.
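To cash in that parallelism, note that every step in a ready batch is independent of the others by construction, so the executor can fan the batch out. A sketch using a thread pool — execute_tool is stubbed here so the snippet runs standalone, and the plan is duck-typed (anything with .steps carrying step_id, tool, args, depends_on works):

```python
from concurrent.futures import ThreadPoolExecutor

def execute_tool(tool: str, args: dict) -> str:
    return f"{tool} done"  # stub standing in for the real dispatcher

def execute_plan_parallel(plan) -> dict[str, str]:
    completed: dict[str, str] = {}
    remaining = list(plan.steps)
    with ThreadPoolExecutor() as pool:
        while remaining:
            ready = [s for s in remaining
                     if all(dep in completed for dep in s.depends_on)]
            if not ready:
                raise RuntimeError("Dependency cycle detected")
            # Every step in `ready` is independent of the others: fan out.
            futures = {s.step_id: pool.submit(execute_tool, s.tool, s.args)
                       for s in ready}
            for step_id, fut in futures.items():
                completed[step_id] = fut.result()
            done = {s.step_id for s in ready}
            remaining = [s for s in remaining if s.step_id not in done]
    return completed
```

Each while-iteration runs one "level" of the dependency graph concurrently, so total latency is bounded by the longest dependency chain rather than the step count.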
The Hybrid: Adaptive Planning
The practical answer for production systems is adaptive planning: start with a lightweight ReAct loop, detect when the task is getting complex (step count, tool diversity, user confirmation needed), and escalate to a planner mid-flight.
```python
def adaptive_agent(task: str, complexity_threshold: int = 4) -> str:
    steps_taken = 0
    messages = [{"role": "user", "content": task}]

    # Phase 1: optimistic ReAct
    while steps_taken < complexity_threshold:
        result, tool_used = react_step(messages)
        steps_taken += 1
        if not tool_used:
            return result  # done early, great

    # Phase 2: escalate to planner
    plan = generate_plan(task, llm_client=openai_client)
    return execute_plan(plan)
```

This keeps fast tasks fast and only pays the planning tax for tasks that need it.
Failure Modes to Watch
Stale plans: The environment changes after planning. The executor hits an error, but there is no feedback loop to the planner. Mitigation: checkpoint after each step and re-plan if an executor step fails.
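The checkpoint-and-replan mitigation can be sketched as a loop. For testability this sketch takes the planner and executor as parameters (plan_fn and run_fn, standing in for generate_plan and execute_tool above) and runs steps in listed order, ignoring depends_on for brevity:

```python
def execute_with_replanning(goal: str, plan_fn, run_fn, max_replans: int = 2) -> dict:
    """plan_fn(goal) returns a Plan-like object with .steps;
    run_fn(tool, args) executes one step."""
    plan = plan_fn(goal)
    completed: dict = {}
    for _ in range(max_replans + 1):
        try:
            for step in plan.steps:
                if step.step_id in completed:
                    continue  # checkpoint: never redo finished work
                completed[step.step_id] = run_fn(step.tool, step.args)
            return completed
        except Exception as exc:
            # Feedback loop: tell the planner what finished and what broke.
            plan = plan_fn(
                f"{goal}\nAlready completed: {sorted(completed)}\nLast error: {exc}"
            )
    raise RuntimeError("Exhausted re-planning budget")
```

The two load-bearing details are the checkpoint dict, which prevents re-running completed steps after a re-plan, and folding the error into the new planning prompt so the planner sees why the last attempt died.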
Plan hallucination: The planner invents tools or args that do not exist. Mitigation: validate each PlanStep against a tool registry before execution starts.
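A registry check like this can run before the first step executes. The registry below maps each tool name to its required argument names — the names are taken from the planner prompt above, but the required-arg sets are illustrative assumptions:

```python
# Assumed registry: tool name -> required argument names.
TOOL_REGISTRY: dict[str, set[str]] = {
    "search_web": {"query"},
    "write_file": {"path", "content"},
    "read_file": {"path"},
    "send_email": {"to", "subject", "body"},
}

def validate_plan(plan) -> list[str]:
    """Return human-readable problems; an empty list means the plan is safe to run."""
    errors: list[str] = []
    known_ids = {s.step_id for s in plan.steps}
    for step in plan.steps:
        if step.tool not in TOOL_REGISTRY:
            errors.append(f"{step.step_id}: unknown tool '{step.tool}'")
        else:
            missing = TOOL_REGISTRY[step.tool] - set(step.args)
            if missing:
                errors.append(f"{step.step_id}: missing args {sorted(missing)}")
        for dep in step.depends_on:
            if dep not in known_ids:
                errors.append(f"{step.step_id}: unknown dependency '{dep}'")
    return errors
```

Checking depends_on targets here too catches a second flavor of hallucination: plans that reference step IDs that were never generated.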
Over-planning: A 15-step plan for a task that two ReAct iterations would handle. Mitigation: complexity scoring before routing to the planner.
Key Takeaways
- ReAct is the right default — it is cheaper, faster, and self-documenting through the message trace.
- Explicit planning pays for tasks with 10+ steps, parallelizable sub-tasks, or mandatory human approval before execution.
- Encode dependencies explicitly with a depends_on field; do not leave ordering to in-context reasoning.
- Adaptive routing (ReAct first, escalate on complexity) gives you the best of both modes without overengineering short tasks.
- Validate plans against a tool registry before execution to catch hallucinated tool calls at planning time, not mid-run.
- Stale plan handling requires a feedback loop from executor back to planner — if you skip this, your Plan-and-Execute agent is fragile.