ReAct — the Reasoning and Acting pattern from the 2022 Princeton paper — became the de facto template for AI agent systems almost overnight. The pattern is elegant: interleave reasoning traces with action calls, letting the agent explain its logic before each tool invocation. Implementations flooded GitHub. Frameworks were built around it.
Three years later, ReAct is still widely used. It is also increasingly understood as one pattern among several, each with distinct strengths, failure modes, and appropriate use cases. The most capable agent systems in production today do not run a single reasoning pattern — they combine multiple patterns depending on the structure of the task at hand.
Here is a systematic look at the five patterns shaping current agent design.
Pattern 1: Tool Use
Tool use is the foundational primitive on which all other agent patterns are built. The model generates a structured call to an external function — a web search, a file read, a database query, an API request — and receives a structured response that is folded back into the context.
The key design decisions in tool use systems are:
Tool schema quality. The model's ability to use a tool correctly is almost entirely determined by how well the tool is described. Ambiguous parameter names, missing constraints, and poorly documented return formats all degrade tool call accuracy.
Error handling discipline. Tool calls fail. The model needs to receive meaningful error messages and have the judgment to distinguish between "retry with different parameters," "this tool cannot help with this task," and "the task itself is invalid."
Tool breadth vs. depth. A model with access to many narrow tools is often more capable on diverse tasks than a model with deep access to one tool category. Breadth enables cross-domain tasks; depth enables precision on specialized tasks. The best configurations offer both.
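To make the schema-quality point concrete, here is a minimal sketch contrasting an ambiguous tool description with a well-specified one. The tool names, parameters, and JSON-Schema-style layout are hypothetical, not from any particular framework:

```python
# A hypothetical search tool described two ways. The second schema gives the
# model the parameter names, constraints, and defaults it needs to call the
# tool correctly; the first leaves all of that to guesswork.
vague_schema = {
    "name": "search",
    "parameters": {"q": {"type": "string"}},
}

precise_schema = {
    "name": "search_product_catalog",
    "description": (
        "Full-text search over the product catalog. "
        "Returns at most `limit` results, newest first."
    ),
    "parameters": {
        "query": {
            "type": "string",
            "description": "Keywords matched against product names and tags.",
        },
        "limit": {
            "type": "integer",
            "minimum": 1,
            "maximum": 50,
            "default": 10,
            "description": "Maximum number of results to return.",
        },
    },
    "required": ["query"],
}
```

Everything in the second schema (descriptions, a bounded integer, an explicit required list) is information the model would otherwise have to infer, which is where most tool-call errors originate.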
MCP (Model Context Protocol) addresses the breadth problem at scale. By standardizing the interface between models and tools, MCP allows tool collections to grow modularly — new tools become available to all MCP-compatible agents without any changes to the agent itself. This is why Neumar's integration with 10,000+ MCP skills is genuinely meaningful: it is not just a catalog, it is a structural capability advantage on heterogeneous tasks.
Pattern 2: ReAct (Reasoning + Acting)
ReAct combines a reasoning step ("I need to check the current version of this dependency before suggesting an upgrade") with an action step (a tool call to read package.json). The reasoning trace serves multiple functions: it helps the model maintain task context across multiple steps, and it makes the agent's behavior interpretable to human observers.
ReAct performs well on tasks where:
- The next action is heavily dependent on the result of the previous action
- The task structure is relatively linear (a clear sequence of dependent steps)
- Interpretability is important for debugging or user trust
ReAct struggles when:
- Tasks have significant parallel structure (independent subtasks that could run concurrently)
- The plan needs to change substantially based on early results (ReAct's reasoning traces are local, not strategic)
- The task is very long and early context gets pushed out of the window
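The interleaved thought/action/observation loop can be sketched in a few lines. This is a toy illustration, not a production agent: `llm` and `tools` are placeholders for a real model client and tool registry, and the prompt format is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class ReActAgent:
    """Minimal ReAct loop: alternate a reasoning trace with a tool call
    until the model emits a step with no action (its final answer)."""
    llm: callable   # prompt -> {"thought": str, "action": str | None, "input": str}
    tools: dict     # tool name -> callable
    history: list = field(default_factory=list)

    def run(self, task: str, max_steps: int = 8) -> str:
        prompt = task
        for _ in range(max_steps):
            step = self.llm(prompt)
            self.history.append(step)      # the trace stays inspectable
            if step["action"] is None:     # model decided it is done
                return step["thought"]
            observation = self.tools[step["action"]](step["input"])
            prompt += (                    # fold the observation back in
                f"\nThought: {step['thought']}"
                f"\nAction: {step['action']}[{step['input']}]"
                f"\nObservation: {observation}"
            )
        return "max steps exceeded"
```

Note how the loop makes the failure modes above visible: every step depends on the previous observation (linear structure), and the growing `prompt` string is exactly the context that eventually overflows on long tasks.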
Pattern 3: Reflection
Reflection adds a self-evaluation loop: after completing a step or a full task, the agent generates a critique of its own output and optionally revises. The critique might identify factual errors, logical inconsistencies, missed edge cases, or better approaches.
Reflexion (the 2023 Shinn et al. paper) formalized this into a three-phase loop: Act, Evaluate, Reflect — where the reflection from one episode is carried forward as a verbal memory to inform the next attempt.
Practical reflection implementations face a tension: thorough self-critique takes tokens and adds latency, but shallow self-critique adds noise without improving output quality. The models that benefit most from reflection are those capable of genuinely identifying their own errors — which tends to correlate with overall model capability. On weaker models, reflection loops often just add confident-sounding restatements of the original output.
For production agent systems, reflection is most valuable as an optional post-hoc quality gate rather than a mandatory step in every execution path.
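A reflection gate of that kind can be sketched as a small loop. The shape of `critique_fn` (returning a list of concrete issues, empty meaning pass) is an assumption for illustration; both callables stand in for model calls:

```python
def reflection_gate(draft, critique_fn, revise_fn, max_rounds=2):
    """Post-hoc reflection: critique the draft and revise only when the
    critique finds concrete problems. critique_fn returns a list of
    issues (empty list = pass); revise_fn returns an improved draft."""
    for _ in range(max_rounds):
        issues = critique_fn(draft)
        if not issues:       # nothing concrete to fix: pass through
            return draft
        draft = revise_fn(draft, issues)
    return draft             # revision budget exhausted; return best effort
</i>```

The empty-list fast path is the important design choice: it keeps the gate optional in practice, so outputs that already pass incur one critique call rather than a mandatory rewrite.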
Pattern 4: Planning
Planning-first patterns separate task decomposition from task execution. The agent begins by generating a structured representation of the task — often a directed graph or ordered list of subtasks with dependency relationships — before making any tool calls.
This is a qualitative shift from ReAct. In ReAct, the "plan" is implicit in the reasoning traces. In explicit planning systems, the plan is a first-class artifact that can be inspected, modified, and replanned independently of execution.
The planning pattern outperforms ReAct on tasks where:
- Early actions have irreversible consequences (submitting a form, sending a message, making an API call that incurs cost)
- The task structure is complex enough that local step-by-step reasoning produces suboptimal global decisions
- User oversight is important — inspecting a plan before execution is much more tractable than monitoring a ReAct trace in real time
This is the pattern at the core of Neumar's two-phase execution model. The planning phase produces a structured task graph. The execution phase works through the graph, with the ability to replan when execution diverges from the plan. Users can inspect and modify the plan before committing to execution — a property that is particularly valuable for tasks with external side effects.
Planning and the Horizon Problem
One nuance of planning-first systems is that plan quality degrades as task horizon extends. A plan for a two-step task is almost always better than executing those steps reactively. A plan for a 50-step task is speculative at best — too many unknowns will have materialized by step 25 for the original plan to remain optimal.
Well-designed planning systems handle this by treating the plan as a living document with scheduled replanning checkpoints, not a fixed script. The plan informs execution without constraining recovery.
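A plan-as-artifact can be as simple as a dependency mapping. The subtask names below are hypothetical; the point is that the graph exists before execution, can be shown to a user, and can be validated (e.g. for cycles) before any irreversible action runs:

```python
from graphlib import TopologicalSorter

# A hypothetical plan as a dependency graph: each subtask maps to the
# set of subtasks that must complete before it.
plan = {
    "research":       set(),
    "implementation": {"research"},
    "tests":          {"implementation"},
    "pr":             {"implementation", "tests"},
}

def execution_order(plan):
    """Linearize the plan into an executable order. TopologicalSorter
    also rejects cyclic plans, catching malformed decompositions before
    execution starts rather than mid-run."""
    return list(TopologicalSorter(plan).static_order())
```

Replanning then becomes an ordinary data operation: when execution diverges, the remaining entries of the dict are rewritten and re-linearized, rather than the whole run being restarted.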
Pattern 5: Multi-Agent Collaboration
Multi-agent patterns distribute a task across multiple specialized agents that communicate through structured interfaces. This is the pattern CoAct and similar systems formalize — a global planner delegates to local executors, with information flowing between levels as tasks are assigned and results are returned.
Multi-agent collaboration unlocks capabilities that are structurally unavailable to single-agent systems:
- Parallelism: independent subtasks run on separate agents concurrently
- Specialization: different agents are optimized for different task types
- Scale: tasks that exceed a single context window are handled by distributing context across agents
The coordination overhead is real. Multi-agent systems require careful design of agent interfaces, message schemas, and failure propagation logic. But for tasks that are genuinely too large or too heterogeneous for a single-agent approach, the coordination cost is well worth paying.
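The parallelism and specialization points can be shown in a minimal fan-out sketch. The `agents` registry here maps a task type to a plain callable standing in for a full agent; a real system would add message schemas and failure propagation on top of this skeleton:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(subtasks, agents):
    """Dispatch independent subtasks to specialized agents concurrently
    and collect results keyed by subtask name."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(agents[task["type"]], task["payload"])
            for name, task in subtasks.items()
        }
        # .result() re-raises any agent exception here, which is the
        # simplest possible failure-propagation policy.
        return {name: f.result() for name, f in futures.items()}
```

Even this toy version surfaces the coordination questions the paragraph above describes: what the message schema is (`type`/`payload` is an assumption), and what happens when one agent fails while others succeed.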
How These Patterns Compose
In practice, the most capable agent systems combine multiple patterns within a single execution:
- A planning agent decomposes the task and produces a structured task graph
- Subtasks are routed through an agent registry to specialized execution agents (multi-agent)
- Execution agents use tool use with a broad MCP tool set to carry out their subtasks
- Each execution agent uses ReAct at the step level to sequence tool calls
- A reflection gate evaluates outputs before returning results to the planner
This is not a hypothetical architecture — it describes how Neumar's more complex workflows operate, particularly the Linear ticket-to-PR pipeline, where a top-level planning agent decomposes the ticket into subtasks (research, implementation, test writing, PR creation), routes each to appropriate execution agents, and validates outputs before composing the final deliverable.
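One way such a composition fits together can be sketched as a short control loop. This is an illustrative skeleton under assumed interfaces, not Neumar's implementation: every callable stands in for a model-backed component, and the subtasks are executed sequentially for clarity where a real system would exploit the graph's parallelism:

```python
def run_composed(task, planner, registry, reflect):
    """Compose the patterns: plan, route each subtask to a specialized
    agent (which internally runs its own tool-use / ReAct loop), then
    gate each output through reflection before assembling results."""
    plan = planner(task)                        # planning: structured subtasks
    results = {}
    for sub in plan:                            # multi-agent routing
        agent = registry[sub["kind"]]
        output = agent(sub, results)            # agent sees prior results
        results[sub["name"]] = reflect(output)  # reflection quality gate
    return results
```

Passing `results` into each agent is the information flow between levels: downstream agents (test writing, PR creation) read the outputs of upstream ones.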
Choosing the Right Pattern
The practical decision framework comes down to task structure:
| Task Property | Recommended Pattern |
|---|---|
| Single tool call, well-specified | Tool use only |
| Sequential, dependent steps | ReAct |
| Irreversible actions, user oversight needed | Planning-first |
| Output quality critical | Reflection gate |
| Parallel structure, large scope | Multi-agent |
| All of the above | Composed system |
The worst outcome is not picking the wrong pattern — it is assuming that one pattern handles everything. The research on agent capability is unambiguous: task-appropriate reasoning architecture matters as much as model quality for production performance.
References
- Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (Princeton, 2022)
- Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
- Chen et al., "CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration" (CMU, 2024)
- Anthropic, "Building effective agents"
