ReAct — the Reasoning and Acting pattern from the 2022 Princeton paper — became the de facto template for AI agent systems almost overnight. The pattern is elegant: interleave reasoning traces with action calls, letting the agent explain its logic before each tool invocation. Implementations flooded GitHub. Frameworks were built around it.
Three years later, ReAct is still widely used. It is also increasingly understood as one pattern among several, each with distinct strengths, failure modes, and appropriate use cases. The most capable agent systems in production today do not run a single reasoning pattern — they combine multiple patterns depending on the structure of the task at hand.
Here is a systematic look at the five patterns shaping current agent design.
Pattern 1: Tool Use
Tool use is the foundational primitive on which all other agent patterns are built. The model generates a structured call to an external function — a web search, a file read, a database query, an API request — and receives a structured response that is folded back into the context.
The key design decisions in tool use systems are:
Tool schema quality. The model's ability to use a tool correctly is almost entirely determined by how well the tool is described. Ambiguous parameter names, missing constraints, and poorly documented return formats all degrade tool call accuracy.
Error handling discipline. Tool calls fail. The model needs to receive meaningful error messages and have the judgment to distinguish between "retry with different parameters," "this tool cannot help with this task," and "the task itself is invalid."
Tool breadth vs. depth. A model with access to many narrow tools is often more capable on diverse tasks than a model with deep access to one tool category. Breadth enables cross-domain tasks; depth enables precision on specialized tasks. The best configurations offer both.
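To make the schema-quality point concrete, here is a minimal sketch contrasting an ambiguous tool description with a well-specified one. The tool names, parameters, and JSON-Schema-style layout are hypothetical, not from any particular framework:

```python
# A hypothetical search tool described two ways. The second schema gives the
# model the parameter names, constraints, and defaults it needs to call the
# tool correctly; the first leaves all of that to guesswork.
vague_schema = {
    "name": "search",
    "parameters": {"q": {"type": "string"}},
}

precise_schema = {
    "name": "search_product_catalog",
    "description": (
        "Full-text search over the product catalog. "
        "Returns at most `limit` results, newest first."
    ),
    "parameters": {
        "query": {
            "type": "string",
            "description": "Keywords matched against product names and tags.",
        },
        "limit": {
            "type": "integer",
            "minimum": 1,
            "maximum": 50,
            "default": 10,
            "description": "Maximum number of results to return.",
        },
    },
    "required": ["query"],
}
```

Everything in the second schema (descriptions, a bounded integer, an explicit required list) is information the model would otherwise have to infer, which is where most tool-call errors originate.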
MCP (Model Context Protocol) addresses the breadth problem at scale. By standardizing the interface between models and tools, MCP allows tool collections to grow modularly — new tools become available to all MCP-compatible agents without any changes to the agent itself. This is why Neumar's integration with 10,000+ MCP skills is genuinely meaningful: it is not just a catalog, it is a structural capability advantage on heterogeneous tasks.
Pattern 2: ReAct (Reasoning + Acting)
ReAct combines a reasoning step ("I need to check the current version of this dependency before suggesting an upgrade") with an action step (a tool call to read package.json). The reasoning trace serves multiple functions: it helps the model maintain task context across multiple steps, and it makes the agent's behavior interpretable to human observers.
ReAct performs well on tasks where:
- The next action is heavily dependent on the result of the previous action
- The task structure is relatively linear (a clear sequence of dependent steps)
- Interpretability is important for debugging or user trust
ReAct struggles when:
- Tasks have significant parallel structure (independent subtasks that could run concurrently)
- The plan needs to change substantially based on early results (ReAct's reasoning traces are local, not strategic)
- The task is very long and early context gets pushed out of the window
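The interleaved thought/action/observation loop can be sketched in a few lines. This is a toy illustration, not a production agent: `llm` and `tools` are placeholders for a real model client and tool registry, and the prompt format is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class ReActAgent:
    """Minimal ReAct loop: alternate a reasoning trace with a tool call
    until the model emits a step with no action (its final answer)."""
    llm: callable   # prompt -> {"thought": str, "action": str | None, "input": str}
    tools: dict     # tool name -> callable
    history: list = field(default_factory=list)

    def run(self, task: str, max_steps: int = 8) -> str:
        prompt = task
        for _ in range(max_steps):
            step = self.llm(prompt)
            self.history.append(step)      # the trace stays inspectable
            if step["action"] is None:     # model decided it is done
                return step["thought"]
            observation = self.tools[step["action"]](step["input"])
            prompt += (                    # fold the observation back in
                f"\nThought: {step['thought']}"
                f"\nAction: {step['action']}[{step['input']}]"
                f"\nObservation: {observation}"
            )
        return "max steps exceeded"
```

Note how the loop makes the failure modes above visible: every step depends on the previous observation (linear structure), and the growing `prompt` string is exactly the context that eventually overflows on long tasks.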
Pattern 3: Reflection
Reflection adds a self-evaluation loop: after completing a step or a full task, the agent generates a critique of its own output and optionally revises. The critique might identify factual errors, logical inconsistencies, missed edge cases, or better approaches.
Reflexion (the 2023 Shinn et al. paper) formalized this into a three-phase loop: Act, Evaluate, Reflect — where the reflection from one episode is carried forward as a verbal memory to inform the next attempt.
Practical reflection implementations face a tension: thorough self-critique takes tokens and adds latency, but shallow self-critique adds noise without improving output quality. The models that benefit most from reflection are those capable of genuinely identifying their own errors — which tends to correlate with overall model capability. On weaker models, reflection loops often just add confident-sounding restatements of the original output.
For production agent systems, reflection is most valuable as an optional post-hoc quality gate rather than a mandatory step in every execution path.
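A reflection gate of that kind can be sketched as a small loop. The shape of `critique_fn` (returning a list of concrete issues, empty meaning pass) is an assumption for illustration; both callables stand in for model calls:

```python
def reflection_gate(draft, critique_fn, revise_fn, max_rounds=2):
    """Post-hoc reflection: critique the draft and revise only when the
    critique finds concrete problems. critique_fn returns a list of
    issues (empty list = pass); revise_fn returns an improved draft."""
    for _ in range(max_rounds):
        issues = critique_fn(draft)
        if not issues:       # nothing concrete to fix: pass through
            return draft
        draft = revise_fn(draft, issues)
    return draft             # revision budget exhausted; return best effort
</i>```

The empty-list fast path is the important design choice: it keeps the gate optional in practice, so outputs that already pass incur one critique call rather than a mandatory rewrite.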
Pattern 4: Planning
Planning-first patterns separate task decomposition from task execution. The agent begins by generating a structured representation of the task — often a directed graph or ordered list of subtasks with dependency relationships — before making any tool calls.
This is a qualitative shift from ReAct. In ReAct, the "plan" is implicit in the reasoning traces. In explicit planning systems, the plan is a first-class artifact that can be inspected, modified, and replanned independently of execution.
The planning pattern outperforms ReAct on tasks where:
- Early actions have irreversible consequences (submitting a form, sending a message, making an API call that incurs cost)
- The task structure is complex enough that local step-by-step reasoning produces suboptimal global decisions
- User oversight is important — inspecting a plan before execution is much more tractable than monitoring a ReAct trace in real time
This is the pattern at the core of Neumar's two-phase execution model. The planning phase produces a structured task graph. The execution phase works through the graph, with the ability to replan when execution diverges from the plan. Users can inspect and modify the plan before committing to execution — a property that is particularly valuable for tasks with external side effects.
Planning and the Horizon Problem
One nuance of planning-first systems is that plan quality degrades as task horizon extends. A plan for a two-step task is almost always better than executing those steps reactively. A plan for a 50-step task is speculative at best — too many unknowns will have materialized by step 25 for the original plan to remain optimal.
Well-designed planning systems handle this by treating the plan as a living document with scheduled replanning checkpoints, not a fixed script. The plan informs execution without constraining recovery.
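A plan-as-artifact can be as simple as a dependency mapping. The subtask names below are hypothetical; the point is that the graph exists before execution, can be shown to a user, and can be validated (e.g. for cycles) before any irreversible action runs:

```python
from graphlib import TopologicalSorter

# A hypothetical plan as a dependency graph: each subtask maps to the
# set of subtasks that must complete before it.
plan = {
    "research":       set(),
    "implementation": {"research"},
    "tests":          {"implementation"},
    "pr":             {"implementation", "tests"},
}

def execution_order(plan):
    """Linearize the plan into an executable order. TopologicalSorter
    also rejects cyclic plans, catching malformed decompositions before
    execution starts rather than mid-run."""
    return list(TopologicalSorter(plan).static_order())
```

Replanning then becomes an ordinary data operation: when execution diverges, the remaining entries of the dict are rewritten and re-linearized, rather than the whole run being restarted.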
Pattern 5: Multi-Agent Collaboration
Multi-agent patterns distribute a task across multiple specialized agents that communicate through structured interfaces. This is the pattern CoAct and similar systems formalize — a global planner delegates to local executors, with information flowing between levels as tasks are assigned and results are returned.
Multi-agent collaboration unlocks capabilities that are structurally unavailable to single-agent systems:
- Parallelism: independent subtasks run on separate agents concurrently
- Specialization: different agents are optimized for different task types
- Scale: tasks that exceed a single context window are handled by distributing context across agents
The coordination overhead is real. Multi-agent systems require careful design of agent interfaces, message schemas, and failure propagation logic. But for tasks that are genuinely too large or too heterogeneous for a single-agent approach, the coordination cost is well worth paying.
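The parallelism and specialization points can be shown in a minimal fan-out sketch. The `agents` registry here maps a task type to a plain callable standing in for a full agent; a real system would add message schemas and failure propagation on top of this skeleton:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(subtasks, agents):
    """Dispatch independent subtasks to specialized agents concurrently
    and collect results keyed by subtask name."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(agents[task["type"]], task["payload"])
            for name, task in subtasks.items()
        }
        # .result() re-raises any agent exception here, which is the
        # simplest possible failure-propagation policy.
        return {name: f.result() for name, f in futures.items()}
```

Even this toy version surfaces the coordination questions the paragraph above describes: what the message schema is (`type`/`payload` is an assumption), and what happens when one agent fails while others succeed.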
How These Patterns Compose
In practice, the most capable agent systems combine multiple patterns within a single execution:
- A planning agent decomposes the task and produces a structured task graph
- Subtasks are routed through an agent registry to specialized execution agents (multi-agent)
- Execution agents use tool use with a broad MCP tool set to carry out their subtasks
- Each execution agent uses ReAct at the step level to sequence tool calls
- A reflection gate evaluates outputs before returning results to the planner
This is not a hypothetical architecture — it describes how Neumar's more complex workflows operate, particularly the Linear ticket-to-PR pipeline, where a top-level planning agent decomposes the ticket into subtasks (research, implementation, test writing, PR creation), routes each to appropriate execution agents, and validates outputs before composing the final deliverable.
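One way such a composition fits together can be sketched as a short control loop. This is an illustrative skeleton under assumed interfaces, not Neumar's implementation: every callable stands in for a model-backed component, and the subtasks are executed sequentially for clarity where a real system would exploit the graph's parallelism:

```python
def run_composed(task, planner, registry, reflect):
    """Compose the patterns: plan, route each subtask to a specialized
    agent (which internally runs its own tool-use / ReAct loop), then
    gate each output through reflection before assembling results."""
    plan = planner(task)                        # planning: structured subtasks
    results = {}
    for sub in plan:                            # multi-agent routing
        agent = registry[sub["kind"]]
        output = agent(sub, results)            # agent sees prior results
        results[sub["name"]] = reflect(output)  # reflection quality gate
    return results
```

Passing `results` into each agent is the information flow between levels: downstream agents (test writing, PR creation) read the outputs of upstream ones.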
Choosing the Right Pattern
The practical decision framework comes down to task structure:
| Task Property | Recommended Pattern |
|---|---|
| Single tool call, well-specified | Tool use only |
| Sequential, dependent steps | ReAct |
| Irreversible actions, user oversight needed | Planning-first |
| Output quality critical | Reflection gate |
| Parallel structure, large scope | Multi-agent |
| All of the above | Composed system |
The worst outcome is not picking the wrong pattern — it is assuming that one pattern handles everything. The research on agent capability is unambiguous: task-appropriate reasoning architecture matters as much as model quality for production performance.
References
- Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models" (Princeton, 2022)
- Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning" (2023)
- Chen et al., "CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration" (CMU, 2024)
- Anthropic, "Building effective agents"
