Claude Code reached a milestone in early 2026 that deserves attention: it now authors approximately 135,000 public GitHub commits per day, roughly 4% of all public commits. Anthropic projects this will reach 20% by year-end. Daily installs grew from 17.7 million to 29 million on a 30-day moving average. The revenue run-rate hit an estimated $2.5 billion, up from $500 million in September 2025.
These numbers matter because they are not driven by autocomplete volume or suggestion acceptance rates. Claude Code operates as an autonomous agent that plans, executes, and verifies multi-step changes. Each commit represents a completed task, not a suggestion. Understanding how the agent loop works — and why it works differently from what you might expect — reveals decisions that are relevant to anyone building agent systems.
The Surprising Simplicity of the Master Loop
The core of Claude Code is internally codenamed "nO" — a straightforward while(tool_call) loop. There is no classifier. No RAG pipeline. No DAG orchestrator. No task planner separate from the executor. The model itself makes every decision: what to read, what to search, what to edit, when to run tests, when to stop.
```
while (true) {
  response = model.generate(messages, tools)
  if (response.has_tool_calls) {
    for (call of response.tool_calls) {
      result = execute_tool(call)
      messages.append(tool_result(result))
    }
  } else {
    // No tool calls: the agent considers the task done
    break
  }
}
```
This is not a simplification for explanation purposes. The actual architecture is this simple at the orchestration level. The complexity lives in two places: the tool implementations (which provide rich capabilities) and the model's reasoning (which decides how to use them).
Anthropic's engineering team arrived at this design after experimenting with more complex architectures — separate planning phases, routing classifiers, retrieval-augmented generation for codebase context — and finding that the simpler approach produced better results. The model's own reasoning, particularly with extended thinking enabled, handles planning and context management more effectively than external orchestration logic.
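To make the shape concrete, here is a minimal runnable sketch of that loop, assuming a chat-style API. The model and tool executor are stubs, and every name (`run_agent`, `stub_model`, the message dict shape) is hypothetical, not Claude Code's actual interface:

```python
# Minimal sketch of a while(tool_call)-style agent loop.
# The model is a plain function here; a real system would call an LLM API.

def run_agent(model, tools, messages):
    """Loop until the model responds without requesting any tool calls."""
    while True:
        response = model(messages, tools)
        calls = response.get("tool_calls", [])
        if not calls:
            # No tool calls: the agent considers the task done.
            return response["content"]
        for call in calls:
            result = tools[call["name"]](**call["args"])
            # Feed each tool result back into the conversation.
            messages.append({"role": "tool", "name": call["name"], "content": result})

# Stub model: asks to read a file once, then declares the task finished.
def stub_model(messages, tools):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [{"name": "read", "args": {"path": "main.py"}}]}
    return {"tool_calls": [], "content": "done"}

tools = {"read": lambda path: f"<contents of {path}>"}
print(run_agent(stub_model, tools, [{"role": "user", "content": "fix the bug"}]))  # → done
```

All of the interesting behavior lives in what the model returns and what the tools do; the orchestrator itself never grows beyond this.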
Three-Phase Execution Pattern
While the loop is simple, the model's behavior within the loop follows a consistent three-phase pattern:
Phase 1: Context Gathering. The model reads files, searches the codebase, examines git history, and builds an understanding of the relevant code. This phase uses tools like Read, Glob, Grep, and Bash (for git commands). The model decides what to read based on the task description and what it discovers as it reads.
Phase 2: Action. The model makes changes — editing files, writing new files, running shell commands. It may make multiple rounds of changes within this phase, iterating as it encounters issues.
Phase 3: Verification. The model runs tests, type checkers, linters, or other validation tools to confirm the changes work. If verification fails, it loops back to Phase 2 to fix issues, potentially multiple times.
The key insight is that these phases are not implemented as separate stages in the orchestrator. They emerge from the model's reasoning. The model naturally gathers context before acting and verifies after acting because the system prompt and extended thinking encourage this pattern. Attempting to enforce these phases externally — with hard transitions between planning and execution modes — produced worse results in Anthropic's internal benchmarks.
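The emergent phases are visible if you label a tool-call trace after the fact. The sketch below does exactly that with a hypothetical, purely illustrative mapping (the tool names match Claude Code's, but the classification heuristic is invented for this example):

```python
# Illustrative only: tag each tool call in a trace with the phase it most
# likely belongs to. The mapping is a guess, not Claude Code internals.

PHASES = {
    "Read": "context", "Glob": "context", "Grep": "context",
    "Edit": "action", "Write": "action",
}

def tag_trace(trace):
    """Label (tool, argument) pairs; Bash serves several phases."""
    tags = []
    for tool, arg in trace:
        if tool == "Bash" and ("test" in arg or "lint" in arg):
            tags.append("verification")   # running checks
        elif tool == "Bash" and arg.startswith("git"):
            tags.append("context")        # inspecting history
        else:
            tags.append(PHASES.get(tool, "action"))
    return tags

trace = [("Read", "auth.py"), ("Grep", "handleAuth"),
         ("Edit", "auth.py"), ("Bash", "pytest tests/")]
print(tag_trace(trace))  # → ['context', 'context', 'action', 'verification']
```

Nothing enforces this ordering; a real trace can interleave phases freely, which is exactly why hard-coded stage transitions underperform.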
Context Compaction: Enabling Infinite Conversations
One of Claude Code's most technically interesting components is the compressor, internally codenamed "wU2." When a conversation approaches the context window limit, the compressor summarizes older messages while preserving critical information — file paths mentioned, decisions made, errors encountered, and the current state of the task.
This enables effectively unbounded conversations. A developer can start a task, work through multiple iterations, encounter and resolve errors, and continue building on the accumulated context without hitting a wall. The compressor's summarization preserves the information the model needs to continue working effectively while freeing token budget for new content.
With the March 2026 release of 1M token context at standard pricing for Opus 4.6, the compressor activates less frequently — but it remains essential for long, complex tasks that generate substantial tool output.
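The trigger-and-summarize shape can be sketched as follows. This is an assumption-laden toy: the 4-characters-per-token estimate, the regex extraction, and the string summary all stand in for the real compressor's model-driven summarization:

```python
# Toy compactor: when estimated tokens exceed a budget, fold older messages
# into a summary that keeps file paths and error lines. Crude stand-ins for
# what a real, model-driven compressor would do.

import re

def estimate_tokens(messages):
    # Rough heuristic: ~4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def compact(messages, budget, keep_recent=2):
    if estimate_tokens(messages) <= budget:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Preserve critical details from the summarized span.
    paths, errors = set(), []
    for m in old:
        paths.update(re.findall(r"\S+\.\w{1,4}", m["content"]))
        errors += [l for l in m["content"].splitlines() if "Error" in l]
    summary = ("Summary of earlier work. Files touched: "
               + ", ".join(sorted(paths))
               + ". Errors seen: " + ("; ".join(errors) or "none") + ".")
    return [{"role": "system", "content": summary}] + recent

msgs = [
    {"role": "tool", "content": "read src/auth.py\n" + "x" * 400},
    {"role": "tool", "content": "TypeError: bad token in src/auth.py"},
    {"role": "assistant", "content": "Fixing now."},
    {"role": "user", "content": "Run the tests."},
]
compacted = compact(msgs, budget=50)
print(len(compacted))  # → 3 (one summary + two recent messages)
```

The essential property is the same as in the real system: recent messages survive verbatim, and the summary keeps the references the model will need to keep working.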
Agentic Search: Why Ripgrep Beats RAG
For codebase search, Claude Code uses what Anthropic calls "agentic search" — the model uses ripgrep (via the Grep tool) to search for patterns, reads the results, refines its search based on what it finds, and iterates. This is a multi-turn, model-directed search rather than a single-shot retrieval.
Anthropic benchmarked this approach against embedding-based RAG retrieval and found that agentic search produced superior results with lower complexity. The reasons are instructive:
Adaptive query refinement. The model can reformulate its search based on intermediate results. If a grep for `function handleAuth` finds nothing, the model can try the bare identifier `handleAuth`, then a looser pattern such as a case-insensitive `auth`, then search for the file that imports from the auth module. RAG systems perform a single embedding lookup and return whatever is closest in vector space.
No index maintenance. Agentic search works on the current state of the codebase without requiring a pre-built embedding index. This matters because codebases change frequently, and index staleness is a real problem in RAG-based code assistants.
Semantic understanding in the loop. The model can understand the semantic meaning of search results and decide what to search for next. A grep result that mentions a related function in a comment can lead the model to search for that function — a connection that embedding similarity would not reliably capture.
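The refinement loop is easy to illustrate. In the sketch below, a fixed fallback ladder of patterns stands in for the model's reasoning, and a tiny in-memory dict stands in for the codebase that ripgrep would scan (all names hypothetical):

```python
# Sketch of agentic search as iterative pattern refinement over an
# in-memory "codebase". A real system would drive ripgrep instead of re.

import re

CODEBASE = {
    "src/auth.ts": "export const handleAuth = (req) => verify(req.token)",
    "src/app.ts": "import { handleAuth } from './auth'",
}

def grep(pattern, flags=0):
    """Return (path, line) pairs whose line matches the pattern."""
    return [(path, line)
            for path, text in CODEBASE.items()
            for line in text.splitlines()
            if re.search(pattern, line, flags)]

def agentic_search(queries):
    """Try progressively looser queries until one of them hits."""
    for pattern, flags in queries:
        hits = grep(pattern, flags)
        if hits:
            return pattern, hits
    return None, []

# The exact phrase misses (arrow function, no `function` keyword), so the
# search loosens to the bare identifier, then a case-insensitive fallback.
pattern, hits = agentic_search([
    (r"function handleAuth", 0),
    (r"\bhandleAuth\b", 0),
    (r"auth", re.IGNORECASE),
])
print(pattern, len(hits))  # → \bhandleAuth\b 2
```

A single-shot retriever only gets the first attempt; the value of the loop is that a miss is information, not a dead end.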
Tool Search: Dynamic MCP Loading
Claude Code supports MCP servers for extensibility, and its approach to MCP tool management is worth examining. Rather than loading all available MCP tools into the context at the start of every conversation — which would consume substantial token budget — Claude Code uses a Tool Search mechanism that dynamically loads tool definitions on demand.
When the model needs a capability that is not in its current tool set, it can search for relevant tools by keyword. The matching tools' full schemas are loaded into context, making them available for subsequent calls. This reduced tokens spent on MCP tool definitions by up to 46.9% in Anthropic's benchmarks.
This is a practical solution to a scaling problem: as the MCP ecosystem grows (currently 10,000+ community servers), loading every available tool into context becomes infeasible. Dynamic loading allows the model to access a vast tool ecosystem while keeping the active context focused.
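The mechanism can be sketched as a keyword-indexed registry whose schemas stay out of context until a search matches them. The registry contents, tool names, and matching rule below are all hypothetical:

```python
# Sketch of dynamic tool loading: schemas live in a registry, outside the
# model's context, until a keyword search pulls them in. Names are invented.

REGISTRY = {
    "jira_create_issue": {"keywords": ["jira", "ticket", "issue"],
                          "schema": {"fields": ["title", "project"]}},
    "slack_post": {"keywords": ["slack", "message", "notify"],
                   "schema": {"fields": ["channel", "text"]}},
    "db_query": {"keywords": ["sql", "database", "query"],
                 "schema": {"fields": ["statement"]}},
}

def tool_search(query, active_tools):
    """Load full schemas for tools whose keywords match the query."""
    words = set(query.lower().split())
    for name, entry in REGISTRY.items():
        if words & set(entry["keywords"]):
            active_tools[name] = entry["schema"]  # now visible to the model
    return active_tools

active = {}  # the conversation starts with no MCP tool schemas in context
tool_search("file a jira ticket for the failing build", active)
print(sorted(active))  # → ['jira_create_issue']
```

Only the matched schema ever costs tokens; the other registry entries remain free until something asks for them.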
Performance Benchmarks
| Benchmark (Opus 4.6) | Score | Notes |
|---|---|---|
| SWE-bench Verified (autonomous) | 80.9% | State of the art |
| SWE-bench Pro | 59% | Leading |
| MRCR v2 (1M context) | 78.3% | Highest among frontier models |
| Blind code quality test win rate | 67% | vs. all competitors |

| Metric | Value | Period |
|---|---|---|
| Daily GitHub commits | ~135,000 | March 2026 |
| Share of public GitHub commits | ~4% | March 2026 |
| Daily installs (30-day avg) | 29M | March 2026 |
| Revenue run-rate | ~$2.5B | Early 2026 |
| Developer "most loved" rating | 46% | Survey 2026 |
The Code Review System
In March 2026, Anthropic launched Code Review — a multi-agent extension of the Claude Code architecture. Rather than a single agent loop, Code Review dispatches parallel agents on every pull request. Each agent independently analyzes the diff, identifies potential issues, and assesses severity. A meta-agent then aggregates findings, filters false positives, and produces a final review.
The results are striking: before Code Review, 16% of PRs received substantive review comments. After, 54% do. The average review costs $15-25 and takes approximately 20 minutes. This is an example of the master loop architecture scaling horizontally — the same simple agent loop, running in parallel instances with different review perspectives, coordinated by a lightweight aggregation layer.
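The fan-out/aggregate shape is simple to sketch. Here plain functions stand in for the parallel reviewer agents, and a dedupe-and-threshold step stands in for the meta-agent; the reviewers, severity scale, and filtering rule are all invented for illustration:

```python
# Illustrative fan-out/aggregate review: independent "reviewers" scan a
# diff, then a meta step deduplicates findings and drops low-severity noise.

def security_reviewer(diff):
    return [{"issue": "hardcoded secret", "severity": 3}] if "API_KEY=" in diff else []

def style_reviewer(diff):
    return [{"issue": "long line", "severity": 1}] if any(
        len(l) > 100 for l in diff.splitlines()) else []

def correctness_reviewer(diff):
    return [{"issue": "bare except", "severity": 2}] if "except:" in diff else []

def review(diff, reviewers, min_severity=2):
    findings = []
    for r in reviewers:          # a real system runs these in parallel
        findings += r(diff)
    # Meta step: dedupe by issue text, keep only severity >= threshold.
    seen, final = set(), []
    for f in sorted(findings, key=lambda f: -f["severity"]):
        if f["issue"] not in seen and f["severity"] >= min_severity:
            seen.add(f["issue"])
            final.append(f)
    return final

diff = "API_KEY='abc123'\ntry:\n    run()\nexcept:\n    pass"
print(review(diff, [security_reviewer, style_reviewer, correctness_reviewer]))
```

Each reviewer reuses the same single-agent loop described earlier; the coordination layer only merges their outputs.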
Architectural Lessons
Claude Code's architecture challenges several assumptions common in the agent development community:
Simpler orchestration often outperforms complex orchestration. The industry trend toward elaborate multi-agent systems, DAG-based task planners, and specialized routing classifiers may be overcomplicating the problem. A capable model with good tools and a simple loop can outperform these systems on many tasks.
The model is a better planner than your planner. External planning stages that constrain the model's execution often perform worse than letting the model plan and execute fluidly within the same loop. Extended thinking provides the reasoning budget for effective planning without requiring a separate planning phase.
Dynamic tool loading is essential at scale. As tool ecosystems grow, static tool loading becomes a context budget problem. Dynamic, search-based tool loading is a scalable alternative that maintains access to large tool libraries without consuming excessive tokens.
Verification should be model-directed, not hardcoded. The model's decision about what to verify and how to verify it is more effective than a fixed verification pipeline, because the model understands what changed and what the likely failure modes are.
References
- Anthropic, "How Claude Code Works" — official documentation
- PromptLayer, "Claude Code: Behind-the-scenes of the master agent loop"
- DeepWiki, "System Architecture — Claude Code"
- Anthropic, "Code Review for Claude Code" (March 9, 2026)
- Anthropic, "1M Context GA for Opus 4.6" (March 13, 2026)
- UncoverAlpha, "Anthropic's Claude Code Is Having Its ChatGPT Moment"
- SWE-bench Leaderboard
