Agent System
Two-phase agent execution with planning, approval, and tool-powered execution across multiple AI backends.
The desktop app uses a two-phase agent system where every task goes through planning and execution. This gives you full visibility and control over what the AI agent does before it acts.
Two-Phase Workflow
Phase 1: Planning
When you submit a task, the agent analyzes your request and proposes a structured plan:
- You enter a prompt describing what you want done
- The agent thinks and creates a step-by-step plan
- The plan is displayed with details about:
  - Steps to complete
  - Files to create or modify
  - Tools the agent needs to use
  - Scope of changes
- You review the plan and can approve, modify, or reject it
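The approval gate between the two phases can be pictured as a small state machine. The sketch below is illustrative only: `Plan`, `PlanStep`, `PlanStatus`, and their fields are hypothetical names chosen to mirror the plan details listed above, not the app's actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PlanStatus(Enum):
    PROPOSED = "proposed"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class PlanStep:
    description: str
    files: List[str] = field(default_factory=list)  # files to create or modify
    tools: List[str] = field(default_factory=list)  # tools the step needs


@dataclass
class Plan:
    prompt: str
    steps: List[PlanStep]
    status: PlanStatus = PlanStatus.PROPOSED

    def _require_proposed(self, action: str) -> None:
        # Only a plan that is still awaiting review can be approved or rejected.
        if self.status is not PlanStatus.PROPOSED:
            raise ValueError(f"cannot {action} a {self.status.value} plan")

    def approve(self) -> None:
        self._require_proposed("approve")
        self.status = PlanStatus.APPROVED

    def reject(self) -> None:
        self._require_proposed("reject")
        self.status = PlanStatus.REJECTED

    def can_execute(self) -> bool:
        # Execution is only unlocked by an explicit approval.
        return self.status is PlanStatus.APPROVED
```

In this sketch, modifying a plan would amount to replacing `steps` while the status is still `PROPOSED`.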
Phase 2: Execution
Once you approve the plan, the agent executes with full tool access:
- The agent begins working through the plan steps
- Progress streams in real time as the agent:
  - Creates and modifies files in your workspace
  - Executes code in sandboxed environments
  - Calls MCP tools (Linear, Google, media generation, etc.)
  - Reports progress and intermediate results
- You see everything -- tool calls, file changes, and agent reasoning
- Results appear in the task view and generated files go to the Library
Agent Backends
The system supports multiple AI backends through a plugin architecture:
| Backend | Engine | Best For |
|---|---|---|
| Claude | Anthropic Claude Agent SDK | Complex reasoning, code generation, analysis |
| Codex | OpenAI Codex | Code-focused tasks |
| DeepAgents | Custom | Specialized deep workflows |
Each backend implements the same BaseAgent interface, providing consistent behavior:
- Session management and cleanup
- Plan lifecycle (create, approve, execute)
- Workspace isolation enforcement
- Streaming message delivery
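As a rough illustration of what a shared interface buys, the sketch below defines an abstract base class and a toy backend. Only the name `BaseAgent` comes from this page; the method names, the dict-based plan, and `EchoAgent` are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Iterator


class BaseAgent(ABC):
    """Hypothetical shape of the shared interface; method names are assumptions."""

    @abstractmethod
    def create_plan(self, prompt: str) -> dict:
        """Return a plan proposal for the prompt."""

    @abstractmethod
    def execute(self, plan: dict) -> Iterator[dict]:
        """Stream messages while working through an approved plan."""

    @abstractmethod
    def cleanup(self) -> None:
        """Release session resources."""


class EchoAgent(BaseAgent):
    """Toy backend: shows that callers depend only on the interface."""

    def create_plan(self, prompt: str) -> dict:
        return {"prompt": prompt, "steps": [f"echo: {prompt}"], "approved": False}

    def execute(self, plan: dict) -> Iterator[dict]:
        # Generator: the approval check runs when the stream is first consumed.
        if not plan.get("approved"):
            raise PermissionError("plan must be approved before execution")
        for step in plan["steps"]:
            yield {"type": "text", "content": step}
        yield {"type": "done"}

    def cleanup(self) -> None:
        pass
```

Code that drives an agent can then be written once against `BaseAgent` and reused for any backend that implements it.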
Tool Integration
During execution, agents have access to a wide range of tools:
Built-in Tool Categories
| Category | Examples | Tool Count |
|---|---|---|
| MCP Servers | Sandbox, filesystem, sequential-thinking | Varies |
| Linear | Create/update issues, manage labels, comments | 18 tools |
| Google Workspace | Gmail, Calendar, Drive, Sheets, Docs, and more | 79 tools |
| Media Generation | Image creation, video generation | Provider-dependent |
| Memory | Recall, store, forget, list facts | 4 tools |
| Speech | Text-to-speech, speech-to-text | 2 tools |
Sandbox Execution
Agents can run code in sandboxed environments:
- Native sandbox: Process-based execution
- Claude sandbox: Container isolation with volume mounts
- Instance pool: Up to 5 concurrent sandbox instances
The sandbox provides run_script and run_command tools for safe code execution.
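One common way to cap concurrency at a fixed number of instances is a bounded semaphore. The sketch below applies that idea to the five-instance limit mentioned above; `SandboxPool` and its `instance` method are hypothetical names, and the app's real pool may work differently.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, Optional

MAX_INSTANCES = 5  # the limit stated in the list above


class SandboxPool:
    """Hypothetical pool that hands out at most `size` slots at a time."""

    def __init__(self, size: int = MAX_INSTANCES) -> None:
        self._slots = threading.BoundedSemaphore(size)

    @contextmanager
    def instance(self, timeout: Optional[float] = None) -> Iterator[None]:
        # Wait for a free slot; give up after `timeout` seconds if one is set.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no sandbox instance available")
        try:
            yield
        finally:
            # Always return the slot, even if the sandboxed work raised.
            self._slots.release()
```

A caller would wrap each sandboxed run in `with pool.instance(): ...`, so a sixth concurrent request waits until one of the five slots is released.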
Execution Modes
Standard Mode
In the default mode, the agent makes sequential tool calls and reasons between each step. This suits complex tasks that require careful decision-making.
Batch Mode (PTC)
Programmatic Tool Calling mode uses the Messages API with Python code execution for efficient bulk operations. Enable this in Settings > General for tasks that involve many repetitive tool calls.
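To give a feel for why code-driven tool calls help with repetitive work, here is a sketch of the kind of script such a mode might run: one loop issues many tool calls and returns only a compact summary. The `call_tool` callable, the `update_issue` tool name, and the result shape are all assumptions for illustration, not the app's actual API.

```python
from typing import Callable, Dict, List


def close_issues(call_tool: Callable[[str, dict], dict], issue_ids: List[str]) -> Dict[str, object]:
    """Call a (hypothetical) tool once per issue and summarize the outcome."""
    closed: List[str] = []
    failed: List[str] = []
    for issue_id in issue_ids:
        # Each iteration is a tool call made from code, with no reasoning
        # step in between.
        result = call_tool("update_issue", {"id": issue_id, "state": "closed"})
        (closed if result.get("ok") else failed).append(issue_id)
    # Only this summary needs to go back to the model.
    return {"closed": len(closed), "failed": failed}
```

In Standard Mode the same job would be one tool call per issue, each followed by a reasoning step.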
Session Management
Each task runs within a session that provides:
- Isolated state -- Sessions don't interfere with each other
- Abort control -- Cancel a running task at any time
- Heartbeat detection -- Monitors agent health during thinking phases
- Automatic cleanup -- Stale sessions are cleaned up after 1 hour of inactivity
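The abort flag and the one-hour cleanup rule can be sketched with a small registry that tracks when each session was last active. `Session`, `SessionRegistry`, and their methods are hypothetical names; only the one-hour threshold comes from the list above.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

STALE_AFTER_SECONDS = 60 * 60  # 1 hour of inactivity, per the list above


@dataclass
class Session:
    session_id: str
    last_active: float
    aborted: bool = False

    def touch(self, now: float) -> None:
        # Record activity so the session is not considered stale.
        self.last_active = now

    def abort(self) -> None:
        # A running task would check this flag and stop.
        self.aborted = True


class SessionRegistry:
    def __init__(self) -> None:
        self._sessions: Dict[str, Session] = {}

    def open(self, session_id: str, now: Optional[float] = None) -> Session:
        now = time.monotonic() if now is None else now
        session = Session(session_id, last_active=now)
        self._sessions[session_id] = session
        return session

    def cleanup_stale(self, now: Optional[float] = None) -> List[str]:
        """Remove sessions idle for at least the threshold; return their ids."""
        now = time.monotonic() if now is None else now
        stale = [
            sid for sid, s in self._sessions.items()
            if now - s.last_active >= STALE_AFTER_SECONDS
        ]
        for sid in stale:
            del self._sessions[sid]
        return stale
```

Passing `now` explicitly keeps the sketch easy to test; a real implementation would more likely run `cleanup_stale()` on a timer.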
Message Types
During execution, the agent streams different message types:
| Type | Description |
|---|---|
| text | Agent's text response |
| thinking | Agent's internal reasoning (when visible) |
| tool_use | Agent calling a tool |
| tool_result | Result from a tool call |
| plan | Structured plan proposal |
| direct_answer | Quick response without tool use |
| error | Error during execution |
| done | Task complete |
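A client consuming this stream typically switches on the message type. The sketch below assumes each message is a dict with a `type` key and treats `done` as the end of the stream; both are assumptions, since the wire format is not described here.

```python
from typing import Dict, Iterable, List

# The eight type names listed in the table above.
MESSAGE_TYPES = {
    "text", "thinking", "tool_use", "tool_result",
    "plan", "direct_answer", "error", "done",
}


def consume(stream: Iterable[Dict]) -> List[str]:
    """Validate each message's type and stop once `done` arrives."""
    seen: List[str] = []
    for message in stream:
        kind = message.get("type")
        if kind not in MESSAGE_TYPES:
            raise ValueError(f"unknown message type: {kind!r}")
        seen.append(kind)
        if kind == "done":
            break  # anything after `done` is ignored in this sketch
    return seen
```

A real client would render each type differently, for example collapsing `thinking` blocks and pairing each `tool_result` with its `tool_use`.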
Cost Tracking
Every message records:
- Cost in dollars
- Token usage: input, output, cache read, cache creation
- Model used
- Duration in milliseconds
This information appears in the task view toolbar, helping you monitor spending per task.
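Per-task totals can be derived by summing the per-message records. The sketch below uses a hypothetical `Usage` record whose fields mirror the list above; the field names and the `summarize` helper are assumptions, not the app's schema.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Usage:
    cost_usd: float
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int
    cache_creation_tokens: int
    model: str
    duration_ms: int


def summarize(usages: Iterable[Usage]) -> Dict[str, object]:
    """Aggregate per-message records into a per-task summary."""
    records = list(usages)
    return {
        "cost_usd": sum(u.cost_usd for u in records),
        "input_tokens": sum(u.input_tokens for u in records),
        "output_tokens": sum(u.output_tokens for u in records),
        "cache_read_tokens": sum(u.cache_read_tokens for u in records),
        "cache_creation_tokens": sum(u.cache_creation_tokens for u in records),
        "duration_ms": sum(u.duration_ms for u in records),
        # A task may touch more than one model, so report the distinct set.
        "models": sorted({u.model for u in records}),
    }
```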
Runtime Context
The agent receives contextual information before each task:
- Date and time with your timezone
- Locale preference
- Platform details (OS, architecture)
- Geolocation (approximate, privacy-preserving -- rounded to ~1 km)
This context helps the agent provide location-aware, timezone-correct responses.
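One plausible way to get roughly 1 km precision is to round coordinates to two decimal places, since 0.01 degrees of latitude is about 1.1 km. The sketch below assembles a context object on that assumption; `build_context`, `coarse_location`, and the key names are hypothetical, and the app may coarsen location differently.

```python
import platform
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def coarse_location(lat: float, lon: float) -> Tuple[float, float]:
    # Two decimal places is roughly 1.1 km of latitude; the longitude
    # distance shrinks toward the poles.
    return round(lat, 2), round(lon, 2)


def build_context(lat: float, lon: float, locale: str,
                  now: Optional[datetime] = None) -> Dict[str, object]:
    """Assemble the contextual fields listed above into one dict."""
    # Default to the current time in the machine's local timezone.
    now = now or datetime.now(timezone.utc).astimezone()
    c_lat, c_lon = coarse_location(lat, lon)
    return {
        "datetime": now.isoformat(),
        "timezone": now.tzname(),
        "locale": locale,
        "platform": {"os": platform.system(), "arch": platform.machine()},
        "location": {"lat": c_lat, "lon": c_lon},
    }
```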
Conversation History
For long-running sessions, the system manages conversation history:
- Token-based truncation to stay within context limits
- Memory flush when the conversation exceeds ~16K tokens
- Important facts are captured to long-term memory before truncation
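The ordering matters here: facts are captured first, then old messages are dropped. The sketch below shows that ordering with a crude four-characters-per-token estimate. The estimator, the `flush` callback, and `truncate_history` are assumptions for illustration; the real system presumably uses a proper tokenizer and its own memory API.

```python
from typing import Callable, List

FLUSH_THRESHOLD_TOKENS = 16_000  # approximate threshold mentioned above


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); an assumption,
    # not the app's tokenizer.
    return max(1, len(text) // 4)


def truncate_history(messages: List[str], budget: int,
                     flush: Callable[[List[str]], None]) -> List[str]:
    """Keep the newest messages that fit in `budget`; flush the rest first."""
    if sum(estimate_tokens(m) for m in messages) <= budget:
        return list(messages)
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so recent context survives.
    for message in reversed(messages):
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    dropped = messages[: len(messages) - len(kept)]
    flush(dropped)  # capture important facts before they are discarded
    return kept
```

Calling `truncate_history(history, FLUSH_THRESHOLD_TOKENS, save_to_memory)` with a hypothetical `save_to_memory` callback would leave short conversations untouched and trim long ones from the oldest end.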
Learn More
- MCP and Skills -- Extending agent tool access
- Memory System -- How agents remember across sessions
- Workspace Security -- How agent actions are sandboxed