Multi-Agent Orchestration: How the Coordinator Pattern Enables AI Collaboration
A deep dive into Claude Code's multi-agent architecture — Coordinator/Worker role separation, AgentTool spawning mechanics, task notifications, Scratchpad sharing, and Team swarm mode
The Problem
What do you do when a single AI agent isn't enough? The intuitive answer is: spawn more agents. But the problem is far from that simple. How do multiple agents divide work? How do they communicate? How do they share context without conflicts? When a worker finishes a task, how does it report results to the coordinator? If a worker heads in the wrong direction, how do you terminate it promptly?
At the heart of these questions lies a classic distributed systems design challenge — except here the "nodes" aren't servers but LLM instances. Claude Code answers these questions with an elegant multi-agent architecture whose core pattern is called Coordinator/Worker.
This article dives deep into Claude Code's multi-agent system, starting from the overall architecture of the Coordinator pattern and analyzing layer by layer: AgentTool's agent spawning mechanism, the Worker's restricted tool set, the XML protocol for task notifications, SendMessageTool's cross-agent communication, Scratchpad directory persistent state sharing, background execution and progress tracking, Team swarm mode, and MCP Server inheritance and isolation across agents.
Overall Architecture of the Coordinator Pattern
Claude Code's multi-agent system is built on a clear role separation model: the Coordinator is responsible for understanding user intent, decomposing tasks, and synthesizing results; Workers are responsible for executing concrete work (research, implementation, verification). The Coordinator doesn't directly operate on files or run commands — it only has a minimal set of "management tools."
The Design Philosophy of Role Separation
The philosophy behind this design is simple: let the Coordinator focus on "thinking" and Workers focus on "doing." The Coordinator's tool set is strictly limited to four:
The Coordinator cannot read or write files, execute shell commands, or search code. It can only do three things: spawn workers, stop workers, and send messages to workers. This extreme constraint forces the Coordinator to be a pure "commander."
Enabling and Detecting Coordinator Mode
Coordinator mode is controlled via the CLAUDE_CODE_COORDINATOR_MODE environment variable:
There's a double gate here: first, the COORDINATOR_MODE compile-time feature flag must be enabled (using Bun's feature() macro for dead code elimination), then the environment variable must be truthy. This ensures that in builds that don't support Coordinator mode, all related code is completely stripped out.
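The double gate can be illustrated with a minimal sketch. It is not the actual implementation: the constant COORDINATOR_MODE_FEATURE stands in for the compile-time flag, and the truthiness rule in isEnvTruthy is an assumption made for illustration.

```typescript
// Hypothetical stand-in for the compile-time feature flag. In a real build the
// bundler would inline a literal here so the disabled branch can be stripped.
const COORDINATOR_MODE_FEATURE: boolean = true;

type Env = Record<string, string | undefined>;

// Assumed truthiness rule: "1", "true", "yes" or "on", case-insensitive.
function isEnvTruthy(value: string | undefined): boolean {
  if (value === undefined) return false;
  return ["1", "true", "yes", "on"].includes(value.trim().toLowerCase());
}

function isCoordinatorMode(
  env: Env,
  featureEnabled: boolean = COORDINATOR_MODE_FEATURE,
): boolean {
  // Gate 1: the build must include Coordinator support at all.
  if (!featureEnabled) return false;
  // Gate 2: the runtime environment variable must be truthy.
  return isEnvTruthy(env["CLAUDE_CODE_COORDINATOR_MODE"]);
}
```

Passing the environment in as a parameter keeps the sketch testable; the article describes the real check as reading process.env directly.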
When resuming a session that was previously run in Coordinator mode, the system automatically matches the mode:
This code demonstrates an interesting design choice: mode state is stored in process.env rather than in a state object, because isCoordinatorMode() is called extensively, and reading the environment variable directly avoids introducing any additional state management layer.
Coordinator's System Prompt and Task Workflow
The Coordinator's system prompt defines a complete task workflow with four phases:
The concurrency rules defined in the system prompt are highly pragmatic:
This isn't enforced through code — it relies on the LLM itself understanding and following these rules. This is a notable architectural decision: between "hardcoded concurrency control" and "trusting the LLM's judgment," Claude Code chose the latter, because file conflict scenarios are too varied and complex for hardcoded rules to cover all cases.
AgentTool's Agent Spawning Mechanism
AgentTool is the entry point for the entire multi-agent system — all worker creation goes through it. Its implementation is in src/tools/AgentTool/AgentTool.tsx and is one of the most complex tools in the entire codebase (over 4,000 lines).
Input Schema and Mode Routing
AgentTool's input schema is dynamically composed based on compile-time feature flags:
lazySchema() wrapping is used here to ensure Zod schemas are only instantiated on first use. Note that the isolation field's enum values differ based on build type — internal builds support 'worktree' | 'remote', while external builds only support 'worktree'.
Agent Type Resolution and Routing
When AgentTool.call() is invoked, it first needs to resolve the target agent's type. There are three paths:
The key routing logic is as follows:
Several important design details here:
- Recursive fork guard: Uses dual detection: querySource (compression-resistant since it's set at spawn time) and message scanning (fallback path), ensuring fork children don't recurse infinitely.
- Permission filtering: Agent types can be denied by permission rules (configured via Agent(AgentName) syntax in settings), and error messages distinguish between "doesn't exist" and "denied."
- Model override: In Coordinator mode, the model parameter is forcibly ignored (set to undefined), because workers need the default high-capability model to complete substantive tasks.
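The permission-filtering detail can be sketched as follows. This is an illustration under assumptions, not the code from AgentTool.tsx: resolveAgentType, parseAgentRule, and the Resolution type are hypothetical names, and only the Agent(AgentName) rule syntax comes from the text above.

```typescript
type Resolution =
  | { ok: true; agentType: string }
  | { ok: false; reason: "not_found" | "denied"; message: string };

// Extracts the agent name from a rule written as Agent(AgentName).
// Returns null for rules that target something else.
function parseAgentRule(rule: string): string | null {
  const match = /^Agent\(([^()]+)\)$/.exec(rule.trim());
  return match?.[1] ?? null;
}

function resolveAgentType(
  requested: string,
  knownTypes: readonly string[],
  denyRules: readonly string[],
): Resolution {
  // Unknown types are reported separately from denied ones.
  if (!knownTypes.includes(requested)) {
    return { ok: false, reason: "not_found", message: `Agent type '${requested}' does not exist` };
  }
  const denied = denyRules.map(parseAgentRule).some((name) => name === requested);
  if (denied) {
    return { ok: false, reason: "denied", message: `Agent type '${requested}' is denied by a permission rule` };
  }
  return { ok: true, agentType: requested };
}
```

Keeping the two failure reasons distinct lets the caller tell the model whether to pick a different agent name or to stop asking for a blocked one.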
Worker Tool Set Assembly
The Worker's tool set isn't simply inherited from the Coordinator — it's assembled independently:
Workers get their own permission mode (defaulting to 'acceptEdits'), then independently assemble available tools from the global tool pool. This means the Worker's tool set is completely independent of the Coordinator's restrictions — even though the Coordinator only has 4 management tools, Workers can still use the full range of file operations, code search, and other tools.
Worktree Isolation
When isolation: 'worktree' is specified, AgentTool creates a temporary git worktree, letting the Worker operate on an isolated copy of the codebase:
Worktree isolation provides two important benefits: Workers can freely modify code without affecting the main workspace; and multiple Workers can modify code in parallel across different worktrees. When a Worker finishes, if no changes were made in the worktree, it's automatically cleaned up; if changes exist, the worktree path and branch name are returned to the Coordinator.
The Worker's Restricted Tool Set
As asynchronously running sub-agents, Workers have a carefully designed restricted tool set. These restrictions are defined in the ASYNC_AGENT_ALLOWED_TOOLS set:
Explicitly excluded tools include:
The actual tool filtering logic is implemented in filterToolsForAgent:
There's a particularly interesting layered design here. For in-process teammates (members in Team mode), an additional set of task management tools is allowed:
This enables teammates within a Team to create tasks, update task status, and send messages to other teammates — all capabilities essential for swarm collaboration.
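A minimal sketch of this layered allow-list filtering is shown below. The tool names and the two sets are hypothetical placeholders; the real contents of ASYNC_AGENT_ALLOWED_TOOLS and the real signature of filterToolsForAgent are not reproduced here.

```typescript
interface Tool {
  name: string;
}

// Hypothetical allow-lists, for illustration only.
const ASYNC_ALLOWED_SKETCH = new Set(["Read", "Edit", "Grep", "Bash"]);
const TEAMMATE_EXTRA_SKETCH = new Set(["TaskCreate", "TaskUpdate", "SendMessage"]);

function filterToolsSketch(
  pool: readonly Tool[],
  opts: { isAsync: boolean; isInProcessTeammate: boolean },
): Tool[] {
  // Synchronous agents keep the whole pool in this sketch.
  if (!opts.isAsync) return [...pool];
  return pool.filter(
    (tool) =>
      ASYNC_ALLOWED_SKETCH.has(tool.name) ||
      // In-process teammates get the additional task-management layer.
      (opts.isInProcessTeammate && TEAMMATE_EXTRA_SKETCH.has(tool.name)),
  );
}
```

Because the filter is an allow-list, a tool added to the global pool later stays unavailable to async agents until someone deliberately adds it to a set.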
Task Notification Mechanism
When a Worker completes a task, it doesn't return results via a function call. Instead, it sends results to the Coordinator through a carefully designed XML-format notification injected as a user-role message into the Coordinator's conversation.
Notification Format
Why XML instead of JSON? Because the <task-notification> opening tag provides a clear, easily recognizable signal for the LLM — the Coordinator's system prompt explicitly states "distinguish notifications from user messages by the <task-notification> opening tag." XML's tag structure is easier for LLMs to recognize and parse during streaming generation than JSON's curly braces.
Notification Injection Path
Notifications are injected into the Coordinator's message stream via enqueueAgentNotification. The entire async agent lifecycle management is in the runAsyncAgentLifecycle function:
While Workers run in the background, the Coordinator can continue interacting with the user or launch more Workers. Notifications arrive as user-role messages, and the Coordinator processes them at the start of its next turn.
One-Shot Optimization
For certain built-in agents (like Explore and Plan) that only run once and won't be continued by the Coordinator via SendMessage, the notification omits the agentId and SendMessage usage instructions to save tokens:
Comments note that this optimization "saves ~135 chars x 34M Explore runs/week" — at scale, every token matters.
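The notification shape described in this section can be sketched as a small builder. Only the <task-notification> wrapper tag comes from the text; the child tag names (agent-id, status, summary, result), the status values, and the function names are assumptions made for illustration.

```typescript
interface TaskNotification {
  status: "completed" | "failed" | "killed"; // hypothetical status values
  summary: string;
  result: string;
  agentId?: string; // left out for one-shot agents to save tokens
}

function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function buildTaskNotification(n: TaskNotification): string {
  const lines = ["<task-notification>"];
  // One-shot agents cannot be continued, so their ID is not worth sending.
  if (n.agentId !== undefined) {
    lines.push(`<agent-id>${escapeXml(n.agentId)}</agent-id>`);
  }
  lines.push(`<status>${n.status}</status>`);
  lines.push(`<summary>${escapeXml(n.summary)}</summary>`);
  lines.push(`<result>${escapeXml(n.result)}</result>`);
  lines.push("</task-notification>");
  return lines.join("\n");
}

// The opening tag is what separates a notification from a real user message.
function isTaskNotification(userMessage: string): boolean {
  return userMessage.trimStart().startsWith("<task-notification>");
}
```

Escaping the worker's free-form output matters here: an unescaped closing tag inside the result would end the notification early.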
SendMessageTool: Cross-Agent Communication
SendMessageTool is the core tool for inter-agent communication. It's used not only for the Coordinator to send follow-up instructions to Workers but also for direct communication between teammates in Team mode.
Message Routing Logic
SendMessageTool's message routing is highly refined, handling multiple target types:
The to field can be:
- A teammate name (e.g., "researcher"): sends to a specific teammate
- "*": broadcasts to all teammates
- "uds:/path/to.sock": cross-process communication (via Unix Domain Socket)
- "bridge:session_...": cross-machine communication (via Remote Control)
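The four target forms listed above map naturally onto a tagged union. The following sketch shows one way to parse the to field; parseTarget and the Target type are hypothetical names, and the real routing code may order or validate the cases differently.

```typescript
type Target =
  | { kind: "broadcast" }
  | { kind: "uds"; socketPath: string }
  | { kind: "bridge"; sessionId: string }
  | { kind: "teammate"; name: string };

function parseTarget(to: string): Target {
  if (to === "*") return { kind: "broadcast" };
  if (to.startsWith("uds:")) {
    return { kind: "uds", socketPath: to.slice("uds:".length) };
  }
  if (to.startsWith("bridge:")) {
    return { kind: "bridge", sessionId: to.slice("bridge:".length) };
  }
  // Anything without a recognized prefix is treated as a teammate name.
  return { kind: "teammate", name: to };
}
```

Checking the prefixed forms before falling back to a teammate name means a teammate can never be addressed as "*", which keeps broadcast unambiguous.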
Sending Messages to Existing Workers
When the Coordinator uses SendMessage to send a message to a completed or running Worker, the handling logic has three branches:
Two important behaviors here:
- Running Worker: The message is queued (queuePendingMessage) and delivered during the Worker's next tool call turn. This avoids interrupting the Worker's current work.
- Stopped Worker: The Worker is automatically resumed (resumeAgentBackground), loading its previous conversation context from the on-disk transcript, then continuing execution with the new message as a continuation prompt. This allows the Coordinator to repeatedly leverage an existing Worker's accumulated context.
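The two behaviors above can be sketched as a single dispatch function. This is a simplified in-memory model: WorkerRecord, deliver, and the "not_found" outcome are assumptions, and the real queuePendingMessage and resumeAgentBackground do considerably more work.

```typescript
interface WorkerRecord {
  state: "running" | "stopped";
  pending: string[]; // messages waiting for the next tool-call turn
  transcript: string[]; // stands in for the on-disk transcript
}

type Delivery = "queued" | "resumed" | "not_found";

function deliver(
  workers: Map<string, WorkerRecord>,
  agentId: string,
  message: string,
): Delivery {
  const worker = workers.get(agentId);
  // Assumed branch: an unknown agent ID is reported rather than ignored.
  if (!worker) return "not_found";
  if (worker.state === "running") {
    // Do not interrupt; the worker picks this up on its next turn.
    worker.pending.push(message);
    return "queued";
  }
  // Stopped worker: continue from the saved context with the new prompt.
  worker.transcript.push(message);
  worker.state = "running";
  return "resumed";
}
```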
Structured Message Protocol
Beyond plain text messages, SendMessageTool also supports structured messages for coordination operations in Team mode:
These structured messages implement three coordination protocols:
- Shutdown protocol: The Team lead sends a shutdown_request to a teammate, who replies with a shutdown_response (approve or deny). An approved shutdown triggers the teammate process's gracefulShutdown.
- Plan approval protocol: In plan permission mode, teammates need the Team lead's approval before executing implementation.
- Broadcast: to: "*" broadcasts the message to all teammates, iterating through all members in the team file (excluding the sender).
Mailbox Communication Model
Message passing in Team mode is based on a mailbox model — messages are written to the recipient's mailbox file rather than pushed directly:
The benefit of this design is complete decoupling of sender and receiver — the sender doesn't need to wait for the receiver to be online; messages are delivered the next time the receiver polls its mailbox.
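A minimal in-memory sketch of the mailbox model follows. The article describes mailbox files on disk; a Map stands in for the filesystem here, and the Mailboxes class and its method names are hypothetical.

```typescript
interface MailboxMessage {
  from: string;
  text: string;
}

class Mailboxes {
  // One queue per recipient; stands in for one mailbox file per teammate.
  private readonly boxes = new Map<string, MailboxMessage[]>();

  // Writing never blocks and never requires the recipient to be running.
  send(to: string, message: MailboxMessage): void {
    const box = this.boxes.get(to) ?? [];
    box.push(message);
    this.boxes.set(to, box);
  }

  // Broadcast iterates over the member list and skips the sender.
  broadcast(from: string, members: readonly string[], text: string): void {
    for (const member of members) {
      if (member !== from) this.send(member, { from, text });
    }
  }

  // Returns and clears everything waiting for this recipient.
  poll(recipient: string): MailboxMessage[] {
    const box = this.boxes.get(recipient) ?? [];
    this.boxes.delete(recipient);
    return box;
  }
}
```

A file-backed version would also need to handle concurrent writers to the same mailbox, which this sketch does not address.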
Scratchpad Directory: Persistent State Sharing Across Workers
How do multiple Workers share information? Claude Code provides a mechanism called the Scratchpad — a session-level temporary directory that all Workers can freely read from and write to without permission prompts.
Scratchpad Location and Permissions
The path format is /tmp/claude-{uid}/{sanitized-cwd}/{sessionId}/scratchpad/. The directory is created with 0o700 permissions (read, write, and traverse for the owner only), so other users on the same machine cannot read or modify its contents.
How the Coordinator Informs Workers About the Scratchpad
Scratchpad directory information is injected into the Coordinator's context via user context:
Note the key phrase in the prompt: "structure files however fits the work" — the system doesn't prescribe file structure within the Scratchpad, letting the Coordinator and Workers organize it as the task demands. This flexibility is intentional.
Security: Path Traversal Protection
Scratchpad path detection includes path traversal protection:
The comment explicitly warns about the attack vector: without normalization, a path that starts with the scratchpad prefix but climbs out through .. segments, such as /tmp/claude-0/proj/session/scratchpad/../../../../../etc/passwd, would pass the startsWith check but actually write to /etc/passwd. The normalize() call resolves the .. segments before the prefix check, closing this vulnerability.
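The normalize-then-check idea can be sketched with Node's path module. The function name isInsideScratchpad is hypothetical, and the trailing-separator comparison is an addition of this sketch (it prevents a sibling directory such as scratchpad-evil from matching the prefix); the article only states that normalize() is applied before startsWith.

```typescript
import { posix } from "node:path";

function isInsideScratchpad(candidate: string, scratchpadDir: string): boolean {
  // Normalize both sides so ".." and "." segments are resolved first.
  const root = posix.normalize(scratchpadDir).replace(/\/+$/, "");
  const resolved = posix.normalize(candidate);
  // Compare against "root/" so "/x/scratchpad-evil" does not match "/x/scratchpad".
  return resolved === root || resolved.startsWith(root + "/");
}
```

Note that normalization is purely lexical. It does not follow symlinks, so a symlink placed inside the scratchpad would need a separate check.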
Background Execution and Progress Tracking
Synchronous vs. Asynchronous Execution
AgentTool supports two execution modes: synchronous (foreground) and asynchronous (background). The logic for determining which mode to use combines multiple signals:
In Coordinator mode, all agents run asynchronously. This is because the Coordinator's core value lies in parallel orchestration — if Workers ran synchronously, the Coordinator couldn't launch multiple Workers simultaneously.
Auto-Backgrounding
There's also an auto-backgrounding mechanism: when a Worker runs for more than 120 seconds, it's automatically moved to the background:
Agent Resume Mechanism
When a stopped agent needs to be resumed, the resumeAgentBackground function is responsible for rebuilding conversation context from the on-disk transcript:
The resume process reads the agent's previous transcript (including all tool calls and results), rebuilds the message history, then adds the new prompt as a user message at the end. This gives the resumed agent complete context from its previous execution.
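The rebuild step can be sketched as follows, assuming a transcript stored as one JSON object per line. That storage format, the ChatMessage shape, and the function name rebuildMessages are all assumptions for illustration; the actual transcript format is not described in this article.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Assumes a JSONL transcript: one serialized message per non-empty line.
function rebuildMessages(transcriptJsonl: string, newPrompt: string): ChatMessage[] {
  const previous: ChatMessage[] = transcriptJsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as ChatMessage);
  // The new instruction goes last, as a user message.
  return [...previous, { role: "user", content: newPrompt }];
}
```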
runAgent: The Worker's Execution Engine
runAgent is the Worker's core execution function. It's an async generator responsible for initializing MCP servers, building context, and running the query loop.
MCP Server Inheritance and Isolation
Agent definitions can declare their own MCP servers, which are incremental extensions of the parent context's MCP clients:
MCP servers can be referenced in two ways:
- String reference: References a configured MCP server by name, using the memoized connectToServer to share the connection.
- Inline definition: A { [name]: config } format for a brand new MCP server configuration, requiring cleanup when the agent finishes.
A key security constraint: when MCP is locked to plugin-only mode, user-controlled agents' frontmatter MCP servers are skipped, but plugin, built-in, and policySettings agents' MCP are unaffected since they come from admin-trusted sources:
The cleanup function only cleans up newly created clients; shared clients are managed by the parent context:
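The ownership rule (shared clients belong to the parent, inline ones belong to the agent) can be sketched like this. McpClient, McpServerSpec, initAgentMcp, and the injected connect function are hypothetical; the sketch only models which clients the cleanup function is allowed to close.

```typescript
interface McpClient {
  name: string;
  close(): void;
}

// Either a name referencing an existing server, or an inline { [name]: config }.
type McpServerSpec = string | Record<string, { command: string }>;

function initAgentMcp(
  specs: readonly McpServerSpec[],
  shared: Map<string, McpClient>,
  connect: (name: string, config: { command: string }) => McpClient,
): { clients: McpClient[]; cleanup: () => void } {
  const clients: McpClient[] = [];
  const owned: McpClient[] = [];
  for (const spec of specs) {
    if (typeof spec === "string") {
      // Shared connection: reuse it, but leave its lifecycle to the parent.
      const existing = shared.get(spec);
      if (existing) clients.push(existing);
      continue;
    }
    for (const [name, config] of Object.entries(spec)) {
      // Inline definition: this agent created it, so this agent closes it.
      const client = connect(name, config);
      clients.push(client);
      owned.push(client);
    }
  }
  return { clients, cleanup: () => owned.forEach((client) => client.close()) };
}
```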
Agent Definitions and Tool Control
Each agent's capabilities are controlled by its AgentDefinition. Taking the built-in general-purpose agent as an example:
tools: ['*'] means using all available tools (after filtering). Custom agents can specify explicit tool lists or disallowed lists. The resolveAgentTools function handles this complex tool resolution logic:
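A minimal sketch of wildcard, explicit-list, and disallowed-list resolution is shown below. It is not the real resolveAgentTools: the AgentToolConfig shape, the disallowedTools field name, and the rule that a missing tools list means "all" are assumptions.

```typescript
interface AgentToolConfig {
  tools?: readonly string[]; // ['*'] means every available tool
  disallowedTools?: readonly string[]; // hypothetical field name
}

function resolveToolNames(
  available: readonly string[],
  def: AgentToolConfig,
): string[] {
  // Assumption: an agent that lists no tools gets the wildcard.
  const requested = def.tools ?? ["*"];
  const denied = new Set(def.disallowedTools ?? []);
  const base = requested.includes("*")
    ? [...available]
    : available.filter((name) => requested.includes(name));
  // The deny list is applied last, so it wins over both forms above.
  return base.filter((name) => !denied.has(name));
}
```

Filtering the available list, rather than trusting the requested names, means an agent definition cannot grant itself a tool that the earlier filtering stage already removed.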
Team System: Swarm Mode
Beyond the Coordinator/Worker pattern, Claude Code also supports a looser form of multi-agent collaboration — Team (Swarm) mode. In this mode, multiple agents work as "teammates" in parallel, collaborating through a shared task list and messaging system.
TeamCreateTool: Creating a Team
Creating a Team does the following:
Teams and Task Lists have a 1:1 correspondence — each Team has its own task list directory, with task numbers starting from 1:
TeamFile Structure
The Team lead's ID is deterministic — generated by formatAgentId(TEAM_LEAD_NAME, finalTeamName) rather than a random UUID. This allows other teammates to derive the Team lead's ID without querying any registry.
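The value of a deterministic ID can be shown with a small sketch. The name@team format and the "team-lead" constant are purely hypothetical; the real formatAgentId and TEAM_LEAD_NAME may differ. The point is only that the ID is a pure function of two values every teammate already knows.

```typescript
// Hypothetical value; the real TEAM_LEAD_NAME constant is not shown in this article.
const TEAM_LEAD_NAME_SKETCH = "team-lead";

// Hypothetical format; the real formatAgentId may combine the parts differently.
function formatAgentIdSketch(agentName: string, teamName: string): string {
  return `${agentName}@${teamName}`;
}

// Any teammate can derive the lead's ID locally, without a registry lookup.
function leadIdFor(teamName: string): string {
  return formatAgentIdSketch(TEAM_LEAD_NAME_SKETCH, teamName);
}
```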
Spawning Teammates
In Team mode, teammates are spawned by passing team_name and name parameters through AgentTool. This triggers the spawnTeammate() path:
Note an important constraint — teammates cannot spawn teammates:
The Team's member list is flat — only the Team lead can add members. This prevents unbounded nesting of teammate relationships, simplifying communication and lifecycle management.
TeamDeleteTool: Cleaning Up a Team
When a Team is finished, TeamDeleteTool handles cleaning up all resources:
An important safety check: you can't delete a Team while it still has active members. All teammates must first be gracefully terminated via the SendMessage shutdown_request protocol.
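The flat-membership rule and the deletion check can be sketched together on a small in-memory model. Team, Member, addTeammate, and deleteTeam are hypothetical names, and the error messages are invented for illustration.

```typescript
interface Member {
  name: string;
  active: boolean;
}

interface Team {
  name: string;
  lead: string;
  members: Member[];
}

// Only the lead may add members, which keeps the member list flat.
function addTeammate(team: Team, requester: string, name: string): void {
  if (requester !== team.lead) {
    throw new Error("Only the team lead can add teammates");
  }
  team.members.push({ name, active: true });
}

// Deletion is refused while any member is still active.
function deleteTeam(teams: Map<string, Team>, teamName: string): void {
  const team = teams.get(teamName);
  if (!team) throw new Error(`Team '${teamName}' not found`);
  const active = team.members.filter((m) => m.active).map((m) => m.name);
  if (active.length > 0) {
    throw new Error(`Cannot delete team, active members: ${active.join(", ")}`);
  }
  teams.delete(teamName);
}
```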
Team Workflow
The complete Team workflow as described in the system prompt:
Fork Sub-Agent: Context Inheritance
Beyond Coordinator/Worker and Team swarm, there's a third multi-agent pattern — Fork. A Fork sub-agent inherits the parent agent's complete conversation context (including the system prompt and all history messages), making it suitable for tasks that don't need intermediate tool outputs retained in the parent context.
Fork and Coordinator mode are mutually exclusive, because the Coordinator already has its own orchestration model. Fork's advantages include:
- Cache-friendly: Fork sub-agents use the parent agent's exact system prompt and tool set (useExactTools: true), so the API request prefix is identical to the parent's, enabling prompt cache reuse.
- Context inheritance: No need to re-explain background in the prompt; the sub-agent already "knows" everything.
- Imperative prompts: Since context is inherited, the prompt only needs to be a "what to do" instruction, not a complete "here's the situation + what to do" description.
Agent Persistent Memory
Worker agents can have persistent memory that saves learned knowledge across sessions. The memory system supports three scopes:
Each scope has a different storage location:
- user: ~/.claude/agent-memory/{agentType}/ (cross-project general knowledge)
- project: .claude/agent-memory/{agentType}/ (project-specific knowledge, shareable via version control)
- local: .claude/agent-memory-local/{agentType}/ (machine-specific knowledge, not version-controlled)
The memory entry file is always MEMORY.md:
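The mapping from scope to file location can be sketched as a small pure function. The directory layout and the MEMORY.md file name come from the list above; the function name memoryFilePath and its parameters are hypothetical, and the home and project directories are passed in explicitly to keep the sketch self-contained.

```typescript
type MemoryScope = "user" | "project" | "local";

function memoryFilePath(
  scope: MemoryScope,
  agentType: string,
  homeDir: string,
  projectDir: string,
): string {
  const base: Record<MemoryScope, string> = {
    user: `${homeDir}/.claude/agent-memory`, // shared across projects
    project: `${projectDir}/.claude/agent-memory`, // can be committed
    local: `${projectDir}/.claude/agent-memory-local`, // stays on this machine
  };
  // The entry file has the same name in every scope.
  return `${base[scope]}/${agentType}/MEMORY.md`;
}
```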
Transferable Patterns: Architecture Essentials for Building Multi-Agent Systems
From Claude Code's multi-agent implementation, we can distill several general-purpose architectural patterns applicable to building any multi-agent system.
Pattern 1: Role Separation and Tool Constraints
The Coordinator has only management tools; Workers have only execution tools. This hard separation prevents role confusion — the Coordinator won't be "tempted" to modify files directly, and Workers won't try to orchestrate other Workers.
The key to implementing this separation is tool set filtering: determine which tools an agent can use at spawn time, rather than relying on instructions in the system prompt. An LLM might not follow a "don't use tool X" instruction, but if the tool simply isn't in the available list, it physically cannot use it.
Pattern 2: Async Notifications Rather Than Synchronous Waiting
Worker results are returned via async notifications (<task-notification>) rather than blocking the Coordinator to wait. This allows the Coordinator to orchestrate multiple Workers simultaneously.
The notification format uses XML instead of JSON because XML tags are easier for LLMs to recognize during streaming processing. The <task-notification> opening tag provides a deterministic signal, preventing the LLM from confusing Worker results with user messages.
Pattern 3: Shared Lock-Free State
The Scratchpad directory provides cross-Worker state sharing without any locking mechanism. This works well in practice because the Coordinator typically ensures that Workers reading and writing to the same area don't run simultaneously.
This design is far simpler and less error-prone than introducing file locks — deadlocks are especially dangerous in multi-agent systems because LLMs don't have the ability to "detect and recover from deadlocks."
Pattern 4: Mailbox Communication Model
The mailbox communication model in Team mode — where the sender writes to the recipient's mailbox file — is a classic asynchronous messaging pattern. It completely decouples the execution timing of sender and receiver, naturally supporting offline messaging.
Pattern 5: Flat Membership Structure
The Team's member list is flat — only the lead can add members; teammates cannot spawn teammates. This prevents uncontrolled growth of organizational structure, simplifies the communication topology (degenerating from an arbitrary graph to a star), and reduces system complexity.
Conclusion
Claude Code's multi-agent system demonstrates a pragmatic approach to distributed AI system design. It doesn't pursue theoretical perfection — no distributed transactions, no consensus algorithms, no formal verification — but instead solves real problems with simple mechanisms:
- Role separation is enforced through tool set filtering, not just prompt instructions
- Task notifications use XML format injected as user-role messages, letting the LLM process them naturally
- State sharing works through a Scratchpad directory in the filesystem — no locks, no protocols
- Lifecycle management uses structured message protocols (shutdown_request/response) for graceful shutdown
- MCP inheritance uses a merge-plus-independent-cleanup approach, letting child agents incrementally extend parent agent capabilities
The common thread across these design choices is that they all find the balance between "good enough" and "over-engineering." In the rapidly evolving field of AI agent systems, this pragmatic engineering philosophy may be more valuable than pursuing perfect architecture.