Multi-Agent Orchestration: How the Coordinator Pattern Enables AI Collaboration

A deep dive into Claude Code's multi-agent architecture — Coordinator/Worker role separation, AgentTool spawning mechanics, task notifications, Scratchpad sharing, and Team swarm mode

The Problem

What do you do when a single AI agent isn't enough? The intuitive answer is: spawn more agents. But the problem is far from that simple. How do multiple agents divide work? How do they communicate? How do they share context without conflicts? When a worker finishes a task, how does it report results to the coordinator? If a worker heads in the wrong direction, how do you terminate it promptly?

At the heart of these questions lies a classic distributed systems design challenge — except here the "nodes" aren't servers but LLM instances. Claude Code answers these questions with an elegant multi-agent architecture whose core pattern is called Coordinator/Worker.

This article dives deep into Claude Code's multi-agent system, starting from the overall architecture of the Coordinator pattern and analyzing layer by layer: AgentTool's agent spawning mechanism, the Worker's restricted tool set, the XML protocol for task notifications, SendMessageTool's cross-agent communication, Scratchpad directory persistent state sharing, background execution and progress tracking, Team swarm mode, and MCP Server inheritance and isolation across agents.

Overall Architecture of the Coordinator Pattern

Claude Code's multi-agent system is built on a clear role separation model: the Coordinator is responsible for understanding user intent, decomposing tasks, and synthesizing results; Workers are responsible for executing concrete work (research, implementation, verification). The Coordinator doesn't directly operate on files or run commands — it only has a minimal set of "management tools."

The Design Philosophy of Role Separation

The philosophy behind this design is simple: let the Coordinator focus on "thinking" and Workers focus on "doing." The Coordinator's tool set is strictly limited to four:

src/constants/tools.ts:107-112
TypeScript
107export const COORDINATOR_MODE_ALLOWED_TOOLS = new Set([
108 AGENT_TOOL_NAME, // 'Agent' — spawn new workers
109 TASK_STOP_TOOL_NAME, // 'TaskStop' — stop running workers
110 SEND_MESSAGE_TOOL_NAME, // 'SendMessage' — send messages to existing workers
111 SYNTHETIC_OUTPUT_TOOL_NAME, // 'SyntheticOutput' — internal output tool
112])

The Coordinator cannot read or write files, execute shell commands, or search code. It can only do three things: spawn workers, stop workers, and send messages to workers. This extreme constraint forces the Coordinator to be a pure "commander."

Enabling and Detecting Coordinator Mode

Coordinator mode is controlled via the CLAUDE_CODE_COORDINATOR_MODE environment variable:

src/coordinator/coordinatorMode.ts:36-41
TypeScript
36export function isCoordinatorMode(): boolean {
37 if (feature('COORDINATOR_MODE')) {
38 return isEnvTruthy(process.env.CLAUDE_CODE_COORDINATOR_MODE)
39 }
40 return false
41}

There's a double gate here: first, the COORDINATOR_MODE compile-time feature flag must be enabled (using Bun's feature() macro for dead code elimination), then the environment variable must be truthy. This ensures that in builds that don't support Coordinator mode, all related code is completely stripped out.

When resuming a session that was previously run in Coordinator mode, the system automatically matches the mode:

src/coordinator/coordinatorMode.ts:49-78
TypeScript
49export function matchSessionMode(
50 sessionMode: 'coordinator' | 'normal' | undefined,
51): string | undefined {
52 if (!sessionMode) {
53 return undefined
54 }
55 const currentIsCoordinator = isCoordinatorMode()
56 const sessionIsCoordinator = sessionMode === 'coordinator'
57
58 if (currentIsCoordinator === sessionIsCoordinator) {
59 return undefined
60 }
61
62 // Flip the env var — isCoordinatorMode() reads it live, no caching
63 if (sessionIsCoordinator) {
64 process.env.CLAUDE_CODE_COORDINATOR_MODE = '1'
65 } else {
66 delete process.env.CLAUDE_CODE_COORDINATOR_MODE
67 }
68
69 return sessionIsCoordinator
70 ? 'Entered coordinator mode to match resumed session.'
71 : 'Exited coordinator mode to match resumed session.'
72}

This code demonstrates an interesting design choice: mode state is stored in process.env rather than in a state object, because isCoordinatorMode() is called extensively, and reading the environment variable directly avoids introducing any additional state management layer.

Coordinator's System Prompt and Task Workflow

The Coordinator's system prompt defines a complete task workflow with four phases:

...

The concurrency rules defined in the system prompt are highly pragmatic:

src/coordinator/coordinatorMode.ts:213-219
TypeScript
213// Manage concurrency:
214// - **Read-only tasks** (research) — run in parallel freely
215// - **Write-heavy tasks** (implementation) — one at a time per set of files
216// - **Verification** can sometimes run alongside implementation on different file areas

This isn't enforced through code — it relies on the LLM itself understanding and following these rules. This is a notable architectural decision: between "hardcoded concurrency control" and "trusting the LLM's judgment," Claude Code chose the latter, because file conflict scenarios are too varied and complex for hardcoded rules to cover all cases.

AgentTool's Agent Spawning Mechanism

AgentTool is the entry point for the entire multi-agent system — all worker creation goes through it. Its implementation is in src/tools/AgentTool/AgentTool.tsx and is one of the most complex tools in the entire codebase (over 4,000 lines).

Input Schema and Mode Routing

AgentTool's input schema is dynamically composed based on compile-time feature flags:

src/tools/AgentTool/AgentTool.tsx:82-101
TypeScript
82const baseInputSchema = lazySchema(() => z.object({
83 description: z.string().describe('A short (3-5 word) description of the task'),
84 prompt: z.string().describe('The task for the agent to perform'),
85 subagent_type: z.string().optional()
86 .describe('The type of specialized agent to use for this task'),
87 model: z.enum(['sonnet', 'opus', 'haiku']).optional()
88 .describe("Optional model override for this agent."),
89 run_in_background: z.boolean().optional()
90 .describe('Set to true to run this agent in the background.')
91}));
92
93const fullInputSchema = lazySchema(() => {
94 const multiAgentInputSchema = z.object({
95 name: z.string().optional()
96 .describe('Name for the spawned agent.'),
97 team_name: z.string().optional()
98 .describe('Team name for spawning.'),
99 mode: permissionModeSchema().optional()
100 .describe('Permission mode for spawned teammate.'),
101 });
102 return baseInputSchema().merge(multiAgentInputSchema).extend({
103 isolation: z.enum(['worktree']).optional()
104 .describe('Isolation mode. "worktree" creates a temporary git worktree.'),
105 cwd: z.string().optional()
106 .describe('Absolute path to run the agent in.')
107 });
108});

lazySchema() wrapping is used here to ensure Zod schemas are only instantiated on first use. Note that the isolation field's enum values differ based on build type — internal builds support 'worktree' | 'remote', while external builds only support 'worktree'.

Agent Type Resolution and Routing

When AgentTool.call() is invoked, it first needs to resolve the target agent's type. There are three paths:

...

The key routing logic is as follows:

src/tools/AgentTool/AgentTool.tsx:322-356
TypeScript
322const effectiveType = subagent_type
323 ?? (isForkSubagentEnabled() ? undefined : GENERAL_PURPOSE_AGENT.agentType);
324const isForkPath = effectiveType === undefined;
325
326let selectedAgent: AgentDefinition;
327if (isForkPath) {
328 // Recursive fork guard: fork children cannot fork again
329 if (toolUseContext.options.querySource ===
330 `agent:builtin:${FORK_AGENT.agentType}`
331 || isInForkChild(toolUseContext.messages)) {
332 throw new Error(
333 'Fork is not available inside a forked worker.'
334 );
335 }
336 selectedAgent = FORK_AGENT;
337} else {
338 const allAgents = toolUseContext.options.agentDefinitions.activeAgents;
339 const agents = filterDeniedAgents(allAgents,
340 appState.toolPermissionContext, AGENT_TOOL_NAME);
341 const found = agents.find(agent => agent.agentType === effectiveType);
342 if (!found) {
343 // Distinguish between "doesn't exist" and "denied by permission rule"
344 const agentExistsButDenied = allAgents.find(
345 agent => agent.agentType === effectiveType
346 );
347 if (agentExistsButDenied) {
348 const denyRule = getDenyRuleForAgent(
349 appState.toolPermissionContext, AGENT_TOOL_NAME, effectiveType
350 );
351 throw new Error(
352 `Agent type '${effectiveType}' has been denied by permission rule.`
353 );
354 }
355 throw new Error(`Agent type '${effectiveType}' not found.`);
356 }
357 selectedAgent = found;
358}

Several important design details here:

  1. Recursive fork guard: Uses dual detection — querySource (compression-resistant since it's set at spawn time) and message scanning (fallback path), ensuring fork children don't recurse infinitely.

  2. Permission filtering: Agent types can be denied by permission rules (configured via Agent(AgentName) syntax in settings), and error messages distinguish between "doesn't exist" and "denied."

  3. Model override: In Coordinator mode, the model parameter is forcibly ignored (set to undefined), because workers need the default high-capability model to complete substantive tasks.

Worker Tool Set Assembly

The Worker's tool set isn't simply inherited from the Coordinator — it's assembled independently:

src/tools/AgentTool/AgentTool.tsx:573-577
TypeScript
573const workerPermissionContext = {
574 ...appState.toolPermissionContext,
575 mode: selectedAgent.permissionMode ?? 'acceptEdits'
576};
577const workerTools = assembleToolPool(
578 workerPermissionContext, appState.mcp.tools
579);

Workers get their own permission mode (defaulting to 'acceptEdits'), then independently assemble available tools from the global tool pool. This means the Worker's tool set is completely independent of the Coordinator's restrictions — even though the Coordinator only has 4 management tools, Workers can still use the full range of file operations, code search, and other tools.

Worktree Isolation

When isolation: 'worktree' is specified, AgentTool creates a temporary git worktree, letting the Worker operate on an isolated copy of the codebase:

src/tools/AgentTool/AgentTool.tsx:590-593
TypeScript
590if (effectiveIsolation === 'worktree') {
591 const slug = `agent-${earlyAgentId.slice(0, 8)}`;
592 worktreeInfo = await createAgentWorktree(slug);
593}

Worktree isolation provides two important benefits: Workers can freely modify code without affecting the main workspace; and multiple Workers can modify code in parallel across different worktrees. When a Worker finishes, if no changes were made in the worktree, it's automatically cleaned up; if changes exist, the worktree path and branch name are returned to the Coordinator.

The Worker's Restricted Tool Set

As asynchronously running sub-agents, Workers have a carefully designed restricted tool set. These restrictions are defined in the ASYNC_AGENT_ALLOWED_TOOLS set:

src/constants/tools.ts:55-71
TypeScript
55export const ASYNC_AGENT_ALLOWED_TOOLS = new Set([
56 FILE_READ_TOOL_NAME, // Read files
57 WEB_SEARCH_TOOL_NAME, // Web search
58 TODO_WRITE_TOOL_NAME, // Write todos
59 GREP_TOOL_NAME, // Content search
60 WEB_FETCH_TOOL_NAME, // Fetch web pages
61 GLOB_TOOL_NAME, // File pattern matching
62 ...SHELL_TOOL_NAMES, // Bash / PowerShell
63 FILE_EDIT_TOOL_NAME, // Edit files
64 FILE_WRITE_TOOL_NAME, // Write files
65 NOTEBOOK_EDIT_TOOL_NAME, // Edit notebooks
66 SKILL_TOOL_NAME, // Skill invocation
67 SYNTHETIC_OUTPUT_TOOL_NAME,
68 TOOL_SEARCH_TOOL_NAME, // Tool search
69 ENTER_WORKTREE_TOOL_NAME, // Enter worktree
70 EXIT_WORKTREE_TOOL_NAME, // Exit worktree
71])

Explicitly excluded tools include:

src/constants/tools.ts:36-46
TypeScript
36export const ALL_AGENT_DISALLOWED_TOOLS = new Set([
37 TASK_OUTPUT_TOOL_NAME, // Prevent recursion
38 EXIT_PLAN_MODE_V2_TOOL_NAME, // Plan mode is a main thread abstraction
39 ENTER_PLAN_MODE_TOOL_NAME,
40 AGENT_TOOL_NAME, // Prevent agent recursive spawning (except ant users)
41 ASK_USER_QUESTION_TOOL_NAME, // Async workers cannot ask users questions
42 TASK_STOP_TOOL_NAME, // Requires main thread task state
43 WORKFLOW_TOOL_NAME, // Prevent workflow recursion
44])

The actual tool filtering logic is implemented in filterToolsForAgent:

src/tools/AgentTool/agentToolUtils.ts:70-116
TypeScript
70export function filterToolsForAgent({
71 tools, isBuiltIn, isAsync = false, permissionMode,
72}: { tools: Tools; isBuiltIn: boolean; isAsync?: boolean;
73 permissionMode?: PermissionMode }): Tools {
74 return tools.filter(tool => {
75 // MCP tools are always allowed
76 if (tool.name.startsWith('mcp__')) {
77 return true
78 }
79 // Allow ExitPlanMode in plan mode
80 if (toolMatchesName(tool, EXIT_PLAN_MODE_V2_TOOL_NAME)
81 && permissionMode === 'plan') {
82 return true
83 }
84 // Global disallow list
85 if (ALL_AGENT_DISALLOWED_TOOLS.has(tool.name)) {
86 return false
87 }
88 // Additional disallow list for custom agents
89 if (!isBuiltIn && CUSTOM_AGENT_DISALLOWED_TOOLS.has(tool.name)) {
90 return false
91 }
92 // Async agent allowlist filtering
93 if (isAsync && !ASYNC_AGENT_ALLOWED_TOOLS.has(tool.name)) {
94 // Special case: in-process teammates can use AgentTool and task tools
95 if (isAgentSwarmsEnabled() && isInProcessTeammate()) {
96 if (toolMatchesName(tool, AGENT_TOOL_NAME)) {
97 return true
98 }
99 if (IN_PROCESS_TEAMMATE_ALLOWED_TOOLS.has(tool.name)) {
100 return true
101 }
102 }
103 return false
104 }
105 return true
106 })
107}

There's a particularly interesting layered design here. For in-process teammates (members in Team mode), an additional set of task management tools is allowed:

src/constants/tools.ts:77-88
TypeScript
77export const IN_PROCESS_TEAMMATE_ALLOWED_TOOLS = new Set([
78 TASK_CREATE_TOOL_NAME,
79 TASK_GET_TOOL_NAME,
80 TASK_LIST_TOOL_NAME,
81 TASK_UPDATE_TOOL_NAME,
82 SEND_MESSAGE_TOOL_NAME,
83])

This enables teammates within a Team to create tasks, update task status, and send messages to other teammates — all capabilities essential for swarm collaboration.

Task Notification Mechanism

When a Worker completes a task, it doesn't return results via a function call. Instead, it sends results to the Coordinator through a carefully designed XML-format notification injected as a user-role message into the Coordinator's conversation.

Notification Format

XML
1<task-notification>
2 <task-id>{agentId}</task-id>
3 <status>completed|failed|killed</status>
4 <summary>{human-readable status summary}</summary>
5 <result>{agent's final text response}</result>
6 <usage>
7 <total_tokens>N</total_tokens>
8 <tool_uses>N</tool_uses>
9 <duration_ms>N</duration_ms>
10 </usage>
11</task-notification>

Why XML instead of JSON? Because the <task-notification> opening tag provides a clear, easily recognizable signal for the LLM — the Coordinator's system prompt explicitly states "distinguish notifications from user messages by the <task-notification> opening tag." XML's tag structure is easier for LLMs to recognize and parse during streaming generation than JSON's curly braces.

Notification Injection Path

Notifications are injected into the Coordinator's message stream via enqueueAgentNotification. The entire async agent lifecycle management is in the runAsyncAgentLifecycle function:

sequenceDiagram
    participant Coord as Coordinator
    participant AT as AgentTool
    participant Worker as Worker Agent
    participant Task as Task Registry

    Coord->>AT: Agent({ prompt: "...", run_in_background: true })
    AT->>Task: registerAsyncAgent(agentId)
    AT->>Worker: runAgent() (launch in background)
    AT-->>Coord: { status: "async_launched", agentId: "..." }

    Note over Worker: Worker executes task...
    Worker->>Worker: Uses tools (Read/Edit/Bash...)

    alt Completed successfully
        Worker->>Task: completeAsyncAgent(result)
        Task->>Coord: enqueueAgentNotification(XML)
    else Execution failed
        Worker->>Task: failAsyncAgent(error)
        Task->>Coord: enqueueAgentNotification(XML)
    else Killed
        Coord->>Task: killAsyncAgent(agentId)
        Task->>Coord: enqueueAgentNotification(XML)
    end

While Workers run in the background, the Coordinator can continue interacting with the user or launch more Workers. Notifications arrive as user-role messages, and the Coordinator processes them at the start of its next turn.

One-Shot Optimization

For certain built-in agents (like Explore and Plan) that only run once and won't be continued by the Coordinator via SendMessage, the notification omits the agentId and SendMessage usage instructions to save tokens:

src/tools/AgentTool/constants.ts:9-12
TypeScript
9export const ONE_SHOT_BUILTIN_AGENT_TYPES: ReadonlySet<string> = new Set([
10 'Explore',
11 'Plan',
12])

Comments note that this optimization "saves ~135 chars x 34M Explore runs/week" — at scale, every token matters.

SendMessageTool: Cross-Agent Communication

SendMessageTool is the core tool for inter-agent communication. It's used not only for the Coordinator to send follow-up instructions to Workers but also for direct communication between teammates in Team mode.

Message Routing Logic

SendMessageTool's message routing is highly refined, handling multiple target types:

src/tools/SendMessageTool/SendMessageTool.ts:67-87
TypeScript
67const inputSchema = lazySchema(() =>
68 z.object({
69 to: z.string().describe(
70 'Recipient: teammate name, or "*" for broadcast to all teammates'
71 ),
72 summary: z.string().optional().describe(
73 'A 5-10 word summary shown as a preview in the UI'
74 ),
75 message: z.union([
76 z.string().describe('Plain text message content'),
77 StructuredMessage(),
78 ]),
79 }),
80)

The to field can be:

  • A teammate name (e.g., "researcher") — sends to a specific teammate
  • "*" — broadcasts to all teammates
  • "uds:/path/to.sock" — cross-process communication (via Unix Domain Socket)
  • "bridge:session_..." — cross-machine communication (via Remote Control)

Sending Messages to Existing Workers

When the Coordinator uses SendMessage to send a message to a completed or running Worker, the handling logic has three branches:

src/tools/SendMessageTool/SendMessageTool.ts:802-873
TypeScript
802if (typeof input.message === 'string' && input.to !== '*') {
803 const appState = context.getAppState()
804 const registered = appState.agentNameRegistry.get(input.to)
805 const agentId = registered ?? toAgentId(input.to)
806
807 if (agentId) {
808 const task = appState.tasks[agentId]
809 if (isLocalAgentTask(task) && !isMainSessionTask(task)) {
810 if (task.status === 'running') {
811 // Worker still running: queue message for delivery on next tool turn
812 queuePendingMessage(agentId, input.message, ...)
813 return { data: {
814 success: true,
815 message: `Message queued for delivery to ${input.to}.`
816 }}
817 }
818 // Worker has stopped: auto-resume
819 const result = await resumeAgentBackground({
820 agentId, prompt: input.message, ...
821 })
822 return { data: {
823 success: true,
824 message: `Agent "${input.to}" was stopped; resumed it in background.`
825 }}
826 }
827 }
828}

Two important behaviors here:

  1. Running Worker: The message is queued (queuePendingMessage) and delivered during the Worker's next tool call turn. This avoids interrupting the Worker's current work.

  2. Stopped Worker: The Worker is automatically resumed (resumeAgentBackground), loading its previous conversation context from the on-disk transcript, then continuing execution with the new message as a continuation prompt. This allows the Coordinator to repeatedly leverage an existing Worker's accumulated context.

Structured Message Protocol

Beyond plain text messages, SendMessageTool also supports structured messages for coordination operations in Team mode:

src/tools/SendMessageTool/SendMessageTool.ts:46-65
TypeScript
46const StructuredMessage = lazySchema(() =>
47 z.discriminatedUnion('type', [
48 z.object({
49 type: z.literal('shutdown_request'),
50 reason: z.string().optional(),
51 }),
52 z.object({
53 type: z.literal('shutdown_response'),
54 request_id: z.string(),
55 approve: semanticBoolean(),
56 reason: z.string().optional(),
57 }),
58 z.object({
59 type: z.literal('plan_approval_response'),
60 request_id: z.string(),
61 approve: semanticBoolean(),
62 feedback: z.string().optional(),
63 }),
64 ]),
65)

These structured messages implement three coordination protocols:

  • Shutdown protocol: The Team lead sends a shutdown_request to a teammate, who replies with a shutdown_response (approve or deny). An approved shutdown triggers the teammate process's gracefulShutdown.
  • Plan approval protocol: In plan permission mode, teammates need the Team lead's approval before executing implementation.
  • Broadcast: to: "*" broadcasts the message to all teammates, iterating through all members in the team file (excluding the sender).

Mailbox Communication Model

Message passing in Team mode is based on a mailbox model — messages are written to the recipient's mailbox file rather than pushed directly:

src/tools/SendMessageTool/SendMessageTool.ts:161-170
TypeScript
161await writeToMailbox(
162 recipientName,
163 {
164 from: senderName,
165 text: content,
166 summary,
167 timestamp: new Date().toISOString(),
168 color: senderColor,
169 },
170 teamName,
171)

The benefit of this design is complete decoupling of sender and receiver — the sender doesn't need to wait for the receiver to be online; messages are delivered the next time the receiver polls its mailbox.

Scratchpad Directory: Persistent State Sharing Across Workers

How do multiple Workers share information? Claude Code provides a mechanism called the Scratchpad — a session-level temporary directory that all Workers can freely read from and write to without permission prompts.

Scratchpad Location and Permissions

src/utils/permissions/filesystem.ts:384-386
TypeScript
384export function getScratchpadDir(): string {
385 return join(getProjectTempDir(), getSessionId(), 'scratchpad')
386}

The path format is /tmp/claude-{uid}/{sanitized-cwd}/{sessionId}/scratchpad/. The directory is created with 0o700 permissions (owner-only access) to ensure security.

How the Coordinator Informs Workers About the Scratchpad

Scratchpad directory information is injected into the Coordinator's context via user context:

src/coordinator/coordinatorMode.ts:80-108
TypeScript
80export function getCoordinatorUserContext(
81 mcpClients: ReadonlyArray<{ name: string }>,
82 scratchpadDir?: string,
83): { [k: string]: string } {
84 if (!isCoordinatorMode()) {
85 return {}
86 }
87
88 let content = `Workers spawned via the ${AGENT_TOOL_NAME} tool have ` +
89 `access to these tools: ${workerTools}`
90
91 if (mcpClients.length > 0) {
92 const serverNames = mcpClients.map(c => c.name).join(', ')
93 content += `\n\nWorkers also have access to MCP tools from ` +
94 `connected MCP servers: ${serverNames}`
95 }
96
97 if (scratchpadDir && isScratchpadGateEnabled()) {
98 content += `\n\nScratchpad directory: ${scratchpadDir}\n` +
99 `Workers can read and write here without permission prompts. ` +
100 `Use this for durable cross-worker knowledge — ` +
101 `structure files however fits the work.`
102 }
103
104 return { workerToolsContext: content }
105}

Note the key phrase in the prompt: "structure files however fits the work" — the system doesn't prescribe file structure within the Scratchpad, letting the Coordinator and Workers organize it as the task demands. This flexibility is intentional.

Security: Path Traversal Protection

Scratchpad path detection includes path traversal protection:

src/utils/permissions/filesystem.ts:410-423
TypeScript
410function isScratchpadPath(absolutePath: string): boolean {
411 if (!isScratchpadEnabled()) {
412 return false
413 }
414 const scratchpadDir = getScratchpadDir()
415 // SECURITY: Normalize the path to resolve .. segments before checking
416 const normalizedPath = normalize(absolutePath)
417 return (
418 normalizedPath === scratchpadDir ||
419 normalizedPath.startsWith(scratchpadDir + sep)
420 )
421}

The comment explicitly warns about the attack vector: without normalization, a path like /tmp/claude-0/proj/session/scratchpad/../../../etc/passwd would pass the startsWith check but actually write to /etc/passwd. The normalize() call resolves .. segments, closing this vulnerability.

Background Execution and Progress Tracking

Synchronous vs. Asynchronous Execution

AgentTool supports two execution modes: synchronous (foreground) and asynchronous (background). The logic for determining which mode to use combines multiple signals:

src/tools/AgentTool/AgentTool.tsx:557-567
TypeScript
557const shouldRunAsync = (
558 run_in_background === true || // Explicitly requested background
559 selectedAgent.background === true || // Agent definition requires background
560 isCoordinator || // All async in Coordinator mode
561 forceAsync || // All async in Fork experiment mode
562 assistantForceAsync || // Force async in Assistant mode
563 (proactiveModule?.isProactiveActive() ?? false) // Proactive mode
564) && !isBackgroundTasksDisabled; // Global disable switch

In Coordinator mode, all agents run asynchronously. This is because the Coordinator's core value lies in parallel orchestration — if Workers ran synchronously, the Coordinator couldn't launch multiple Workers simultaneously.

Auto-Backgrounding

There's also an auto-backgrounding mechanism — when a Worker runs for more than a certain time (120 seconds), it's automatically moved to the background:

src/tools/AgentTool/AgentTool.tsx:72-77
TypeScript
72function getAutoBackgroundMs(): number {
73 if (isEnvTruthy(process.env.CLAUDE_AUTO_BACKGROUND_TASKS)
74 || getFeatureValue_CACHED_MAY_BE_STALE(
75 'tengu_auto_background_agents', false)) {
76 return 120_000;
77 }
78 return 0;
79}

Agent Resume Mechanism

When a stopped agent needs to be resumed, the resumeAgentBackground function is responsible for rebuilding conversation context from the on-disk transcript:

src/tools/AgentTool/resumeAgent.ts:42-60
TypeScript
42export async function resumeAgentBackground({
43 agentId,
44 prompt,
45 toolUseContext,
46 canUseTool,
47 invokingRequestId,
48}: {
49 agentId: string
50 prompt: string
51 toolUseContext: ToolUseContext
52 canUseTool: CanUseToolFn
53 invokingRequestId?: string
54}): Promise<ResumeAgentResult> {
55 const startTime = Date.now()
56 const appState = toolUseContext.getAppState()
57 const rootSetAppState =
58 toolUseContext.setAppStateForTasks ?? toolUseContext.setAppState
59 // ...
60}

The resume process reads the agent's previous transcript (including all tool calls and results), rebuilds the message history, then adds the new prompt as a user message at the end. This gives the resumed agent complete context from its previous execution.

runAgent: The Worker's Execution Engine

runAgent is the Worker's core execution function. It's an async generator responsible for initializing MCP servers, building context, and running the query loop.

MCP Server Inheritance and Isolation

Agent definitions can declare their own MCP servers, which are incremental extensions of the parent context's MCP clients:

src/tools/AgentTool/runAgent.ts:95-110
TypeScript
95async function initializeAgentMcpServers(
96 agentDefinition: AgentDefinition,
97 parentClients: MCPServerConnection[],
98): Promise<{
99 clients: MCPServerConnection[]
100 tools: Tools
101 cleanup: () => Promise<void>
102}> {
103 if (!agentDefinition.mcpServers?.length) {
104 return {
105 clients: parentClients, // No custom MCP: directly inherit parent clients
106 tools: [],
107 cleanup: async () => {},
108 }
109 }
110 // ...
111}

MCP servers can be referenced in two ways:

  1. String reference: References a configured MCP server by name, using the memoized connectToServer to share the connection.
  2. Inline definition: A { [name]: config } format for a brand new MCP server configuration, requiring cleanup when the agent finishes.
src/tools/AgentTool/runAgent.ts:135-175
TypeScript
135for (const spec of agentDefinition.mcpServers) {
136 if (typeof spec === 'string') {
137 // Reference by name — use memoized connectToServer to share connection
138 name = spec
139 config = getMcpConfigByName(spec)
140 } else {
141 // Inline definition — agent-exclusive, needs cleanup on exit
142 const [serverName, serverConfig] = Object.entries(spec)[0]!
143 name = serverName
144 config = { ...serverConfig, scope: 'dynamic' }
145 isNewlyCreated = true
146 }
147
148 const client = await connectToServer(name, config)
149 agentClients.push(client)
150 if (isNewlyCreated) {
151 newlyCreatedClients.push(client)
152 }
153}

A key security constraint: when MCP is locked to plugin-only mode, user-controlled agents' frontmatter MCP servers are skipped, but plugin, built-in, and policySettings agents' MCP are unaffected since they come from admin-trusted sources:

src/tools/AgentTool/runAgent.ts:117-127
TypeScript
117const agentIsAdminTrusted = isSourceAdminTrusted(agentDefinition.source)
118if (isRestrictedToPluginOnly('mcp') && !agentIsAdminTrusted) {
119 logForDebugging(
120 `[Agent: ${agentDefinition.agentType}] Skipping MCP servers: ` +
121 `strictPluginOnlyCustomization locks MCP to plugin-only`
122 )
123 return { clients: parentClients, tools: [], cleanup: async () => {} }
124}

The cleanup function only cleans up newly created clients; shared clients are managed by the parent context:

src/tools/AgentTool/runAgent.ts:197-210
TypeScript
197const cleanup = async () => {
198 for (const client of newlyCreatedClients) {
199 if (client.type === 'connected') {
200 try {
201 await client.cleanup()
202 } catch (error) {
203 logForDebugging(
204 `Error cleaning up MCP server '${client.name}': ${error}`
205 )
206 }
207 }
208 }
209}
210
211return {
212 clients: [...parentClients, ...agentClients], // Merge parent + agent-specific
213 tools: agentTools,
214 cleanup,
215}

Agent Definitions and Tool Control

Each agent's capabilities are controlled by its AgentDefinition. Taking the built-in general-purpose agent as an example:

src/tools/AgentTool/built-in/generalPurposeAgent.ts:25-34
TypeScript
25export const GENERAL_PURPOSE_AGENT: BuiltInAgentDefinition = {
26 agentType: 'general-purpose',
27 whenToUse: 'General-purpose agent for researching complex questions...',
28 tools: ['*'], // Use all available tools
29 source: 'built-in',
30 baseDir: 'built-in',
31 // model intentionally omitted — uses getDefaultSubagentModel()
32 getSystemPrompt: getGeneralPurposeSystemPrompt,
33}

tools: ['*'] means using all available tools (after filtering). Custom agents can specify explicit tool lists or disallowed lists. The resolveAgentTools function handles this complex tool resolution logic:

src/tools/AgentTool/agentToolUtils.ts:122-173
TypeScript
122export function resolveAgentTools(
123 agentDefinition, availableTools, isAsync = false, isMainThread = false,
124): ResolvedAgentTools {
125 const filteredAvailableTools = isMainThread
126 ? availableTools
127 : filterToolsForAgent({
128 tools: availableTools,
129 isBuiltIn: source === 'built-in',
130 isAsync,
131 permissionMode,
132 })
133
134 // Create disallowed tool set
135 const disallowedToolSet = new Set(
136 disallowedTools?.map(toolSpec => {
137 const { toolName } = permissionRuleValueFromString(toolSpec)
138 return toolName
139 }) ?? [],
140 )
141
142 // Filter
143 const allowedAvailableTools = filteredAvailableTools.filter(
144 tool => !disallowedToolSet.has(tool.name),
145 )
146
147 // Wildcard handling
148 const hasWildcard = agentTools === undefined
149 || (agentTools.length === 1 && agentTools[0] === '*')
150 if (hasWildcard) {
151 return {
152 hasWildcard: true,
153 validTools: [],
154 invalidTools: [],
155 resolvedTools: allowedAvailableTools,
156 }
157 }
158 // ...
159}

Team System: Swarm Mode

Beyond the Coordinator/Worker pattern, Claude Code also supports a looser form of multi-agent collaboration — Team (Swarm) mode. In this mode, multiple agents work as "teammates" in parallel, collaborating through a shared task list and messaging system.

TeamCreateTool: Creating a Team

src/tools/TeamCreateTool/TeamCreateTool.ts:37-49
TypeScript
37const inputSchema = lazySchema(() =>
38 z.strictObject({
39 team_name: z.string()
40 .describe('Name for the new team to create.'),
41 description: z.string().optional()
42 .describe('Team description/purpose.'),
43 agent_type: z.string().optional()
44 .describe('Type/role of the team lead.'),
45 }),
46)

Creating a Team does the following:

...

Teams and Task Lists have a 1:1 correspondence — each Team has its own task list directory, with task numbers starting from 1:

src/tools/TeamCreateTool/TeamCreateTool.ts:182-191
TypeScript
182const taskListId = sanitizeName(finalTeamName)
183await resetTaskList(taskListId)
184await ensureTasksDir(taskListId)
185
186// Register team name so getTaskListId() returns it
187setLeaderTeamName(sanitizeName(finalTeamName))

TeamFile Structure

src/tools/TeamCreateTool/TeamCreateTool.ts:157-175
TypeScript
157const teamFile: TeamFile = {
158 name: finalTeamName,
159 description: _description,
160 createdAt: Date.now(),
161 leadAgentId,
162 leadSessionId: getSessionId(),
163 members: [
164 {
165 agentId: leadAgentId,
166 name: TEAM_LEAD_NAME, // 'team-lead'
167 agentType: leadAgentType,
168 model: leadModel,
169 joinedAt: Date.now(),
170 tmuxPaneId: '',
171 cwd: getCwd(),
172 subscriptions: [],
173 },
174 ],
175}

The Team lead's ID is deterministic — generated by formatAgentId(TEAM_LEAD_NAME, finalTeamName) rather than a random UUID. This allows other teammates to derive the Team lead's ID without querying any registry.

Spawning Teammates

In Team mode, teammates are spawned by passing team_name and name parameters through AgentTool. This triggers the spawnTeammate() path:

src/tools/AgentTool/AgentTool.tsx:284-316
TypeScript
284if (teamName && name) {
285 const result = await spawnTeammate({
286 name,
287 prompt,
288 description,
289 team_name: teamName,
290 use_splitpane: true,
291 plan_mode_required: spawnMode === 'plan',
292 model: model ?? agentDef?.model,
293 agent_type: subagent_type,
294 invokingRequestId: assistantMessage?.requestId
295 }, toolUseContext);
296
297 const spawnResult: TeammateSpawnedOutput = {
298 status: 'teammate_spawned' as const,
299 prompt,
300 ...result.data
301 };
302 // ...
303}

Note an important constraint — teammates cannot spawn teammates:

src/tools/AgentTool/AgentTool.tsx:272-274
TypeScript
272if (isTeammate() && teamName && name) {
273 throw new Error(
274 'Teammates cannot spawn other teammates — the team roster is flat.'
275 );
276}

The Team's member list is flat — only the Team lead can add members. This prevents unbounded nesting of teammate relationships, simplifying communication and lifecycle management.

TeamDeleteTool: Cleaning Up a Team

When a Team is finished, TeamDeleteTool handles cleaning up all resources:

src/tools/TeamDeleteTool/TeamDeleteTool.ts:71-135
TypeScript
71async call(_input, context) {
72 const appState = getAppState()
73 const teamName = appState.teamContext?.teamName
74
75 if (teamName) {
76 const teamFile = readTeamFile(teamName)
77 if (teamFile) {
78 // Only check truly active members (filter out idle/dead)
79 const nonLeadMembers = teamFile.members.filter(
80 m => m.name !== TEAM_LEAD_NAME
81 )
82 const activeMembers = nonLeadMembers.filter(
83 m => m.isActive !== false
84 )
85 if (activeMembers.length > 0) {
86 throw new Error(
87 `Cannot cleanup team with ${activeMembers.length} active member(s).`
88 )
89 }
90 }
91 await cleanupTeamDirectories(teamName)
92 unregisterTeamForSessionCleanup(teamName)
93 clearTeammateColors()
94 clearLeaderTeamName()
95 }
96
97 // Clear team context and inbox from AppState
98 setAppState(prev => ({
99 ...prev,
100 teamContext: undefined,
101 inbox: { messages: [] },
102 }))
103}

An important safety check: you can't delete a Team while it still has active members. All teammates must first be gracefully terminated via the SendMessage shutdown_request protocol.

Team Workflow

The complete Team workflow as described in the system prompt:

...

Fork Sub-Agent: Context Inheritance

Beyond Coordinator/Worker and Team swarm, there's a third multi-agent pattern — Fork. A Fork sub-agent inherits the parent agent's complete conversation context (including the system prompt and all history messages), making it suitable for tasks that don't need intermediate tool outputs retained in the parent context.

src/tools/AgentTool/forkSubagent.ts:32-39
TypeScript
32export function isForkSubagentEnabled(): boolean {
33 if (feature('FORK_SUBAGENT')) {
34 if (isCoordinatorMode()) return false // Mutually exclusive with Coordinator mode
35 if (getIsNonInteractiveSession()) return false // Not supported in non-interactive sessions
36 return true
37 }
38 return false
39}

Fork and Coordinator mode are mutually exclusive — because the Coordinator already has its own orchestration model. Fork's advantages include:

  1. Cache-friendly: Fork sub-agents use the parent agent's exact system prompt and tool set (useExactTools: true), so the API request prefix is identical to the parent's, enabling prompt cache reuse.
  2. Context inheritance: No need to re-explain background in the prompt — the sub-agent already "knows" everything.
  3. Imperative prompts: Since context is inherited, the prompt only needs to be a "what to do" instruction, not a complete "here's the situation + what to do" description.

Agent Persistent Memory

Worker agents can have persistent memory that saves learned knowledge across sessions. The memory system supports three scopes:

src/tools/AgentTool/agentMemory.ts:13
TypeScript
13export type AgentMemoryScope = 'user' | 'project' | 'local'

Each scope has a different storage location:

src/tools/AgentTool/agentMemory.ts:52-65
TypeScript
52export function getAgentMemoryDir(
53 agentType: string, scope: AgentMemoryScope,
54): string {
55 const dirName = sanitizeAgentTypeForPath(agentType)
56 switch (scope) {
57 case 'project':
58 return join(getCwd(), '.claude', 'agent-memory', dirName) + sep
59 case 'local':
60 return getLocalAgentMemoryDir(dirName)
61 case 'user':
62 return join(getMemoryBaseDir(), 'agent-memory', dirName) + sep
63 }
64}
  • user: ~/.claude/agent-memory/{agentType}/ — cross-project general knowledge
  • project: .claude/agent-memory/{agentType}/ — project-specific knowledge (shareable via version control)
  • local: .claude/agent-memory-local/{agentType}/ — machine-specific knowledge (not version-controlled)

The memory entry file is always MEMORY.md:

src/tools/AgentTool/agentMemory.ts:109-114
TypeScript
109export function getAgentMemoryEntrypoint(
110 agentType: string, scope: AgentMemoryScope,
111): string {
112 return join(getAgentMemoryDir(agentType, scope), 'MEMORY.md')
113}

Transferable Patterns: Architecture Essentials for Building Multi-Agent Systems

From Claude Code's multi-agent implementation, we can distill several general-purpose architectural patterns applicable to building any multi-agent system.

Pattern 1: Role Separation and Tool Constraints

The Coordinator has only management tools; Workers have only execution tools. This hard separation prevents role confusion — the Coordinator won't be "tempted" to modify files directly, and Workers won't try to orchestrate other Workers.

The key to implementing this separation is tool set filtering: determine which tools an agent can use at spawn time, rather than relying on instructions in the system prompt. An LLM might not follow a "don't use tool X" instruction, but if the tool simply isn't in the available list, it physically cannot use it.

Pattern 2: Async Notifications Rather Than Synchronous Waiting

Worker results are returned via async notifications (<task-notification>) rather than blocking the Coordinator to wait. This allows the Coordinator to orchestrate multiple Workers simultaneously.

The notification format uses XML instead of JSON because XML tags are easier for LLMs to recognize during streaming processing. The <task-notification> opening tag provides a deterministic signal, preventing the LLM from confusing Worker results with user messages.

Pattern 3: Shared Lock-Free State

The Scratchpad directory provides cross-Worker state sharing without any locking mechanism. This works well in practice because the Coordinator typically ensures that Workers reading and writing to the same area don't run simultaneously.

This design is far simpler and less error-prone than introducing file locks — deadlocks are especially dangerous in multi-agent systems because LLMs don't have the ability to "detect and recover from deadlocks."

Pattern 4: Mailbox Communication Model

The mailbox communication model in Team mode — where the sender writes to the recipient's mailbox file — is a classic asynchronous messaging pattern. It completely decouples the execution timing of sender and receiver, naturally supporting offline messaging.

Pattern 5: Flat Membership Structure

The Team's member list is flat — only the lead can add members; teammates cannot spawn teammates. This prevents uncontrolled growth of organizational structure, simplifies the communication topology (degenerating from an arbitrary graph to a star), and reduces system complexity.

Conclusion

Claude Code's multi-agent system demonstrates a pragmatic approach to distributed AI system design. It doesn't pursue theoretical perfection — no distributed transactions, no consensus algorithms, no formal verification — but instead solves real problems with simple mechanisms:

  • Role separation is enforced through tool set filtering, not just prompt instructions
  • Task notifications use XML format injected as user-role messages, letting the LLM process them naturally
  • State sharing works through a Scratchpad directory in the filesystem — no locks, no protocols
  • Lifecycle management uses structured message protocols (shutdown_request/response) for graceful shutdown
  • MCP inheritance uses a merge-plus-independent-cleanup approach, letting child agents incrementally extend parent agent capabilities

The common thread across these design choices is that they all find the balance between "good enough" and "over-engineering." In the rapidly evolving field of AI agent systems, this pragmatic engineering philosophy may be more valuable than pursuing perfect architecture.