claude-harness — Deconstructing Claude Code

The Problem

What do you do when a single AI agent isn't enough? The intuitive answer is: spawn more agents. But the problem is far from that simple. How do multiple agents divide work? How do they communicate? How do they share context without conflicts? When a worker finishes a task, how does it report results to the coordinator? If a worker heads in the wrong direction, how do you terminate it promptly?

At the heart of these questions lies a classic distributed systems design challenge — except here the "nodes" aren't servers but LLM instances. Claude Code answers these questions with an elegant multi-agent architecture whose core pattern is called Coordinator/Worker.

This article dives deep into Claude Code's multi-agent system, starting from the overall architecture of the Coordinator pattern and analyzing layer by layer: AgentTool's agent spawning mechanism, the Worker's restricted tool set, the XML protocol for task notifications, SendMessageTool's cross-agent communication, Scratchpad directory persistent state sharing, background execution and progress tracking, Team swarm mode, and MCP Server inheritance and isolation across agents.

Overall Architecture of the Coordinator Pattern

Claude Code's multi-agent system is built on a clear role separation model: the Coordinator is responsible for understanding user intent, decomposing tasks, and synthesizing results; Workers are responsible for executing concrete work (research, implementation, verification). The Coordinator doesn't directly operate on files or run commands — it only has a minimal set of "management tools."

The Design Philosophy of Role Separation

The philosophy behind this design is simple: let the Coordinator focus on "thinking" and Workers focus on "doing." The Coordinator's tool set is strictly limited to four:

src/constants/tools.ts:107-112
TypeScript
export const COORDINATOR_MODE_ALLOWED_TOOLS = new Set([
  AGENT_TOOL_NAME,        // 'Agent': spawn new workers
  TASK_STOP_TOOL_NAME,    // 'TaskStop': stop running workers
  SEND_MESSAGE_TOOL_NAME, // 'SendMessage': send messages to existing workers
  SYNTHETIC_OUTPUT_TOOL_NAME, // 'SyntheticOutput': internal output tool
])

The Coordinator cannot read or write files, execute shell commands, or search code. Setting aside the internal SyntheticOutput tool, it can do only three things: spawn workers, stop workers, and send messages to workers. This extreme constraint forces the Coordinator to be a pure "commander."

Enabling and Detecting Coordinator Mode

Coordinator mode is controlled via the CLAUDE_CODE_COORDINATOR_MODE environment variable:

src/coordinator/coordinatorMode.ts:36-41
TypeScript
export function isCoordinatorMode(): boolean {
  if (feature('COORDINATOR_MODE')) {
    return isEnvTruthy(process.env.CLAUDE_CODE_COORDINATOR_MODE)
  }
  return false
}

There's a double gate here: first, the COORDINATOR_MODE compile-time feature flag must be enabled (using Bun's feature() macro for dead code elimination), then the environment variable must be truthy. This ensures that in builds that don't support Coordinator mode, all related code is completely stripped out.

When resuming a session that was previously run in Coordinator mode, the system automatically matches the mode:

src/coordinator/coordinatorMode.ts:49-78
TypeScript
export function matchSessionMode(
  sessionMode: 'coordinator' | 'normal' | undefined,
): string | undefined {
  if (!sessionMode) {
    return undefined
  }
  const currentIsCoordinator = isCoordinatorMode()
  const sessionIsCoordinator = sessionMode === 'coordinator'

  if (currentIsCoordinator === sessionIsCoordinator) {
    return undefined
  }

  // Flip the env var: isCoordinatorMode() reads it live, no caching
  if (sessionIsCoordinator) {
    process.env.CLAUDE_CODE_COORDINATOR_MODE = '1'
  } else {
    delete process.env.CLAUDE_CODE_COORDINATOR_MODE
  }

  return sessionIsCoordinator
    ? 'Entered coordinator mode to match resumed session.'
    : 'Exited coordinator mode to match resumed session.'
}

This code demonstrates an interesting design choice: mode state is stored in process.env rather than in a state object, because isCoordinatorMode() is called extensively, and reading the environment variable directly avoids introducing any additional state management layer.

Coordinator's System Prompt and Task Workflow

The Coordinator's system prompt defines a complete task workflow with four phases:

...

The concurrency rules defined in the system prompt are highly pragmatic:

src/coordinator/coordinatorMode.ts:213-219
TypeScript
// Manage concurrency:
// - **Read-only tasks** (research) — run in parallel freely
// - **Write-heavy tasks** (implementation) — one at a time per set of files
// - **Verification** can sometimes run alongside implementation on different file areas

This isn't enforced through code — it relies on the LLM itself understanding and following these rules. This is a notable architectural decision: between "hardcoded concurrency control" and "trusting the LLM's judgment," Claude Code chose the latter, because file conflict scenarios are too varied and complex for hardcoded rules to cover all cases.
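
For contrast, a hardcoded version of these three rules might look like the sketch below. Every name in it (PlannedTask, conflicts, selectParallelBatch) is hypothetical and nothing like it appears in Claude Code. It also assumes each task can declare its files up front, which is the part that rarely holds in practice.

```typescript
// Hypothetical sketch: none of these names exist in Claude Code.
type TaskKind = "research" | "implementation" | "verification";

interface PlannedTask {
  id: string;
  kind: TaskKind;
  files: string[]; // files the task is expected to touch, declared up front
}

// Research is read-only and never conflicts. Otherwise two tasks conflict
// when at least one of them is an implementation task and their files overlap.
function conflicts(a: PlannedTask, b: PlannedTask): boolean {
  if (a.kind === "research" || b.kind === "research") return false;
  if (a.kind !== "implementation" && b.kind !== "implementation") return false;
  const touched = new Set(a.files);
  return b.files.some((file) => touched.has(file));
}

// Greedily admit tasks that do not conflict with anything already admitted.
function selectParallelBatch(tasks: PlannedTask[]): PlannedTask[] {
  const batch: PlannedTask[] = [];
  for (const task of tasks) {
    if (batch.every((admitted) => !conflicts(admitted, task))) {
      batch.push(task);
    }
  }
  return batch;
}
```

In this sketch, two implementation tasks that both touch the same file are serialized, while research and verification on other files join the same batch. A task that discovers its files only while running has no place in such a scheme, which is consistent with leaving the decision to the LLM.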

AgentTool's Agent Spawning Mechanism

AgentTool is the entry point for the entire multi-agent system — all worker creation goes through it. Its implementation is in src/tools/AgentTool/AgentTool.tsx and is one of the most complex tools in the entire codebase (over 4,000 lines).

Input Schema and Mode Routing

AgentTool's input schema is dynamically composed based on compile-time feature flags:

src/tools/AgentTool/AgentTool.tsx:82-101
TypeScript
const baseInputSchema = lazySchema(() => z.object({
  description: z.string().describe('A short (3-5 word) description of the task'),
  prompt: z.string().describe('The task for the agent to perform'),
  subagent_type: z.string().optional()
    .describe('The type of specialized agent to use for this task'),
  model: z.enum(['sonnet', 'opus', 'haiku']).optional()
    .describe("Optional model override for this agent."),
  run_in_background: z.boolean().optional()
    .describe('Set to true to run this agent in the background.')
}));

const fullInputSchema = lazySchema(() => {
  const multiAgentInputSchema = z.object({
    name: z.string().optional()
      .describe('Name for the spawned agent.'),
    team_name: z.string().optional()
      .describe('Team name for spawning.'),
    mode: permissionModeSchema().optional()
      .describe('Permission mode for spawned teammate.'),
  });
  return baseInputSchema().merge(multiAgentInputSchema).extend({
    isolation: z.enum(['worktree']).optional()
      .describe('Isolation mode. "worktree" creates a temporary git worktree.'),
    cwd: z.string().optional()
      .describe('Absolute path to run the agent in.')
  });
});

lazySchema() wrapping is used here to ensure Zod schemas are only instantiated on first use. Note that the isolation field's enum values differ based on build type: internal builds support 'worktree' | 'remote', while external builds, like the excerpt above, support only 'worktree'.

Agent Type Resolution and Routing

When AgentTool.call() is invoked, it first needs to resolve the target agent's type. There are three paths:

...

The key routing logic is as follows:

src/tools/AgentTool/AgentTool.tsx:322-356
TypeScript
const effectiveType = subagent_type
  ?? (isForkSubagentEnabled() ? undefined : GENERAL_PURPOSE_AGENT.agentType);
const isForkPath = effectiveType === undefined;

let selectedAgent: AgentDefinition;
if (isForkPath) {
  // Recursive fork guard: fork children cannot fork again
  if (toolUseContext.options.querySource ===
      `agent:builtin:${FORK_AGENT.agentType}`
      || isInForkChild(toolUseContext.messages)) {
    throw new Error(
      'Fork is not available inside a forked worker.'
    );
  }
  selectedAgent = FORK_AGENT;
} else {
  const allAgents = toolUseContext.options.agentDefinitions.activeAgents;
  const agents = filterDeniedAgents(allAgents,
    appState.toolPermissionContext, AGENT_TOOL_NAME);
  const found = agents.find(agent => agent.agentType === effectiveType);
  if (!found) {
    // Distinguish between "doesn't exist" and "denied by permission rule"
    const agentExistsButDenied = allAgents.find(
      agent => agent.agentType === effectiveType
    );
    if (agentExistsButDenied) {
      const denyRule = getDenyRuleForAgent(
        appState.toolPermissionContext, AGENT_TOOL_NAME, effectiveType
      );
      throw new Error(
        `Agent type '${effectiveType}' has been denied by permission rule.`
      );
    }
    throw new Error(`Agent type '${effectiveType}' not found.`);
  }
  selectedAgent = found;
}

Several important design details here:

Recursive fork guard: Uses dual detection, combining querySource (set at spawn time, so it survives later compression of the message history) with message scanning (the fallback path), ensuring fork children don't recurse infinitely.
Permission filtering: Agent types can be denied by permission rules (configured via Agent(AgentName) syntax in settings), and error messages distinguish between "doesn't exist" and "denied."
Model override: In Coordinator mode, the model parameter is forcibly ignored (set to undefined), because workers need the default high-capability model to complete substantive tasks.
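
The routing decisions above can be condensed into a small standalone sketch. The types here are simplified stand-ins, the "fork" type name is a placeholder, and deniedTypes is an assumed substitute for the real permission rules, so treat this as an illustration of the control flow rather than the actual implementation.

```typescript
// Simplified stand-ins for illustration; the real types carry far more state.
interface AgentDef {
  agentType: string;
}

const GENERAL_PURPOSE: AgentDef = { agentType: "general-purpose" };
const FORK: AgentDef = { agentType: "fork" }; // placeholder type name (assumption)

interface ResolveInput {
  subagentType?: string;
  forkEnabled: boolean;
  insideForkChild: boolean; // outcome of the dual fork-guard detection
  activeAgents: AgentDef[];
  deniedTypes: Set<string>; // assumed stand-in for the permission rules
}

function resolveAgent(input: ResolveInput): AgentDef {
  // No explicit type: fork when the fork feature is on, else general-purpose.
  const effectiveType =
    input.subagentType ??
    (input.forkEnabled ? undefined : GENERAL_PURPOSE.agentType);

  if (effectiveType === undefined) {
    if (input.insideForkChild) {
      throw new Error("Fork is not available inside a forked worker.");
    }
    return FORK;
  }

  const existing = input.activeAgents.find((a) => a.agentType === effectiveType);
  if (!existing) {
    throw new Error(`Agent type '${effectiveType}' not found.`);
  }
  if (input.deniedTypes.has(effectiveType)) {
    throw new Error(`Agent type '${effectiveType}' has been denied by permission rule.`);
  }
  return existing;
}
```

The two error messages stay distinct, so a caller can tell a typo in the agent type apart from a permission rule that blocks it.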

Worker Tool Set Assembly

The Worker's tool set isn't simply inherited from the Coordinator — it's assembled independently:

src/tools/AgentTool/AgentTool.tsx:573-577
TypeScript
const workerPermissionContext = {
  ...appState.toolPermissionContext,
  mode: selectedAgent.permissionMode ?? 'acceptEdits'
};
const workerTools = assembleToolPool(
  workerPermissionContext, appState.mcp.tools
);

Workers get their own permission mode (defaulting to 'acceptEdits'), then independently assemble available tools from the global tool pool. This means the Worker's tool set is completely independent of the Coordinator's restrictions — even though the Coordinator only has 4 management tools, Workers can still use the full range of file operations, code search, and other tools.

Worktree Isolation

When isolation: 'worktree' is specified, AgentTool creates a temporary git worktree, letting the Worker operate on an isolated copy of the codebase:

src/tools/AgentTool/AgentTool.tsx:590-593
TypeScript
if (effectiveIsolation === 'worktree') {
  const slug = `agent-${earlyAgentId.slice(0, 8)}`;
  worktreeInfo = await createAgentWorktree(slug);
}

Worktree isolation provides two important benefits: Workers can freely modify code without affecting the main workspace; and multiple Workers can modify code in parallel across different worktrees. When a Worker finishes, if no changes were made in the worktree, it's automatically cleaned up; if changes exist, the worktree path and branch name are returned to the Coordinator.

The Worker's Restricted Tool Set

As asynchronously running sub-agents, Workers have a carefully designed restricted tool set. These restrictions are defined in the ASYNC_AGENT_ALLOWED_TOOLS set:

src/constants/tools.ts:55-71
TypeScript
export const ASYNC_AGENT_ALLOWED_TOOLS = new Set([
  FILE_READ_TOOL_NAME,      // Read files
  WEB_SEARCH_TOOL_NAME,     // Web search
  TODO_WRITE_TOOL_NAME,     // Write todos
  GREP_TOOL_NAME,           // Content search
  WEB_FETCH_TOOL_NAME,      // Fetch web pages
  GLOB_TOOL_NAME,           // File pattern matching
  ...SHELL_TOOL_NAMES,      // Bash / PowerShell
  FILE_EDIT_TOOL_NAME,      // Edit files
  FILE_WRITE_TOOL_NAME,     // Write files
  NOTEBOOK_EDIT_TOOL_NAME,  // Edit notebooks
  SKILL_TOOL_NAME,          // Skill invocation
  SYNTHETIC_OUTPUT_TOOL_NAME,
  TOOL_SEARCH_TOOL_NAME,    // Tool search
  ENTER_WORKTREE_TOOL_NAME, // Enter worktree
  EXIT_WORKTREE_TOOL_NAME,  // Exit worktree
])

Explicitly excluded tools include:

src/constants/tools.ts:36-46
TypeScript
export const ALL_AGENT_DISALLOWED_TOOLS = new Set([
  TASK_OUTPUT_TOOL_NAME,      // Prevent recursion
  EXIT_PLAN_MODE_V2_TOOL_NAME, // Plan mode is a main thread abstraction
  ENTER_PLAN_MODE_TOOL_NAME,
  AGENT_TOOL_NAME,            // Prevent agent recursive spawning (except ant users)
  ASK_USER_QUESTION_TOOL_NAME, // Async workers cannot ask users questions
  TASK_STOP_TOOL_NAME,        // Requires main thread task state
  WORKFLOW_TOOL_NAME,         // Prevent workflow recursion
])

The actual tool filtering logic is implemented in filterToolsForAgent:

src/tools/AgentTool/agentToolUtils.ts:70-116
TypeScript
export function filterToolsForAgent({
  tools, isBuiltIn, isAsync = false, permissionMode,
}: { tools: Tools; isBuiltIn: boolean; isAsync?: boolean;
     permissionMode?: PermissionMode }): Tools {
  return tools.filter(tool => {
    // MCP tools are always allowed
    if (tool.name.startsWith('mcp__')) {
      return true
    }
    // Allow ExitPlanMode in plan mode
    if (toolMatchesName(tool, EXIT_PLAN_MODE_V2_TOOL_NAME)
        && permissionMode === 'plan') {
      return true
    }
    // Global disallow list
    if (ALL_AGENT_DISALLOWED_TOOLS.has(tool.name)) {
      return false
    }
    // Additional disallow list for custom agents
    if (!isBuiltIn && CUSTOM_AGENT_DISALLOWED_TOOLS.has(tool.name)) {
      return false
    }
    // Async agent allowlist filtering
    if (isAsync && !ASYNC_AGENT_ALLOWED_TOOLS.has(tool.name)) {
      // Special case: in-process teammates can use AgentTool and task tools
      if (isAgentSwarmsEnabled() && isInProcessTeammate()) {
        if (toolMatchesName(tool, AGENT_TOOL_NAME)) {
          return true
        }
        if (IN_PROCESS_TEAMMATE_ALLOWED_TOOLS.has(tool.name)) {
          return true
        }
      }
      return false
    }
    return true
  })
}

There's a particularly interesting layered design here. For in-process teammates (members in Team mode), an additional set of task management tools is allowed:

src/constants/tools.ts:77-88
TypeScript
export const IN_PROCESS_TEAMMATE_ALLOWED_TOOLS = new Set([
  TASK_CREATE_TOOL_NAME,
  TASK_GET_TOOL_NAME,
  TASK_LIST_TOOL_NAME,
  TASK_UPDATE_TOOL_NAME,
  SEND_MESSAGE_TOOL_NAME,
])

This enables teammates within a Team to create tasks, update task status, and send messages to other teammates — all capabilities essential for swarm collaboration.
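
To make the layer order concrete, here is a reduced sketch of the same decision sequence operating on plain tool names. The sets are tiny placeholders (names such as "Read" or "TaskCreate" are illustrative guesses, not confirmed tool names), the custom-agent list is left empty because the article does not show its contents, and the plan-mode and AgentTool special cases are omitted.

```typescript
interface FilterOptions {
  isBuiltIn: boolean;
  isAsync: boolean;
  isInProcessTeammate: boolean;
}

// Placeholder sets for illustration; the real lists are the ones quoted above.
const DISALLOWED_FOR_ALL = new Set(["Agent", "AskUserQuestion", "TaskStop"]);
const DISALLOWED_FOR_CUSTOM = new Set<string>(); // contents not shown in the article
const ASYNC_ALLOWED = new Set(["Read", "Grep", "Edit", "Bash"]);
const TEAMMATE_EXTRA = new Set(["TaskCreate", "TaskUpdate", "SendMessage"]);

function isToolAllowed(name: string, opts: FilterOptions): boolean {
  // Layer 1: MCP tools bypass every other check.
  if (name.startsWith("mcp__")) return true;
  // Layer 2: the global disallow list applies to every agent.
  if (DISALLOWED_FOR_ALL.has(name)) return false;
  // Layer 3: custom (non-built-in) agents get an extra disallow list.
  if (!opts.isBuiltIn && DISALLOWED_FOR_CUSTOM.has(name)) return false;
  // Layer 4: async agents are held to an allowlist, which in-process
  // teammates extend with the task and messaging tools.
  if (opts.isAsync && !ASYNC_ALLOWED.has(name)) {
    return opts.isInProcessTeammate && TEAMMATE_EXTRA.has(name);
  }
  return true;
}
```

The order matters: because the MCP check runs first, an MCP tool is never subject to the disallow lists or the async allowlist.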

Task Notification Mechanism

When a Worker completes a task, it doesn't return results via a function call. Instead, it sends results to the Coordinator through a carefully designed XML-format notification injected as a user-role message into the Coordinator's conversation.

Notification Format

XML
<task-notification>
  <task-id>{agentId}</task-id>
  <status>completed|failed|killed</status>
  <summary>{human-readable status summary}</summary>
  <result>{agent's final text response}</result>
  <usage>
    <total_tokens>N</total_tokens>
    <tool_uses>N</tool_uses>
    <duration_ms>N</duration_ms>
  </usage>
</task-notification>

Why XML instead of JSON? Because the <task-notification> opening tag provides a clear, easily recognizable signal for the LLM — the Coordinator's system prompt explicitly states "distinguish notifications from user messages by the <task-notification> opening tag." XML's tag structure is easier for LLMs to recognize and parse during streaming generation than JSON's curly braces.
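
As a rough illustration of both halves of this protocol, the sketch below renders a notification in the layout shown above and detects one by its opening tag. The function and type names are hypothetical, and the sketch does not escape the result text because the article does not say whether the real implementation does.

```typescript
type TaskStatus = "completed" | "failed" | "killed";

interface TaskNotification {
  taskId: string;
  status: TaskStatus;
  summary: string;
  result: string;
  usage: { totalTokens: number; toolUses: number; durationMs: number };
}

// Render the notification in the layout shown above.
// Assumption: the result text is inserted without escaping.
function renderTaskNotification(n: TaskNotification): string {
  return [
    "<task-notification>",
    `  <task-id>${n.taskId}</task-id>`,
    `  <status>${n.status}</status>`,
    `  <summary>${n.summary}</summary>`,
    `  <result>${n.result}</result>`,
    "  <usage>",
    `    <total_tokens>${n.usage.totalTokens}</total_tokens>`,
    `    <tool_uses>${n.usage.toolUses}</tool_uses>`,
    `    <duration_ms>${n.usage.durationMs}</duration_ms>`,
    "  </usage>",
    "</task-notification>",
  ].join("\n");
}

// The opening tag is what separates a notification from real user input.
function isTaskNotification(userMessageText: string): boolean {
  return userMessageText.trimStart().startsWith("<task-notification>");
}
```

Because both kinds of input arrive with the user role, the prefix check is the only signal that tells a worker report apart from something the user typed.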

Notification Injection Path

Notifications are injected into the Coordinator's message stream via enqueueAgentNotification. The entire async agent lifecycle management is in the runAsyncAgentLifecycle function:

sequenceDiagram
    participant Coord as Coordinator
    participant AT as AgentTool
    participant Worker as Worker Agent
    participant Task as Task Registry

    Coord->>AT: Agent({ prompt: "...", run_in_background: true })
    AT->>Task: registerAsyncAgent(agentId)
    AT->>Worker: runAgent() (launch in background)
    AT-->>Coord: { status: "async_launched", agentId: "..." }

    Note over Worker: Worker executes task...
    Worker->>Worker: Uses tools (Read/Edit/Bash...)

    alt Completed successfully
        Worker->>Task: completeAsyncAgent(result)
        Task->>Coord: enqueueAgentNotification(XML)
    else Execution failed
        Worker->>Task: failAsyncAgent(error)
        Task->>Coord: enqueueAgentNotification(XML)
    else Killed
        Coord->>Task: killAsyncAgent(agentId)
        Task->>Coord: enqueueAgentNotification(XML)
    end

While Workers run in the background, the Coordinator can continue interacting with the user or launch more Workers. Notifications arrive as user-role messages, and the Coordinator processes them at the start of its next turn.

One-Shot Optimization

For certain built-in agents (like Explore and Plan) that only run once and won't be continued by the Coordinator via SendMessage, the notification omits the agentId and SendMessage usage instructions to save tokens:

src/tools/AgentTool/constants.ts:9-12
TypeScript
export const ONE_SHOT_BUILTIN_AGENT_TYPES: ReadonlySet<string> = new Set([
  'Explore',
  'Plan',
])

Comments note that this optimization "saves ~135 chars x 34M Explore runs/week" — at scale, every token matters.

SendMessageTool: Cross-Agent Communication

SendMessageTool is the core tool for inter-agent communication. It's used not only for the Coordinator to send follow-up instructions to Workers but also for direct communication between teammates in Team mode.

Message Routing Logic

SendMessageTool's message routing is highly refined, handling multiple target types:

src/tools/SendMessageTool/SendMessageTool.ts:67-87
TypeScript
const inputSchema = lazySchema(() =>
  z.object({
    to: z.string().describe(
      'Recipient: teammate name, or "*" for broadcast to all teammates'
    ),
    summary: z.string().optional().describe(
      'A 5-10 word summary shown as a preview in the UI'
    ),
    message: z.union([
      z.string().describe('Plain text message content'),
      StructuredMessage(),
    ]),
  }),
)

The to field can be:

A teammate name (e.g., "researcher") — sends to a specific teammate
"*" — broadcasts to all teammates
"uds:/path/to.sock" — cross-process communication (via Unix Domain Socket)
"bridge:session_..." — cross-machine communication (via Remote Control)

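The four address forms can be told apart with simple prefix checks. The sketch below is one plausible way to classify the to field; the Recipient type and parseRecipient function are hypothetical names, not the tool's actual routing code.

```typescript
// Hypothetical classification of the `to` field into the four forms above.
type Recipient =
  | { kind: "broadcast" }
  | { kind: "uds"; socketPath: string }
  | { kind: "bridge"; sessionId: string }
  | { kind: "teammate"; name: string };

function parseRecipient(to: string): Recipient {
  if (to === "*") return { kind: "broadcast" };
  if (to.startsWith("uds:")) {
    return { kind: "uds", socketPath: to.slice("uds:".length) };
  }
  if (to.startsWith("bridge:")) {
    return { kind: "bridge", sessionId: to.slice("bridge:".length) };
  }
  // Anything else is treated as a teammate name.
  return { kind: "teammate", name: to };
}
```

Modeling the result as a discriminated union lets each delivery path read only the fields that exist for its own address form.
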
Sending Messages to Existing Workers

When the Coordinator uses SendMessage to send a message to a completed or running Worker, the handling logic has three branches:

src/tools/SendMessageTool/SendMessageTool.ts:802-873
TypeScript
if (typeof input.message === 'string' && input.to !== '*') {
  const appState = context.getAppState()
  const registered = appState.agentNameRegistry.get(input.to)
  const agentId = registered ?? toAgentId(input.to)

  if (agentId) {
    const task = appState.tasks[agentId]
    if (isLocalAgentTask(task) && !isMainSessionTask(task)) {
      if (task.status === 'running') {
        // Worker still running: queue message for delivery on next tool turn
        queuePendingMessage(agentId, input.message, ...)
        return { data: {
          success: true,
          message: `Message queued for delivery to ${input.to}.`
        }}
      }
      // Worker has stopped: auto-resume
      const result = await resumeAgentBackground({
        agentId, prompt: input.message, ...
      })
      return { data: {
        success: true,
        message: `Agent "${input.to}" was stopped; resumed it in background.`
      }}
    }
  }
}

Two important behaviors here:

Running Worker: The message is queued (queuePendingMessage) and delivered during the Worker's next tool call turn. This avoids interrupting the Worker's current work.
Stopped Worker: The Worker is automatically resumed (resumeAgentBackground), loading its previous conversation context from the on-disk transcript, then continuing execution with the new message as a continuation prompt. This allows the Coordinator to repeatedly leverage an existing Worker's accumulated context.
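
The queue-or-resume decision can be modeled with a small in-memory sketch. WorkerRecord and deliver are hypothetical names, and the transcript array is an assumed stand-in for the on-disk transcript; the sketch shows only the branching, not the real resume machinery.

```typescript
type WorkerStatus = "running" | "stopped";

interface WorkerRecord {
  status: WorkerStatus;
  transcript: string[];      // assumed stand-in for the on-disk transcript
  pendingMessages: string[]; // drained at the worker's next tool turn
}

function deliver(
  workers: Map<string, WorkerRecord>,
  agentId: string,
  message: string,
): string {
  const worker = workers.get(agentId);
  if (!worker) return `No worker registered as ${agentId}.`;

  if (worker.status === "running") {
    // Do not interrupt: the message waits until the next tool turn.
    worker.pendingMessages.push(message);
    return `Message queued for delivery to ${agentId}.`;
  }

  // Stopped: keep the old context and continue with the new message.
  worker.transcript.push(message);
  worker.status = "running";
  return `Agent "${agentId}" was stopped; resumed it in background.`;
}
```

In the stopped branch the earlier transcript entries are kept and the new message is appended, which mirrors how a resumed Worker retains the context it built up before stopping.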

Structured Message Protocol

Beyond plain text messages, SendMessageTool also supports structured messages for coordination operations in Team mode:

src/tools/SendMessageTool/SendMessageTool.ts:46-65
TypeScript
const StructuredMessage = lazySchema(() =>
  z.discriminatedUnion('type', [
    z.object({
      type: z.literal('shutdown_request'),
      reason: z.string().optional(),
    }),
    z.object({
      type: z.literal('shutdown_response'),
      request_id: z.string(),
      approve: semanticBoolean(),
      reason: z.string().optional(),
    }),
    z.object({
      type: z.literal('plan_approval_response'),
      request_id: z.string(),
      approve: semanticBoolean(),
      feedback: z.string().optional(),
    }),
  ]),
)

These structured messages, together with broadcast addressing, cover three coordination mechanisms:

Shutdown protocol: The Team lead sends a shutdown_request to a teammate, who replies with a shutdown_response (approve or deny). An approved shutdown triggers the teammate process's gracefulShutdown.
Plan approval protocol: In plan permission mode, teammates need the Team lead's approval before executing implementation.
Broadcast: to: "*" broadcasts the message to all teammates, iterating through all members in the team file (excluding the sender).
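
On the receiving side, a discriminated union like this is typically handled with a switch on the type field. The sketch below mirrors the three message shapes as plain TypeScript types, modeling approve as a boolean (the real schema uses semanticBoolean(), whose exact behavior is not shown here); describeStructuredMessage is a hypothetical helper.

```typescript
// Plain-type mirror of the three shapes; `approve` is simplified to boolean.
type StructuredMessage =
  | { type: "shutdown_request"; reason?: string }
  | { type: "shutdown_response"; request_id: string; approve: boolean; reason?: string }
  | { type: "plan_approval_response"; request_id: string; approve: boolean; feedback?: string };

function describeStructuredMessage(msg: StructuredMessage): string {
  switch (msg.type) {
    case "shutdown_request":
      return msg.reason ? `shutdown requested: ${msg.reason}` : "shutdown requested";
    case "shutdown_response":
      return msg.approve
        ? `shutdown ${msg.request_id} approved`
        : `shutdown ${msg.request_id} denied`;
    case "plan_approval_response":
      return msg.approve
        ? `plan ${msg.request_id} approved`
        : `plan ${msg.request_id} rejected`;
    default: {
      // Compile-time exhaustiveness check: a new message type fails here.
      const unreachable: never = msg;
      return unreachable;
    }
  }
}
```

The request_id on both response types is what lets a reply be matched to the request it answers.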

Mailbox Communication Model

Message passing in Team mode is based on a mailbox model — messages are written to the recipient's mailbox file rather than pushed directly:

src/tools/SendMessageTool/SendMessageTool.ts:161-170
TypeScript
await writeToMailbox(
  recipientName,
  {
    from: senderName,
    text: content,
    summary,
    timestamp: new Date().toISOString(),
    color: senderColor,
  },
  teamName,
)

The benefit of this design is complete decoupling of sender and receiver — the sender doesn't need to wait for the receiver to be online; messages are delivered the next time the receiver polls its mailbox.
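
The decoupling is easy to see in a toy version. The sketch below keeps mailboxes in memory rather than in files, and the class and method names are hypothetical; it is meant only to show the write-now, read-on-poll shape of the model.

```typescript
interface MailboxMessage {
  from: string;
  text: string;
  summary?: string;
  timestamp: string;
}

// In-memory stand-in for per-recipient mailbox files (an assumption made
// for brevity; the real mailboxes are files).
class InMemoryMailboxes {
  private boxes = new Map<string, MailboxMessage[]>();

  // The sender returns immediately; nothing requires the recipient to be active.
  write(recipient: string, message: MailboxMessage): void {
    const box = this.boxes.get(recipient) ?? [];
    box.push(message);
    this.boxes.set(recipient, box);
  }

  // Called by the recipient when it polls; returns and clears pending mail.
  drain(recipient: string): MailboxMessage[] {
    const box = this.boxes.get(recipient) ?? [];
    this.boxes.delete(recipient);
    return box;
  }
}
```

A sender can write several messages before the recipient ever polls, and the recipient then receives them in the order they were written.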

Scratchpad Directory: Persistent State Sharing Across Workers

How do multiple Workers share information? Claude Code provides a mechanism called the Scratchpad — a session-level temporary directory that all Workers can freely read from and write to without permission prompts.

Scratchpad Location and Permissions

src/utils/permissions/filesystem.ts:384-386
TypeScript
export function getScratchpadDir(): string {
  return join(getProjectTempDir(), getSessionId(), 'scratchpad')
}

The path format is /tmp/claude-{uid}/{sanitized-cwd}/{sessionId}/scratchpad/. The directory is created with 0o700 permissions (owner-only access) to ensure security.

How the Coordinator Informs Workers About the Scratchpad

Scratchpad directory information is injected into the Coordinator's context via user context:

src/coordinator/coordinatorMode.ts:80-108
TypeScript
export function getCoordinatorUserContext(
  mcpClients: ReadonlyArray<{ name: string }>,
  scratchpadDir?: string,
): { [k: string]: string } {
  if (!isCoordinatorMode()) {
    return {}
  }

  let content = `Workers spawned via the ${AGENT_TOOL_NAME} tool have ` +
    `access to these tools: ${workerTools}`

  if (mcpClients.length > 0) {
    const serverNames = mcpClients.map(c => c.name).join(', ')
    content += `\n\nWorkers also have access to MCP tools from ` +
      `connected MCP servers: ${serverNames}`
  }

  if (scratchpadDir && isScratchpadGateEnabled()) {
    content += `\n\nScratchpad directory: ${scratchpadDir}\n` +
      `Workers can read and write here without permission prompts. ` +
      `Use this for durable cross-worker knowledge — ` +
      `structure files however fits the work.`
  }

  return { workerToolsContext: content }
}

Note the key phrase in the prompt: "structure files however fits the work" — the system doesn't prescribe file structure within the Scratchpad, letting the Coordinator and Workers organize it as the task demands. This flexibility is intentional.

Security: Path Traversal Protection

Scratchpad path detection includes path traversal protection:

src/utils/permissions/filesystem.ts:410-423
TypeScript
function isScratchpadPath(absolutePath: string): boolean {
  if (!isScratchpadEnabled()) {
    return false
  }
  const scratchpadDir = getScratchpadDir()
  // SECURITY: Normalize the path to resolve .. segments before checking
  const normalizedPath = normalize(absolutePath)
  return (
    normalizedPath === scratchpadDir ||
    normalizedPath.startsWith(scratchpadDir + sep)
  )
}

The comment explicitly warns about the attack vector: without normalization, a path like /tmp/claude-0/proj/session/scratchpad/../../../../../etc/passwd would pass the startsWith check but actually resolve to /etc/passwd (five .. segments are needed to climb out of the five directories in the prefix). The normalize() call resolves .. segments before the comparison, closing this vulnerability.
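
The difference between the naive and the normalized check can be reproduced in a few lines. The sketch below uses a minimal hand-written normalizer for absolute POSIX paths as a stand-in for the standard normalize() function, so that it runs without any imports; normalizePosix and isInsideDir are hypothetical names.

```typescript
// Minimal stand-in for path normalization on absolute POSIX paths:
// drops empty and "." segments and resolves ".." against the parent.
function normalizePosix(absolutePath: string): string {
  const out: string[] = [];
  for (const part of absolutePath.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

// Containment check that normalizes first, then compares prefixes.
function isInsideDir(dir: string, candidate: string): boolean {
  const normalized = normalizePosix(candidate);
  return normalized === dir || normalized.startsWith(dir + "/");
}

const exampleScratchpad = "/tmp/claude-0/proj/session/scratchpad";
const traversalPath = exampleScratchpad + "/../../../../../etc/passwd";

// The raw string still begins with the scratchpad prefix...
const naiveCheckPasses = traversalPath.startsWith(exampleScratchpad + "/");
// ...but after normalization it no longer does.
const safeCheckPasses = isInsideDir(exampleScratchpad, traversalPath);
```

Here naiveCheckPasses is true while safeCheckPasses is false, because the normalized form of traversalPath is /etc/passwd.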

Background Execution and Progress Tracking

Synchronous vs. Asynchronous Execution

AgentTool supports two execution modes: synchronous (foreground) and asynchronous (background). The logic for determining which mode to use combines multiple signals:

src/tools/AgentTool/AgentTool.tsx:557-567
TypeScript
const shouldRunAsync = (
  run_in_background === true ||        // Explicitly requested background
  selectedAgent.background === true ||   // Agent definition requires background
  isCoordinator ||                       // All async in Coordinator mode
  forceAsync ||                          // All async in Fork experiment mode
  assistantForceAsync ||                 // Force async in Assistant mode
  (proactiveModule?.isProactiveActive() ?? false)  // Proactive mode
) && !isBackgroundTasksDisabled;         // Global disable switch

In Coordinator mode, all agents run asynchronously. This is because the Coordinator's core value lies in parallel orchestration — if Workers ran synchronously, the Coordinator couldn't launch multiple Workers simultaneously.

Auto-Backgrounding

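There's also an opt-in auto-backgrounding mechanism. When it is enabled, either through the CLAUDE_AUTO_BACKGROUND_TASKS environment variable or the tengu_auto_background_agents feature value, a Worker that runs for more than 120 seconds is automatically moved to the background. Otherwise the function returns 0, which the excerpt suggests means no threshold applies: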

src/tools/AgentTool/AgentTool.tsx:72-77
TypeScript
function getAutoBackgroundMs(): number {
  if (isEnvTruthy(process.env.CLAUDE_AUTO_BACKGROUND_TASKS)
      || getFeatureValue_CACHED_MAY_BE_STALE(
           'tengu_auto_background_agents', false)) {
    return 120_000;
  }
  return 0;
}

Agent Resume Mechanism

When a stopped agent needs to be resumed, the resumeAgentBackground function is responsible for rebuilding conversation context from the on-disk transcript:

src/tools/AgentTool/resumeAgent.ts:42-60
TypeScript
export async function resumeAgentBackground({
  agentId,
  prompt,
  toolUseContext,
  canUseTool,
  invokingRequestId,
}: {
  agentId: string
  prompt: string
  toolUseContext: ToolUseContext
  canUseTool: CanUseToolFn
  invokingRequestId?: string
}): Promise<ResumeAgentResult> {
  const startTime = Date.now()
  const appState = toolUseContext.getAppState()
  const rootSetAppState =
    toolUseContext.setAppStateForTasks ?? toolUseContext.setAppState
  // ...
}

The resume process reads the agent's previous transcript (including all tool calls and results), rebuilds the message history, then adds the new prompt as a user message at the end. This gives the resumed agent complete context from its previous execution.

runAgent: The Worker's Execution Engine

runAgent is the Worker's core execution function. It's an async generator responsible for initializing MCP servers, building context, and running the query loop.

MCP Server Inheritance and Isolation

Agent definitions can declare their own MCP servers, which are incremental extensions of the parent context's MCP clients:

src/tools/AgentTool/runAgent.ts:95-110
TypeScript
async function initializeAgentMcpServers(
  agentDefinition: AgentDefinition,
  parentClients: MCPServerConnection[],
): Promise<{
  clients: MCPServerConnection[]
  tools: Tools
  cleanup: () => Promise<void>
}> {
  if (!agentDefinition.mcpServers?.length) {
    return {
      clients: parentClients,  // No custom MCP: directly inherit parent clients
      tools: [],
      cleanup: async () => {},
    }
  }
  // ...
}

MCP servers can be referenced in two ways:

String reference: References a configured MCP server by name, using the memoized connectToServer to share the connection.
Inline definition: A { [name]: config } format for a brand new MCP server configuration, requiring cleanup when the agent finishes.

src/tools/AgentTool/runAgent.ts:135-175
TypeScript
for (const spec of agentDefinition.mcpServers) {
  if (typeof spec === 'string') {
    // Reference by name: use memoized connectToServer to share connection
    name = spec
    config = getMcpConfigByName(spec)
  } else {
    // Inline definition: agent-exclusive, needs cleanup on exit
    const [serverName, serverConfig] = Object.entries(spec)[0]!
    name = serverName
    config = { ...serverConfig, scope: 'dynamic' }
    isNewlyCreated = true
  }

  const client = await connectToServer(name, config)
  agentClients.push(client)
  if (isNewlyCreated) {
    newlyCreatedClients.push(client)
  }
}

A key security constraint: when MCP is locked to plugin-only mode, user-controlled agents' frontmatter MCP servers are skipped, but plugin, built-in, and policySettings agents' MCP are unaffected since they come from admin-trusted sources:

src/tools/AgentTool/runAgent.ts:117-127
TypeScript
const agentIsAdminTrusted = isSourceAdminTrusted(agentDefinition.source)
if (isRestrictedToPluginOnly('mcp') && !agentIsAdminTrusted) {
  logForDebugging(
    `[Agent: ${agentDefinition.agentType}] Skipping MCP servers: ` +
    `strictPluginOnlyCustomization locks MCP to plugin-only`
  )
  return { clients: parentClients, tools: [], cleanup: async () => {} }
}

The cleanup function only cleans up newly created clients; shared clients are managed by the parent context:

src/tools/AgentTool/runAgent.ts:197-210
TypeScript
const cleanup = async () => {
  for (const client of newlyCreatedClients) {
    if (client.type === 'connected') {
      try {
        await client.cleanup()
      } catch (error) {
        logForDebugging(
          `Error cleaning up MCP server '${client.name}': ${error}`
        )
      }
    }
  }
}

return {
  clients: [...parentClients, ...agentClients],  // Merge parent + agent-specific
  tools: agentTools,
  cleanup,
}
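
The ownership rule (shared connections are borrowed, inline ones are owned) can be shown with a reduced model. Everything in the sketch below is a simplified stand-in: McpClient, makeClient, and initAgentClients are hypothetical, connections are plain objects, and cleanup is synchronous for brevity where the real one is async.

```typescript
interface McpClient {
  name: string;
  closed: boolean;
  close(): void;
}

function makeClient(name: string): McpClient {
  const client: McpClient = {
    name,
    closed: false,
    close() {
      client.closed = true;
    },
  };
  return client;
}

// A spec is either a name (shared connection) or an inline { name: config }.
type ServerSpec = string | { [name: string]: { command: string } };

function initAgentClients(
  specs: ServerSpec[],
  parentClients: McpClient[],
  sharedByName: Map<string, McpClient>, // stands in for the memoized connections
): { clients: McpClient[]; cleanup: () => void } {
  const agentClients: McpClient[] = [];
  const newlyCreated: McpClient[] = [];

  for (const spec of specs) {
    if (typeof spec === "string") {
      const shared = sharedByName.get(spec);
      if (shared) agentClients.push(shared); // borrowed, never closed here
    } else {
      const [name] = Object.keys(spec);
      if (name === undefined) continue;
      const client = makeClient(name); // owned by this agent
      agentClients.push(client);
      newlyCreated.push(client);
    }
  }

  return {
    clients: [...parentClients, ...agentClients],
    cleanup: () => {
      for (const client of newlyCreated) client.close();
    },
  };
}
```

After cleanup runs, only the inline client is closed; the parent's clients and the shared connection remain usable by other agents.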

Agent Definitions and Tool Control

Each agent's capabilities are controlled by its AgentDefinition. Taking the built-in general-purpose agent as an example:

src/tools/AgentTool/built-in/generalPurposeAgent.ts:25-34
TypeScript
export const GENERAL_PURPOSE_AGENT: BuiltInAgentDefinition = {
  agentType: 'general-purpose',
  whenToUse: 'General-purpose agent for researching complex questions...',
  tools: ['*'],          // Use all available tools
  source: 'built-in',
  baseDir: 'built-in',
  // model intentionally omitted; uses getDefaultSubagentModel()
  getSystemPrompt: getGeneralPurposeSystemPrompt,
}

tools: ['*'] means using all available tools (after filtering). Custom agents can specify explicit tool lists or disallowed lists. The resolveAgentTools function handles this complex tool resolution logic:

src/tools/AgentTool/agentToolUtils.ts:122-173
TypeScript
export function resolveAgentTools(
  agentDefinition, availableTools, isAsync = false, isMainThread = false,
): ResolvedAgentTools {
  const filteredAvailableTools = isMainThread
    ? availableTools
    : filterToolsForAgent({
        tools: availableTools,
        isBuiltIn: source === 'built-in',
        isAsync,
        permissionMode,
      })

  // Create disallowed tool set
  const disallowedToolSet = new Set(
    disallowedTools?.map(toolSpec => {
      const { toolName } = permissionRuleValueFromString(toolSpec)
      return toolName
    }) ?? [],
  )

  // Filter
  const allowedAvailableTools = filteredAvailableTools.filter(
    tool => !disallowedToolSet.has(tool.name),
  )

  // Wildcard handling
  const hasWildcard = agentTools === undefined
    || (agentTools.length === 1 && agentTools[0] === '*')
  if (hasWildcard) {
    return {
      hasWildcard: true,
      validTools: [],
      invalidTools: [],
      resolvedTools: allowedAvailableTools,
    }
  }
  // ...
}
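
The resolution order above (deny-list first, then the wildcard check) can be tried out in isolation with a simplified sketch. The names resolveTools, Tool and ResolvedTools are illustrative stand-ins, not the real types. The excerpt elides how explicit tool lists are handled, so that branch is an assumption: known names are kept and unknown ones are reported as invalid.

```typescript
// Simplified stand-ins for the real tool and result types (assumptions).
interface Tool { name: string }

interface ResolvedTools {
  hasWildcard: boolean
  validTools: string[]
  invalidTools: string[]
  resolvedTools: Tool[]
}

function resolveTools(
  available: Tool[],
  requested: string[] | undefined,   // undefined or ['*'] means "everything"
  disallowed: string[] = [],
): ResolvedTools {
  // The deny-list is applied first, so it wins even over a wildcard.
  const denied = new Set(disallowed)
  const allowed = available.filter(tool => !denied.has(tool.name))

  if (requested === undefined || (requested.length === 1 && requested[0] === '*')) {
    return { hasWildcard: true, validTools: [], invalidTools: [], resolvedTools: allowed }
  }

  // Assumed behaviour for explicit lists: keep names that exist, report the rest.
  const byName = new Map<string, Tool>(
    allowed.map((tool): [string, Tool] => [tool.name, tool]),
  )
  const validTools: string[] = []
  const invalidTools: string[] = []
  const resolvedTools: Tool[] = []
  for (const name of requested) {
    const tool = byName.get(name)
    if (tool) {
      validTools.push(name)
      resolvedTools.push(tool)
    } else {
      invalidTools.push(name)
    }
  }
  return { hasWildcard: false, validTools, invalidTools, resolvedTools }
}
```

With three available tools named Read, Bash and Edit, calling resolveTools(tools, ['*'], ['Bash']) yields Read and Edit: the wildcard never re-admits a tool that the deny-list removed.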

Team System: Swarm Mode

Beyond the Coordinator/Worker pattern, Claude Code also supports a looser form of multi-agent collaboration — Team (Swarm) mode. In this mode, multiple agents work as "teammates" in parallel, collaborating through a shared task list and messaging system.

TeamCreateTool: Creating a Team

src/tools/TeamCreateTool/TeamCreateTool.ts:37-49
TypeScript
const inputSchema = lazySchema(() =>
  z.strictObject({
    team_name: z.string()
      .describe('Name for the new team to create.'),
    description: z.string().optional()
      .describe('Team description/purpose.'),
    agent_type: z.string().optional()
      .describe('Type/role of the team lead.'),
  }),
)

Creating a Team does the following:

...

Teams and Task Lists have a 1:1 correspondence — each Team has its own task list directory, with task numbers starting from 1:

src/tools/TeamCreateTool/TeamCreateTool.ts:182-191
TypeScript
const taskListId = sanitizeName(finalTeamName)
await resetTaskList(taskListId)
await ensureTasksDir(taskListId)

// Register team name so getTaskListId() returns it
setLeaderTeamName(sanitizeName(finalTeamName))

TeamFile Structure

src/tools/TeamCreateTool/TeamCreateTool.ts:157-175
TypeScript
const teamFile: TeamFile = {
  name: finalTeamName,
  description: _description,
  createdAt: Date.now(),
  leadAgentId,
  leadSessionId: getSessionId(),
  members: [
    {
      agentId: leadAgentId,
      name: TEAM_LEAD_NAME,  // 'team-lead'
      agentType: leadAgentType,
      model: leadModel,
      joinedAt: Date.now(),
      tmuxPaneId: '',
      cwd: getCwd(),
      subscriptions: [],
    },
  ],
}

The Team lead's ID is deterministic — generated by formatAgentId(TEAM_LEAD_NAME, finalTeamName) rather than a random UUID. This allows other teammates to derive the Team lead's ID without querying any registry.
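
The benefit of a deterministic ID can be shown with a minimal sketch. The article does not show the body of formatAgentId, so the `name@team` format used here is purely an assumption, and formatAgentIdSketch and leadIdFor are hypothetical names.

```typescript
// Hypothetical stand-in for formatAgentId; the real output format is not
// shown in the article, so `name@team` is an assumption.
function formatAgentIdSketch(agentName: string, teamName: string): string {
  return `${agentName}@${teamName}`
}

const TEAM_LEAD_NAME = 'team-lead'

// A teammate that knows only the team name can compute the lead's ID locally,
// with no registry lookup and no shared state.
function leadIdFor(teamName: string): string {
  return formatAgentIdSketch(TEAM_LEAD_NAME, teamName)
}
```

Because the function is pure, every process that calls leadIdFor with the same team name arrives at the same ID, which a random UUID could not provide without some form of lookup.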

Spawning Teammates

In Team mode, teammates are spawned by passing team_name and name parameters through AgentTool. This triggers the spawnTeammate() path:

src/tools/AgentTool/AgentTool.tsx:284-316
TypeScript
if (teamName && name) {
  const result = await spawnTeammate({
    name,
    prompt,
    description,
    team_name: teamName,
    use_splitpane: true,
    plan_mode_required: spawnMode === 'plan',
    model: model ?? agentDef?.model,
    agent_type: subagent_type,
    invokingRequestId: assistantMessage?.requestId
  }, toolUseContext);

  const spawnResult: TeammateSpawnedOutput = {
    status: 'teammate_spawned' as const,
    prompt,
    ...result.data
  };
  // ...
}

Note an important constraint — teammates cannot spawn teammates:

src/tools/AgentTool/AgentTool.tsx:272-274
TypeScript
if (isTeammate() && teamName && name) {
  throw new Error(
    'Teammates cannot spawn other teammates — the team roster is flat.'
  );
}

The Team's member list is flat — only the Team lead can add members. This prevents unbounded nesting of teammate relationships, simplifying communication and lifecycle management.

TeamDeleteTool: Cleaning Up a Team

When a Team is finished, TeamDeleteTool handles cleaning up all resources:

src/tools/TeamDeleteTool/TeamDeleteTool.ts:71-135
TypeScript
async call(_input, context) {
  const appState = getAppState()
  const teamName = appState.teamContext?.teamName

  if (teamName) {
    const teamFile = readTeamFile(teamName)
    if (teamFile) {
      // Only check truly active members (filter out idle/dead)
      const nonLeadMembers = teamFile.members.filter(
        m => m.name !== TEAM_LEAD_NAME
      )
      const activeMembers = nonLeadMembers.filter(
        m => m.isActive !== false
      )
      if (activeMembers.length > 0) {
        throw new Error(
          `Cannot cleanup team with ${activeMembers.length} active member(s).`
        )
      }
    }
    await cleanupTeamDirectories(teamName)
    unregisterTeamForSessionCleanup(teamName)
    clearTeammateColors()
    clearLeaderTeamName()
  }

  // Clear team context and inbox from AppState
  setAppState(prev => ({
    ...prev,
    teamContext: undefined,
    inbox: { messages: [] },
  }))
}

An important safety check: you can't delete a Team while it still has active members. All teammates must first be gracefully terminated via the SendMessage shutdown_request protocol.
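
One subtlety in the check is the comparison `m.isActive !== false`: a member whose isActive field is missing still counts as active, so the guard fails closed. The sketch below isolates that rule; the Member shape is reduced to the two fields the check reads, and activeTeammates and assertTeamCanBeDeleted are hypothetical helper names.

```typescript
// Reduced member shape: only the fields the safety check looks at.
interface Member { name: string; isActive?: boolean }

const TEAM_LEAD_NAME = 'team-lead'

// Everyone except the lead who has not been explicitly marked inactive.
function activeTeammates(members: Member[]): Member[] {
  return members.filter(
    m => m.name !== TEAM_LEAD_NAME && m.isActive !== false,
  )
}

// Throws with the same message shape as the excerpt when deletion is unsafe.
function assertTeamCanBeDeleted(members: Member[]): void {
  const active = activeTeammates(members)
  if (active.length > 0) {
    throw new Error(
      `Cannot cleanup team with ${active.length} active member(s).`,
    )
  }
}
```

Treating "unknown" as "active" means a teammate whose state was never recorded blocks deletion rather than being silently orphaned.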

Team Workflow

The complete Team workflow as described in the system prompt:

...

Fork Sub-Agent: Context Inheritance

Beyond Coordinator/Worker and Team swarm, there is a third multi-agent pattern: Fork. A Fork sub-agent inherits the parent agent's complete conversation context (including the system prompt and all prior messages), which makes it suitable for tasks whose intermediate tool outputs do not need to be kept in the parent context.

src/tools/AgentTool/forkSubagent.ts:32-39
TypeScript
export function isForkSubagentEnabled(): boolean {
  if (feature('FORK_SUBAGENT')) {
    if (isCoordinatorMode()) return false     // Mutually exclusive with Coordinator mode
    if (getIsNonInteractiveSession()) return false  // Not supported in non-interactive sessions
    return true
  }
  return false
}

Fork and Coordinator mode are mutually exclusive because the Coordinator already has its own orchestration model. Fork's advantages include:

Cache-friendly: Fork sub-agents use the parent agent's exact system prompt and tool set (useExactTools: true), so the API request prefix is identical to the parent's, enabling prompt cache reuse.
Context inheritance: No need to re-explain background in the prompt — the sub-agent already "knows" everything.
Imperative prompts: Since context is inherited, the prompt only needs to be a "what to do" instruction, not a complete "here's the situation + what to do" description.
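
The cache-friendliness argument rests on the child request being a strict extension of the parent's. The sketch below models that with a deliberately simplified request shape: ForkRequest, ChatMessage, buildForkRequest and sharesPrefix are all hypothetical and do not reflect the actual request structure Claude Code sends.

```typescript
// Simplified, hypothetical model of a model request.
interface ChatMessage { role: 'user' | 'assistant'; content: string }
interface ForkRequest { system: string; tools: string[]; messages: ChatMessage[] }

// The fork reuses system prompt, tools and history unchanged, then appends
// only a short imperative directive.
function buildForkRequest(parent: ForkRequest, directive: string): ForkRequest {
  return {
    system: parent.system,
    tools: [...parent.tools],
    messages: [...parent.messages, { role: 'user', content: directive }],
  }
}

// True when everything in the parent request appears, in order, at the start
// of the child request, which is the condition for reusing a cached prefix.
function sharesPrefix(parent: ForkRequest, child: ForkRequest): boolean {
  if (parent.system !== child.system) return false
  if (parent.tools.length !== child.tools.length) return false
  if (!parent.tools.every((name, i) => name === child.tools[i])) return false
  if (child.messages.length < parent.messages.length) return false
  return parent.messages.every((m, i) => {
    const other = child.messages[i]
    return other !== undefined && other.role === m.role && other.content === m.content
  })
}
```

If the fork changed the system prompt or dropped a tool, sharesPrefix would return false, which corresponds to losing the cached prefix.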

Agent Persistent Memory

Worker agents can have persistent memory that saves learned knowledge across sessions. The memory system supports three scopes:

src/tools/AgentTool/agentMemory.ts:13
TypeScript
export type AgentMemoryScope = 'user' | 'project' | 'local'

Each scope has a different storage location:

src/tools/AgentTool/agentMemory.ts:52-65
TypeScript
export function getAgentMemoryDir(
  agentType: string, scope: AgentMemoryScope,
): string {
  const dirName = sanitizeAgentTypeForPath(agentType)
  switch (scope) {
    case 'project':
      return join(getCwd(), '.claude', 'agent-memory', dirName) + sep
    case 'local':
      return getLocalAgentMemoryDir(dirName)
    case 'user':
      return join(getMemoryBaseDir(), 'agent-memory', dirName) + sep
  }
}

user: ~/.claude/agent-memory/{agentType}/ — cross-project general knowledge
project: .claude/agent-memory/{agentType}/ — project-specific knowledge (shareable via version control)
local: .claude/agent-memory-local/{agentType}/ — machine-specific knowledge (not version-controlled)

The memory entry file is always MEMORY.md:

src/tools/AgentTool/agentMemory.ts:109-114
TypeScript
export function getAgentMemoryEntrypoint(
  agentType: string, scope: AgentMemoryScope,
): string {
  return join(getAgentMemoryDir(agentType, scope), 'MEMORY.md')
}

Transferable Patterns: Architecture Essentials for Building Multi-Agent Systems

From Claude Code's multi-agent implementation, we can distill several general-purpose architectural patterns applicable to building any multi-agent system.

Pattern 1: Role Separation and Tool Constraints

The Coordinator has only management tools; Workers have only execution tools. This hard separation prevents role confusion — the Coordinator won't be "tempted" to modify files directly, and Workers won't try to orchestrate other Workers.

The key to implementing this separation is tool set filtering: determine which tools an agent can use at spawn time, rather than relying on instructions in the system prompt. An LLM might not follow a "don't use tool X" instruction, but if the tool simply isn't in the available list, it physically cannot use it.
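
A minimal sketch of spawn-time filtering follows. exposeTools and dispatch are hypothetical names, and the allow-list contents used in the usage note are illustrative rather than the real Coordinator list.

```typescript
interface Tool { name: string }

// Decide the role's tool set once, at spawn time.
function exposeTools(all: Tool[], allowed: ReadonlySet<string>): Tool[] {
  return all.filter(tool => allowed.has(tool.name))
}

// A tool call is resolved only against the exposed list, so a tool that was
// filtered out cannot run no matter what the model asks for.
function dispatch(exposed: Tool[], requestedName: string): string {
  const tool = exposed.find(t => t.name === requestedName)
  if (!tool) {
    throw new Error(`Unknown tool: ${requestedName}`)
  }
  return `dispatched ${tool.name}`
}
```

Given tools Agent, SendMessage, Bash and Edit, and an allow-list of Agent and SendMessage, a call to dispatch(exposed, 'Bash') throws. The constraint holds structurally and does not depend on the model obeying its prompt.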

Pattern 2: Async Notifications Rather Than Synchronous Waiting

Worker results are returned via async notifications (<task-notification>) rather than blocking the Coordinator to wait. This allows the Coordinator to orchestrate multiple Workers simultaneously.

The notification uses XML rather than JSON because a tag such as <task-notification> is easy for an LLM to recognize in streamed text. The opening tag acts as an unambiguous marker, which keeps the LLM from confusing Worker results with user messages.
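
As a rough sketch of the idea, the following code wraps a Worker result in a <task-notification> envelope and detects that envelope on the receiving side. Only the outer tag name comes from the article. The child elements (task-id, status, summary), the TaskNotification shape and both function names are assumptions made for illustration.

```typescript
// Hypothetical payload; the field names are assumptions.
interface TaskNotification {
  taskId: string
  status: 'completed' | 'failed'
  summary: string
}

// Escape characters that would otherwise be read as markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
}

function formatTaskNotification(n: TaskNotification): string {
  return [
    '<task-notification>',
    `  <task-id>${escapeXml(n.taskId)}</task-id>`,
    `  <status>${n.status}</status>`,
    `  <summary>${escapeXml(n.summary)}</summary>`,
    '</task-notification>',
  ].join('\n')
}

// The opening tag at the start of a message is the marker that separates
// Worker results from text typed by the user.
function isTaskNotification(message: string): boolean {
  return /^\s*<task-notification>/.test(message)
}
```

Escaping the payload matters: without it, a Worker summary that happened to contain a closing tag could end the envelope early.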

Pattern 3: Shared Lock-Free State

The Scratchpad directory provides cross-Worker state sharing without any locking mechanism. This works well in practice because the Coordinator typically ensures that Workers reading and writing to the same area don't run simultaneously.

This design is far simpler and less error-prone than introducing file locks — deadlocks are especially dangerous in multi-agent systems because LLMs don't have the ability to "detect and recover from deadlocks."
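
In Claude Code this discipline is left to the Coordinator's judgment rather than enforced in code. If one wanted to make the convention explicit, a sketch might group tasks into sequential waves so that no two tasks in the same wave touch the same scratchpad area. WorkerTask, its areas field and planWaves are hypothetical. The greedy grouping only guarantees that conflicting tasks never run concurrently; it does not preserve their relative order.

```typescript
// Hypothetical task description: which scratchpad areas a worker touches.
interface WorkerTask { id: string; areas: string[] }

// Greedily assign each task to the first wave that shares no area with it.
// Tasks inside one wave may run in parallel; waves run one after another.
function planWaves(tasks: WorkerTask[]): WorkerTask[][] {
  const waves: { tasks: WorkerTask[]; areas: Set<string> }[] = []
  for (const task of tasks) {
    let wave = waves.find(w => task.areas.every(area => !w.areas.has(area)))
    if (!wave) {
      wave = { tasks: [], areas: new Set<string>() }
      waves.push(wave)
    }
    wave.tasks.push(task)
    for (const area of task.areas) {
      wave.areas.add(area)
    }
  }
  return waves.map(w => w.tasks)
}
```

No locks are involved, so no deadlock is possible: two tasks that could collide are never in flight at the same time.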

Pattern 4: Mailbox Communication Model

The mailbox communication model in Team mode — where the sender writes to the recipient's mailbox file — is a classic asynchronous messaging pattern. It completely decouples the execution timing of sender and receiver, naturally supporting offline messaging.
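
The decoupling can be illustrated with an in-memory stand-in for the per-recipient mailbox files. Mailboxes, MailMessage, send and drain are hypothetical names, and a Map takes the place of the filesystem.

```typescript
interface MailMessage { from: string; text: string; sentAt: number }

// In-memory stand-in for one mailbox file per recipient (assumption).
class Mailboxes {
  private boxes = new Map<string, MailMessage[]>()

  // The sender only appends; it never waits for the recipient to be running.
  send(to: string, from: string, text: string): void {
    const box = this.boxes.get(to) ?? []
    box.push({ from, text, sentAt: Date.now() })
    this.boxes.set(to, box)
  }

  // The recipient collects everything that accumulated while it was busy or
  // offline, and the mailbox is emptied in the same step.
  drain(recipient: string): MailMessage[] {
    const box = this.boxes.get(recipient) ?? []
    this.boxes.delete(recipient)
    return box
  }
}
```

A message sent before the recipient ever starts reading is still delivered on its first drain, which is the "offline messaging" property described above.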

Pattern 5: Flat Membership Structure

The Team's member list is flat: only the lead can add members, and teammates cannot spawn teammates. This prevents uncontrolled growth of the organizational structure, simplifies the communication topology (reducing it from an arbitrary graph to a star), and lowers system complexity.

Conclusion

Claude Code's multi-agent system demonstrates a pragmatic approach to distributed AI system design. It doesn't pursue theoretical perfection — no distributed transactions, no consensus algorithms, no formal verification — but instead solves real problems with simple mechanisms:

Role separation is enforced through tool set filtering, not just prompt instructions
Task notifications use XML format injected as user-role messages, letting the LLM process them naturally
State sharing works through a Scratchpad directory in the filesystem — no locks, no protocols
Lifecycle management uses structured message protocols (shutdown_request/response) for graceful shutdown
MCP inheritance uses a merge-plus-independent-cleanup approach, letting child agents incrementally extend parent agent capabilities

The common thread across these design choices is that they all find the balance between "good enough" and "over-engineering." In the rapidly evolving field of AI agent systems, this pragmatic engineering philosophy may be more valuable than pursuing perfect architecture.