The Streaming Tool Executor: How to Safely Let AI Operate Multiple Tools Simultaneously
Deep dive into StreamingToolExecutor's concurrency model — isConcurrencySafe declarations, queue scheduling, Sibling Abort cascading cancellation, progress buffering and ordered emission
The Problem
Imagine this scenario: you ask Claude Code to refactor a module. The model returns 5 tool_use calls in a single response — 3 file reads, 1 Bash command execution, and 1 file write. Now the questions arise:
- Should these 5 tools run serially or in parallel?
- If the Bash command fails, should the file reads running in parallel be cancelled?
- The file write depends on the Bash result — should it wait for Bash to complete before executing?
- The user presses ESC during tool execution — which tools should stop, and which should continue?
- Multiple tools produce progress messages simultaneously — how should the UI display them in order?
These questions seem simple, but each one involves core challenges of concurrency control. Serial execution is too slow — users don't want to wait for 3 independent file reads to complete one after another. Full parallelism is too dangerous — a write operation and a read operation accessing the same file simultaneously could cause a data race.
Claude Code's solution is StreamingToolExecutor — a carefully designed concurrency orchestrator that lets each tool declare whether it can run in parallel, then dynamically schedules execution based on those declarations. This article will dissect every design decision in detail.
Why a Streaming Tool Executor?
In the previous article, we covered the overall architecture of the tool system. But one key question was intentionally deferred to this article: when the model returns multiple tool calls in a single streaming response, how does the executor manage their lifecycles?
Traditional approaches fall into two extremes:
Approach A: Fully Serial
Safe but extremely slow. Each tool waits for the previous one to finish before starting. For 3 independent file reads, this means 3x the wait time.
Approach B: Fully Parallel
Fast but dangerous. If Tool 1 is rm -rf build/ and Tool 2 is cat build/output.js, the result of parallel execution is unpredictable.
Approach C: Claude Code's Hybrid Scheduling
Reads run in parallel, writes get exclusive access. Safe and efficient.
This is the core problem StreamingToolExecutor solves.
Architecture Overview
StreamingToolExecutor lives in src/services/tools/StreamingToolExecutor.ts and is a class of roughly 530 lines. Its responsibilities are:
- Receive tool calls — accept tool_use blocks one by one as the streaming response arrives
- Determine scheduling strategy — based on each tool's concurrency safety declaration, decide whether to execute immediately or queue
- Manage lifecycles — track each tool from queuing to completion
- Handle error cascading — one tool's failure may require cancelling its sibling tools
- Emit results in order — progress messages are sent immediately, final results are emitted in sequence
The following sections walk through this architecture piece by piece.
TrackedTool: The Complete Lifecycle of a Tool
Every tool call that enters the executor is wrapped in a TrackedTool object. This structure is defined at lines 21-32 of StreamingToolExecutor.ts:
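The exact definition isn't reproduced here, so below is a sketch reconstructed purely from the fields this article discusses (the id field name is assumed; promise, pendingProgress, results, and contextModifiers are named in the text). The real structure may differ in detail:

```typescript
// Sketch of TrackedTool, reconstructed from the fields discussed in this
// article; the real definition (lines 21-32) may differ in detail.
type ToolStatus = 'queued' | 'executing' | 'completed' | 'yielded';

interface TrackedTool {
  id: string;                  // tool_use block id (assumed name)
  status: ToolStatus;          // queued -> executing -> completed -> yielded
  isConcurrencySafe: boolean;  // computed once when the tool is added
  promise?: Promise<void>;     // present while status === 'executing'
  pendingProgress: unknown[];  // progress messages, emitted immediately
  results: unknown[];          // final results, emitted strictly in order
  contextModifiers: unknown[]; // context updates (non-concurrent tools only)
}

// A freshly added tool starts life queued with empty buffers.
const tool: TrackedTool = {
  id: 'toolu_01',
  status: 'queued',
  isConcurrencySafe: true,
  pendingProgress: [],
  results: [],
  contextModifiers: [],
};
```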
Four Lifecycle States
ToolStatus is a four-value enum, and each tool flows strictly through queued -> executing -> completed -> yielded:
queued (waiting): The tool was just added by addTool() and hasn't started executing yet. There may be other non-concurrency-safe tools currently running exclusively, so it must wait.
executing (running): The tool has started execution. Its promise field holds the execution Promise, and progress messages are collected in real time via the pendingProgress array.
completed (finished): Tool execution has ended (success, failure, or cancellation), and results are stored in the results field but haven't been emitted to the caller yet. This is the key to ordered emission — even if Tool 3 finishes first, it waits for Tool 1 and Tool 2's results to be emitted first.
yielded (emitted): Results have been emitted to the caller via getCompletedResults(), and this tool's lifecycle is completely over.
Key Field Analysis
pendingProgress is a field worth special attention. Progress messages (like real-time output from a Bash command) need to be shown to the user immediately and can't wait until the tool completes. So progress messages and final results are stored separately — progress messages can be emitted at any time, while final results must be emitted in order.
contextModifiers stores the tool's modifications to the execution context. For example, a tool might need to update file history state. But note an important restriction in the code (lines 391-395):
Only non-concurrency-safe tools can modify the context. This is a deliberate design constraint — concurrent tools modifying shared context would introduce race conditions, so it's simply prohibited.
isConcurrencySafe: Tools Decide for Themselves Whether They Can Run in Parallel
The most fundamental design principle of StreamingToolExecutor is that tools declare their own concurrency safety. Not guessed by the scheduler, not defined in a global configuration table, but implemented by each tool in its isConcurrencySafe() method.
This method is defined at line 402 of src/Tool.ts:
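As an illustrative sketch of what that declaration site looks like (the Tool interface in src/Tool.ts is far richer than the two members shown here):

```typescript
// Illustrative sketch only: the real Tool interface is much larger.
interface Tool<Input = unknown> {
  name: string;
  // Input-aware: the same tool can be safe for one input, unsafe for another.
  isConcurrencySafe(input: Input): boolean;
}

// A hypothetical read-only tool that is always safe to parallelize.
const demoTool: Tool<{ path: string }> = {
  name: 'DemoRead',
  isConcurrencySafe: () => true,
};
```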
Note that it accepts an input parameter — this means the same tool may have different concurrency safety depending on the input.
Concurrency Safety Declarations Across Tools
Let's look at how various tools actually declare themselves in the code:
FileReadTool (file reading) — always concurrency-safe:
File reading is a purely read-only operation; multiple reads running simultaneously produce no side effects.
GrepTool (search) — always concurrency-safe:
Search operations are likewise read-only, naturally supporting parallelism.
AgentTool (sub-agent) — always concurrency-safe:
The sub-agent tool declares itself as concurrency-safe because each sub-agent runs in its own isolated context.
BashTool (command execution) — depends on input:
This is the most interesting case. The Bash tool's concurrency safety depends on whether the command itself is read-only. ls, cat, grep are read-only and can run in parallel; rm, mv, git commit have side effects and must run exclusively.
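Condensed into one illustrative sketch (the Bash allow-list below is a toy; the real read-only analysis is considerably more sophisticated):

```typescript
// Illustrative sketches of the declarations; the real implementations
// differ, and BashTool's read-only analysis in particular is far more
// involved than this toy allow-list.
const fileReadIsConcurrencySafe = (_input: { file_path: string }) => true;
const grepIsConcurrencySafe = (_input: { pattern: string }) => true;
const agentIsConcurrencySafe = (_input: { prompt: string }) => true; // isolated context

const READ_ONLY_COMMANDS = new Set(['ls', 'cat', 'grep', 'pwd', 'head', 'tail']);

function bashIsConcurrencySafe(input: { command: string }): boolean {
  const first = input.command.trim().split(/\s+/)[0];
  return READ_ONLY_COMMANDS.has(first ?? '');
}
```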
Default behavior — assume unsafe (line 759):
Tools built through buildTool() that don't explicitly declare isConcurrencySafe default to returning false. This is a conservatively safe design — better to sacrifice performance than risk concurrency issues.
Safety Calculation in addTool
When a tool is added to the executor, the isConcurrencySafe calculation process is worth careful examination. See lines 104-121 of StreamingToolExecutor.ts:
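A sketch of that calculation, with Zod stubbed out as a plain validator function that returns null on invalid input (names are illustrative):

```typescript
// Sketch of the three-layer calculation; Zod is stubbed out with a plain
// validator, and names are illustrative.
type Validator<I> = (input: unknown) => I | null;

function computeConcurrencySafety<I>(
  validate: Validator<I>,
  isConcurrencySafe: (input: I) => unknown,
  rawInput: unknown,
): boolean {
  const parsed = validate(rawInput);
  if (parsed === null) return false;           // layer 1: invalid input -> unsafe
  try {
    return Boolean(isConcurrencySafe(parsed)); // layer 3: coerce truthy values
  } catch {
    return false;                              // layer 2: declaration threw -> unsafe
  }
}
```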
There are three layers of defense here:
- Input validation: First validate input using the Zod schema. If the input format is invalid, it's immediately marked as non-concurrency-safe.
- try-catch wrapper: Even if the input is valid, isConcurrencySafe() itself might throw an exception (e.g., a bug in the tool definition). Any exception falls back to false.
- Boolean coercion: The result is wrapped in Boolean() to prevent tools from accidentally returning truthy non-boolean values (like non-empty strings).
This "defense in depth" pattern is ubiquitous in Claude Code — on code paths related to concurrency and safety, always assume the worst case.
canExecuteTool: The Core Scheduling Decision
Given each tool's concurrency safety declaration, how does the scheduler decide whether a tool can execute immediately? The logic is remarkably concise, just 6 lines of code (lines 129-135):
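A sketch of that check, expanded slightly for readability (TrackedTool is reduced to the two fields the decision uses):

```typescript
// Sketch of the decision, with TrackedTool reduced to the two fields used.
interface ToolView {
  status: string;
  isConcurrencySafe: boolean;
}

function canExecuteTool(tool: ToolView, tools: ToolView[]): boolean {
  const executing = tools.filter((t) => t.status === 'executing');
  if (executing.length === 0) return true;      // idle: anything may start
  return (
    tool.isConcurrencySafe &&
    executing.every((t) => t.isConcurrencySafe) // safe may join safe
  );
}
```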
In plain language: a tool can execute if and only if one of the following two conditions holds:
- No tools are currently executing (idle state, any tool can start)
- The current tool is concurrency-safe, and all currently executing tools are also concurrency-safe
This logic implies an important corollary: as long as any non-concurrency-safe tool is executing, all other tools must wait. Non-concurrency-safe tools get exclusive access.
Let's visualize with a table:
| Currently Executing Tools | New Tool (safe) | New Tool (unsafe) |
|---|---|---|
| None (idle) | Can execute | Can execute |
| All safe | Can execute | Wait |
| Includes unsafe | Wait | Wait |
This is a classic read-write lock pattern: concurrency-safe tools are like read locks (multiple can coexist), non-concurrency-safe tools are like write locks (must be exclusive).
processQueue: The Subtleties of Queue Scheduling
The processQueue() method (lines 140-151) is responsible for traversing the queue and starting executable tools:
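A sketch of the traversal with the crucial break/continue distinction called out (canExecute mirrors the read-write-lock rule from the previous section; names are illustrative):

```typescript
// Sketch of the queue traversal; names are illustrative.
interface QueuedTool {
  status: 'queued' | 'executing' | 'completed' | 'yielded';
  isConcurrencySafe: boolean;
}

function canExecute(tool: QueuedTool, tools: QueuedTool[]): boolean {
  const executing = tools.filter((t) => t.status === 'executing');
  return (
    executing.length === 0 ||
    (tool.isConcurrencySafe && executing.every((t) => t.isConcurrencySafe))
  );
}

function processQueue(tools: QueuedTool[], start: (t: QueuedTool) => void): void {
  for (const tool of tools) {
    if (tool.status !== 'queued') continue;
    if (canExecute(tool, tools)) {
      tool.status = 'executing';
      start(tool);
    } else if (!tool.isConcurrencySafe) {
      break; // an unstartable unsafe tool gates everything behind it
    }
    // an unstartable safe tool is skipped; later tools are still considered
  }
}
```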
This code has an easily overlooked but critically important detail — the break statement. When it encounters a non-concurrency-safe tool that can't execute, the scheduler stops traversal. Why?
Consider a queue like Read A (safe, currently executing), Bash "git add ." (unsafe, queued), Read C (safe, queued). Without the break, the scheduler would skip Bash "git add ." when it can't execute and continue checking Read C. Read C is concurrency-safe and might be started. But this is problematic — Read C would execute before git add ., potentially reading file contents not yet staged.
The break ensures ordering between non-concurrency-safe tools. Once a queued non-concurrency-safe tool is encountered, no subsequent tools (safe or not) will be started.
Conversely: what if the tool that can't execute is a concurrency-safe one? It's simply skipped (continue) and doesn't prevent scheduling of subsequent tools. When would a concurrency-safe tool be unable to execute? When a non-concurrency-safe tool currently has exclusive access. Once the exclusive tool completes, all queued concurrency-safe tools can start together.
When processQueue Is Triggered
processQueue() is called in two places:
- In addTool() (line 123): every time a new tool is added, immediately try to schedule it.
- When executeTool() completes (lines 402-404): after a tool finishes, trigger a new round of scheduling.
This creates a self-driving loop: tool completes -> try to schedule -> new tool starts -> new tool completes -> schedule again... until the queue is empty.
Sibling AbortController: Cascading Cancellation of Errors
One of the trickiest problems with concurrent execution is error handling. When multiple tools are running in parallel, how should one tool's failure affect the others?
Claude Code's design is: only Bash tool errors cascade-cancel sibling tools. This design stems from a practical observation — Bash commands often have implicit dependency chains (mkdir fails, so the subsequent cd and touch are pointless), while Read, Grep, WebFetch and other tools are independent — one file read failure shouldn't affect another file's read.
Three-Layer AbortController Architecture
Error cascading relies on a carefully designed three-layer AbortController architecture:
Layer 1: Query-Level AbortController (toolUseContext.abortController)
This is the lifecycle controller for the entire query turn. When the user presses ESC or submits a new message, this controller is aborted, causing the entire turn to end.
Layer 2: Sibling-Level AbortController (siblingAbortController)
This is created by StreamingToolExecutor during construction as a child controller of the query-level controller (lines 59-61).
Key property: aborting the sibling-level controller does not abort the parent controller. This means a Bash error can cancel all sibling tools without terminating the entire query turn — the model will still receive the error information and continue reasoning.
Layer 3: Tool-Level AbortController (toolAbortController)
Each tool creates its own controller during execution as a child of the sibling-level controller (lines 301-302):
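Putting the three layers together in a sketch (createChildAbortController is stubbed here with a naive strong-reference version; the real WeakRef-based implementation is discussed later in this article):

```typescript
// Sketch of the three layers, with a naive child-controller helper.
function createChildAbortController(parent: AbortSignal): AbortController {
  const child = new AbortController();
  if (parent.aborted) child.abort(parent.reason);
  else parent.addEventListener('abort', () => child.abort(parent.reason), { once: true });
  return child;
}

// Layer 1: the whole query turn (ESC / new user message aborts this).
const queryController = new AbortController();
// Layer 2: siblings. Aborting this cancels tools but NOT the turn.
const siblingController = createChildAbortController(queryController.signal);
// Layer 3: one per tool, a child of the sibling layer.
const toolController = createChildAbortController(siblingController.signal);

// Aborting the sibling layer reaches the tool but leaves the query alive.
siblingController.abort('sibling_error');
```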
Bash Error Cascade Path
When a Bash tool execution fails, the complete cascade unfolds as follows (lines 354-363):
- The Bash tool's execution result contains a tool_result with is_error: true
- The hasErrored flag is set to true
- erroredToolDescription records the description of the errored tool (e.g., Bash(mkdir /tmp/test...))
- siblingAbortController.abort('sibling_error') is called
- The abort signal propagates through createChildAbortController's parent-child relationships to every other tool's toolAbortController
- Executing tools that receive the abort signal generate synthetic error messages (lines 189-204)
Tool-Level Abort Upward Propagation
The tool-level AbortController has a subtle event listener (lines 304-317) that handles a special case: permission dialog denial. If the tool is aborted for a reason other than a sibling error (such as permission denial), that abort needs to bubble up to the query-level controller to terminate the entire turn. The code comments mention a #21056 regression — this upward bubbling logic was added to fix a specific regression bug.
Synthetic Error Messages
Cancelled tools aren't simply discarded — they receive a synthetic error message so the model knows these tools didn't execute successfully. The createSyntheticErrorMessage method (lines 153-205) generates different error messages based on the cancellation reason:
Three cancellation reasons produce three different messages:
| Reason | Message Content | Purpose |
|---|---|---|
| sibling_error | Cancelled: parallel tool call Bash(mkdir...) errored | Model knows which sibling tool failed |
| user_interrupted | User rejected tool use + memory correction hint | Model knows the user actively cancelled |
| streaming_fallback | Streaming fallback - tool execution discarded | Silent cancellation during streaming fallback |
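Reduced to its dispatch skeleton, the method might look like this (the message texts follow the table above, but the real method builds full tool_result content blocks, not bare strings):

```typescript
// Sketch of the dispatch; illustrative, not the real method body.
type CancelReason = 'sibling_error' | 'user_interrupted' | 'streaming_fallback';

function syntheticErrorText(reason: CancelReason, erroredToolDescription?: string): string {
  switch (reason) {
    case 'sibling_error':
      return `Cancelled: parallel tool call ${erroredToolDescription ?? 'unknown'} errored`;
    case 'user_interrupted':
      return 'User rejected tool use'; // the real message adds a memory-correction hint
    case 'streaming_fallback':
      return 'Streaming fallback - tool execution discarded';
  }
}
```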
Preventing Duplicate Error Messages
There's an elegant piece of deduplication logic in the code: the thisToolErrored flag (lines 330-345).
If Tool A is a Bash tool that errors, it triggers siblingAbortController.abort(). At this point, getAbortReason() would also return sibling_error for Tool A itself. But because thisToolErrored has already been set to true, Tool A won't receive an additional synthetic error message — it already has its own real error result.
Progress Buffering and Ordered Emission
Concurrent execution introduces an output ordering problem. Suppose Tool 1 and Tool 2 are running in parallel, and Tool 2 finishes first — should its results be emitted before Tool 1's?
Claude Code's answer is to treat two types of output differently:
- Progress messages: emitted immediately, no ordering required
- Final results: must be emitted in tool addition order
Immediate Emission of Progress Messages
In the execution loop of the executeTool() method (lines 366-374), progress messages are stored in the pendingProgress array:
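A sketch of the buffer-plus-semaphore mechanism (class shape and names are illustrative; push() is called from a tool's execution loop, while waitForProgress() is what getRemainingResults() awaits on):

```typescript
// Illustrative sketch of the progress buffer and wake-up semaphore.
class ProgressBuffer {
  pendingProgress: string[] = [];
  private progressAvailableResolve: (() => void) | null = null;

  waitForProgress(): Promise<void> {
    return new Promise((resolve) => {
      this.progressAvailableResolve = resolve;
    });
  }

  push(message: string): void {
    this.pendingProgress.push(message);
    this.progressAvailableResolve?.(); // wake the waiter, if any
    this.progressAvailableResolve = null;
  }

  drain(): string[] {
    const out = this.pendingProgress;
    this.pendingProgress = [];
    return out;
  }
}
```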
Note the progressAvailableResolve semaphore — when new progress messages arrive, it wakes up the waiting getRemainingResults().
Ordered Emission of Results
The getCompletedResults() method (lines 412-440) implements ordered emission logic:
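A synchronous sketch of that traversal (results are simplified to strings here; the real method yields message objects):

```typescript
// Sketch of the ordered-emission traversal; illustrative field names.
interface Tracked {
  status: 'queued' | 'executing' | 'completed' | 'yielded';
  isConcurrencySafe: boolean;
  pendingProgress: string[];
  results: string[];
}

function* getCompletedResults(tools: Tracked[]): Generator<string> {
  for (const tool of tools) {
    yield* tool.pendingProgress.splice(0); // progress always flows out
    if (tool.status === 'completed') {
      yield* tool.results;                 // results only in addition order
      tool.status = 'yielded';
    } else if (tool.status === 'executing' && !tool.isConcurrencySafe) {
      break; // an exclusive tool in flight gates everything behind it
    }
    // yielded: skip; executing + safe: keep walking; queued: nothing matches
  }
}
```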
The traversal logic is quite elegant. Let's illustrate with an example: suppose four tools were added in this order, with Tool 1 already yielded, Tool 2 completed, Tool 3 executing (and concurrency-safe), and Tool 4 queued.
Traversal process:
- Tool 1: yielded, skip (but emit any pending progress first)
- Tool 2: completed, emit results, mark as yielded
- Tool 3: executing and concurrency-safe, don't break, continue traversal (emit pending progress)
- Tool 4: queued, doesn't match any condition, traversal ends naturally
What if Tool 3 were non-concurrency-safe?
Traversal process:
- Tool 1: yielded, skip
- Tool 2: completed, emit results
- Tool 3: executing and !isConcurrencySafe, break!
- Tool 4's results will NOT be emitted, even though it's already completed
Why? Because the non-concurrency-safe tool's results may have changed the context (via contextModifiers), and Tool 4's results might depend on this modified context. So we must wait for Tool 3 to complete and the context to update before emitting Tool 4's results.
getRemainingResults Wait Mechanism
getRemainingResults() is an AsyncGenerator (lines 453-490) that continuously waits until all tools have finished:
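An illustrative skeleton of the loop (the callback parameters stand in for the executor's internal state, and error handling is omitted):

```typescript
// Skeleton of the event-driven wait loop; illustrative, not the real body.
async function* getRemainingResults(
  executingPromises: () => Promise<void>[], // promises of in-flight tools
  nextProgress: () => Promise<void>,        // resolves when progress arrives
  drainReady: () => string[],               // ready results + pending progress
  allDone: () => boolean,
): AsyncGenerator<string> {
  while (!allDone()) {
    yield* drainReady();                    // emit whatever is ready now
    await Promise.race([
      Promise.race(executingPromises()),    // any tool finishing wakes us...
      nextProgress(),                       // ...or any new progress message
    ]);
  }
  yield* drainReady();                      // final flush
}
```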
Promise.race is the key — it simultaneously waits for two types of events:
- Any executing tool to complete
- Any tool to produce new progress messages
Whichever happens first wakes up the loop, allowing it to emit new results or progress. This implements an event-driven reactive loop — not polling, but passively waiting for notifications.
interruptBehavior: Strategy Selection on User Interruption
When a user presses ESC or submits a new message during tool execution, different tools should react differently. Some tools should stop immediately (like a long-running search), while others should continue running to completion (like a file write in progress — stopping midway could corrupt the file).
cancel vs block
The interruptBehavior method is defined at lines 408-416 of src/Tool.ts:
- cancel: The tool can safely stop midway. On user interruption, a synthetic error message is generated and partial results are discarded.
- block: The tool is performing a non-interruptible operation. The user's new message must wait until this tool completes before being sent.
The default behavior is block, which is again a conservatively safe design.
Implementation in StreamingToolExecutor
The getAbortReason() method (lines 210-230) handles interruptBehavior:
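A sketch of the priority cascade, with the executor's state flattened into a single argument for clarity (names are illustrative):

```typescript
// Sketch of the getAbortReason priority cascade; illustrative shape.
type CancelOutcome = 'streaming_fallback' | 'sibling_error' | 'user_interrupted' | null;

function getAbortReason(state: {
  discarded: boolean;
  hasErrored: boolean;
  signalAborted: boolean;
  signalReason?: string;
  interruptBehavior: 'cancel' | 'block';
}): CancelOutcome {
  if (state.discarded) return 'streaming_fallback'; // highest priority
  if (state.hasErrored) return 'sibling_error';     // second highest
  if (state.signalAborted) {
    if (state.signalReason === 'interrupt') {
      // new user message: only 'cancel' tools stop; 'block' tools run on
      return state.interruptBehavior === 'cancel' ? 'user_interrupted' : null;
    }
    return 'user_interrupted'; // ESC cancels everything
  }
  return null;
}
```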
Note the priority hierarchy here:
- First check discarded (streaming fallback) — highest priority
- Then check hasErrored (sibling error) — second highest
- Finally check the abort signal:
  - If the reason is 'interrupt' (user submitted a new message), only cancel tools will be cancelled
  - If the reason is anything else (user pressed ESC), all tools will be cancelled
Interruptible State Updates
The updateInterruptibleState() method (lines 254-260) maintains a global state that tells the UI whether all tools can currently be interrupted:
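The rule itself is essentially a one-liner; a sketch:

```typescript
// Sketch: the turn is interruptible only if every executing tool opts in.
function computeInterruptible(
  executing: { interruptBehavior: 'cancel' | 'block' }[],
): boolean {
  return executing.every((t) => t.interruptBehavior === 'cancel');
}
```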
Only when all executing tools are of the cancel type does the UI show an "interruptible" indicator. If any block tool is running, the entire turn is considered non-interruptible.
Discardable Mode: Tool Discard During Streaming Fallback
Claude Code uses streaming to receive model responses, but streaming can fail (network errors, server issues, etc.). When a streaming fallback occurs, the executor needs to discard results from tools that have already started but haven't completed.
The discard() method (lines 69-71) could hardly be simpler: it sets the discarded flag to true and does nothing else. That single flag then propagates to all tools through getAbortReason():
- Queued tools: processQueue() -> executeTool() -> detects the abort reason -> immediately generates a synthetic error
- Executing tools: detect the abort reason in the next iteration of the execution loop -> generate a synthetic error and break
- Completed tools: getCompletedResults() checks this.discarded and returns immediately
getRemainingResults() also checks this.discarded up front (lines 454-456) and returns immediately when the flag is set.
This guarantees that after a streaming fallback, no residual results leak into subsequent processing.
Complete Execution Flow
Let's tie all the components together with an end-to-end example. Suppose the model returns five tool calls in order: three file reads, one Bash command (npm test), and one Edit.
Phase 1: Concurrent Reads
Three concurrency-safe tools begin executing simultaneously.
Phase 2: Bash Queued
The Bash tool enters the queue.
Phase 3: Edit Queued
The Edit tool also enters the queue. When processQueue() reaches the queued, non-concurrency-safe Bash, the break prevents it from looking any further.
Phase 4: Reads Complete, Bash Starts
When the first read completes, it triggers processQueue(). At this point all reads are completed or completing, and Bash is the first queued tool. Whether Bash can execute depends on whether any other tools are still executing. Assuming all reads happen to have completed by time step 5, Bash can begin exclusive execution.
Phase 5: Ordered Result Emission
Results are emitted strictly in the order tools were added.
Exception Path: Bash Fails
If npm test fails:
The Edit tool won't be executed, and the model will receive two error messages — one with Bash's real error, and one with Edit's cancellation notice. The model can then decide its next steps based on this information.
Comparison with toolOrchestration
There's another tool orchestration implementation in src/services/tools/toolOrchestration.ts called runTools(). How does it differ from StreamingToolExecutor?
runTools() uses a partition-batch model (lines 19-80):
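A sketch of the partition-batch shape (signatures are illustrative, and the real implementation differs in detail):

```typescript
// Illustrative sketch of the partition-batch model: split by safety, run
// the safe calls in one parallel batch, then the unsafe ones serially.
async function runTools<R>(
  calls: { isConcurrencySafe: boolean; run: () => Promise<R> }[],
): Promise<R[]> {
  const safe = calls.filter((c) => c.isConcurrencySafe);
  const unsafe = calls.filter((c) => !c.isConcurrencySafe);

  const results: R[] = [];
  // One parallel batch for everything safe...
  results.push(...(await Promise.all(safe.map((c) => c.run()))));
  // ...then the unsafe calls strictly one at a time.
  for (const call of unsafe) {
    results.push(await call.run());
  }
  return results;
}
```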
It first partitions all tool calls by concurrency safety, then executes them batch by batch. This is a simpler model — but it requires all tool calls to be known before execution begins.
StreamingToolExecutor's advantage is its support for incremental addition — tool calls are added one by one as the streaming response arrives, without waiting for all tool calls to be parsed. This is critical in streaming scenarios, because the model may still be generating the 5th tool call while the first 3 can already start executing.
| Feature | runTools() | StreamingToolExecutor |
|---|---|---|
| Tool addition timing | All at once | Incremental |
| Scheduling strategy | Partition-batch | Real-time queue scheduling |
| Progress messages | No special handling | Separate storage, immediate emission |
| Error cascading | None | Sibling AbortController |
| Discard mode | None | Supported |
| Interrupt behavior | None | cancel/block strategy |
Memory Safety of createChildAbortController
StreamingToolExecutor makes extensive use of createChildAbortController() (defined in src/utils/abortController.ts). This utility method deserves a closer look because it solves an easily overlooked memory leak problem.
The standard parent-child AbortController relationship is typically implemented like this:
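For example, a naive version might look like this (illustrative, not the actual code):

```typescript
// The naive strong-reference version: behaviorally correct, but leaky.
function createChildNaive(parent: AbortController): AbortController {
  const child = new AbortController();
  // This closure captures `child`, so the parent's listener list holds a
  // strong reference to it for as long as the parent lives.
  parent.signal.addEventListener('abort', () => child.abort(parent.signal.reason));
  return child;
}
```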
The problem is: parent holds a strong reference to child through the closure. Even if child is discarded at the application level, as long as parent is alive, child can't be garbage collected. In StreamingToolExecutor, each tool creates a toolAbortController (child), while siblingAbortController (parent) lives throughout the entire tool execution phase. If the model returns 20 tool calls, there are 20 children strongly held by the parent.
createChildAbortController() solves this with WeakRef (lines 68-99):
Key design decisions:
- WeakRef holds child: The parent's event listener references the child through a WeakRef, so it doesn't prevent GC
- WeakRef holds parent: The child's cleanup logic also references the parent through a WeakRef, avoiding a reverse strong reference
- Auto-cleanup: When the child is aborted, it automatically removes its listener from the parent, preventing listener accumulation
- {once: true}: Ensures the event handler is called only once
These measures ensure no memory leaks occur in high-concurrency tool execution scenarios.
Transferable Patterns: Implementing Similar Architecture in Your Projects
StreamingToolExecutor's concurrency model isn't unique to Claude Code — it's fundamentally a declarative concurrency scheduler. If you need to implement similar tool orchestration in your own projects, here are the core patterns you can adopt:
Pattern 1: Self-Declared Concurrency Safety
Let each operation declare for itself whether it can run in parallel, rather than hard-coding rules in the scheduler:
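For instance (names are illustrative):

```typescript
// Pattern 1 in miniature: operations carry their own safety declaration
// and the scheduler only ever reads it.
interface Operation {
  name: string;
  isConcurrencySafe(input: unknown): boolean;
  run(input: unknown): Promise<unknown>;
}

const readOp: Operation = {
  name: 'read',
  isConcurrencySafe: () => true, // pure read: always parallelizable
  run: async () => 'file contents',
};

const writeOp: Operation = {
  name: 'write',
  isConcurrencySafe: () => false, // has side effects: must run exclusively
  run: async () => undefined,
};
```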
Benefit: the scheduler doesn't need to understand the details of each operation, and adding new operations doesn't require modifying the scheduler code.
Pattern 2: Read-Write Lock Scheduling
Treat concurrency-safe operations as read locks and unsafe operations as write locks: safe operations may overlap freely, while an unsafe operation waits for everything in flight and then runs alone.
Pattern 3: Layered AbortController
Build cancellation scopes as a hierarchy (query, sibling group, individual task) so that a child scope can be aborted without tearing down its parent.
Pattern 4: Separating Progress from Results
Emit progress the moment it arrives, but buffer final results and release them strictly in submission order.
Pattern 5: Conservative Defaults
In safety-related scenarios, always make the default behavior the most conservative. Tool developers must proactively declare safety, rather than safety being assumed by default.
Complete Mini Implementation
Combining the patterns above, a minimal viable concurrency scheduler is roughly 200 lines of code:
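The full version would run closer to those 200 lines; here is a much more compressed sketch in which tool completion is driven manually via finish() so the scheduling logic stays in view (a real version would track Promises and emit results asynchronously):

```typescript
// Compressed sketch of a declarative concurrency scheduler; illustrative.
type Status = 'queued' | 'executing' | 'completed' | 'yielded';

interface Task {
  id: string;
  safe: boolean; // self-declared concurrency safety (Pattern 1)
  status: Status;
  result?: string;
}

class MiniScheduler {
  private tasks: Task[] = [];

  addTask(id: string, safe: boolean): void {
    this.tasks.push({ id, safe, status: 'queued' });
    this.processQueue(); // adding a task immediately tries to schedule it
  }

  finish(id: string, result: string): void {
    const task = this.tasks.find((t) => t.id === id);
    if (task?.status === 'executing') {
      task.status = 'completed';
      task.result = result;
      this.processQueue(); // completion triggers a new scheduling round
    }
  }

  executing(): string[] {
    return this.tasks.filter((t) => t.status === 'executing').map((t) => t.id);
  }

  // Read-write-lock rule (Pattern 2).
  private canRun(task: Task): boolean {
    const running = this.tasks.filter((t) => t.status === 'executing');
    return running.length === 0 || (task.safe && running.every((t) => t.safe));
  }

  private processQueue(): void {
    for (const task of this.tasks) {
      if (task.status !== 'queued') continue;
      if (this.canRun(task)) task.status = 'executing';
      else if (!task.safe) break; // an exclusive task gates everything behind it
    }
  }

  // Ordered emission (Pattern 4): results come out in addition order.
  *completedResults(): Generator<string> {
    for (const task of this.tasks) {
      if (task.status === 'completed') {
        task.status = 'yielded';
        yield task.result ?? '';
      } else if (task.status === 'executing' && !task.safe) {
        break;
      }
    }
  }
}
```

Two parallel-safe tasks start immediately, an unsafe task waits for both to finish before running exclusively, and results still come out in the order the tasks were added.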
Design Trade-offs
Looking back at the entire StreamingToolExecutor design, there are several trade-offs worth discussing:
Why Do Only Bash Errors Cascade?
The code comment says it clearly (lines 357-359):
Bash commands often have implicit dependency chains (e.g. mkdir fails -> subsequent commands pointless). Read/WebFetch/etc are independent — one failure shouldn't nuke the rest.
This is a pragmatic choice. In theory, each tool could declare "whether my errors should cascade," but in practice, only the Bash tool has this kind of implicit dependency relationship. Over-engineering would only increase the cognitive burden on tool developers.
Why Not Support contextModifier for Concurrent Tools?
The code comment (lines 389-390) acknowledges this is a feature gap:
NOTE: we currently don't support context modifiers for concurrent tools. None are actively being used, but if we want to use them in concurrent tools, we need to support that here.
Concurrent tools modifying shared context requires solving race conditions — what happens when two tools simultaneously modify the same context field? The current approach simply prohibits it, waiting for actual demand before designing a solution. This is a textbook application of "YAGNI" (You Aren't Gonna Need It).
Why Does interruptBehavior Default to block?
Because cancelling a write operation midway could cause data corruption. block means "let the tool finish," which in the worst case only means waiting a few more seconds. cancel in the worst case could result in a half-written file. Safety > performance.
Why Generators Instead of Callbacks?
getCompletedResults() returns a Generator, and getRemainingResults() returns an AsyncGenerator. This design lets callers naturally consume results using for...of and for await...of, without needing to register callbacks. The lazy evaluation property of Generators also means unneeded results won't be computed.
Summary
StreamingToolExecutor is an elegant concurrency orchestration component in Claude Code that solves the seemingly simple but actually complex problem of "letting AI operate multiple tools simultaneously." Its core design principles include:
- Self-declared concurrency safety: Tools know whether they can run in parallel; the scheduler merely executes their declarations
- Read-write lock scheduling: Concurrency-safe tools share access, non-concurrency-safe tools get exclusive access
- Layered cancellation: Three-layer AbortController architecture for precise error cascading
- Ordered emission: Progress is immediately visible, results are output in order
- Conservative defaults: Without a declaration, assume unsafe and non-interruptible
These principles apply not only to AI tool orchestration but to any system requiring mixed concurrency strategies — database operation scheduling, microservice orchestration, CI/CD pipeline management, and more. The 530 lines of StreamingToolExecutor distill the core wisdom of production-grade concurrency orchestration.
In the next article, we'll dive into the permission system — exploring how Claude Code ensures every tool call undergoes a security review through its six-layer evaluation chain.