The Big Picture: How a 512,000-Line AI CLI Is Built

Starting from the leak, we take our first look inside Claude Code's massive codebase and build a complete mental model, from the CLI entry point to the LLM engine.

Overview

On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic's package on the npm registry had exposed a .map file (a source map) containing the complete, unobfuscated TypeScript source code for the Claude Code CLI. The source comprises roughly 1,900 files and over 512,000 lines of code. This is not a simple command-line tool, but a full-fledged AI agent platform with considerable engineering complexity.

This article is the first in the series. We won't dive into the implementation details of any single module (later articles will do that). Instead, we'll take a bird's-eye view of the entire codebase: What tech stack was chosen? Why were these choices made? How is the code organized? What layers does a user interaction pass through from input to output? Understanding these big-picture questions is a prerequisite for diving into any subsystem.

If you're a developer building AI tools, this article will help you understand the architectural blueprint of a production-grade AI CLI. If you're simply curious about how Claude Code works internally, this article will give you a clear overview of the whole system.


Technology Choices: Why This Combination?

Open up Claude Code's source code and the first surprise is its tech stack:

| Category | Technology |
| --- | --- |
| Runtime | Bun |
| Language | TypeScript (strict mode) |
| Terminal UI | React + Ink |
| CLI Parsing | Commander.js (extra-typings) |
| Schema Validation | Zod v4 |
| Code Search | ripgrep |
| Protocols | MCP SDK, LSP |
| API | Anthropic SDK |
| Telemetry | OpenTelemetry + gRPC |
| Authentication | OAuth 2.0, JWT, macOS Keychain |

Why Bun Instead of Node.js?

Bun was chosen here not only for its startup speed (critical for CLI tools), but also for a key feature: compile-time Feature Flags.

src/main.tsx
TypeScript
    import { feature } from 'bun:bundle'

    const coordinatorModeModule = feature('COORDINATOR_MODE')
      ? require('./coordinator/coordinatorMode.js')
      : null

    const assistantModule = feature('KAIROS')
      ? require('./assistant/index.js')
      : null

When feature('COORDINATOR_MODE') resolves to false at build time, Bun's bundler completely removes the entire require() branch — along with all its transitive dependencies — from the final output. This is not a runtime if check, but compile-time Dead Code Elimination. For a tool that needs to support multiple configurations simultaneously — standalone CLI mode, IDE integration mode (BRIDGE_MODE), voice mode (VOICE_MODE), background daemon mode (DAEMON), and more — this means each build artifact contains only the code it actually needs.
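To make the "along with all its transitive dependencies" point concrete, here is a toy model of what happens once flags are fixed at build time: only modules reachable from the entry point through enabled edges end up in the artifact. This is a sketch under loud assumptions. The module graph, the file names under voice/ and coordinator/workerPool.js, and the includedModules helper are all hypothetical, and this is not how Bun's bundler is implemented.

```typescript
// Toy model of build-time dead code elimination (not Bun's actual algorithm).
// An edge may be gated by a flag; a gated edge exists only if the flag is on.
type Edge = { to: string; flag?: string }
type ModuleGraph = Record<string, Edge[]>

// Hypothetical module graph, for illustration only.
const graph: ModuleGraph = {
  'main.tsx': [
    { to: 'query.ts' },
    { to: 'coordinator/coordinatorMode.js', flag: 'COORDINATOR_MODE' },
    { to: 'voice/index.js', flag: 'VOICE_MODE' },
  ],
  'query.ts': [],
  'coordinator/coordinatorMode.js': [{ to: 'coordinator/workerPool.js' }],
  'coordinator/workerPool.js': [],
  'voice/index.js': [{ to: 'voice/audioCapture.js' }],
  'voice/audioCapture.js': [],
}

// Walk the graph from the entry point, skipping edges whose flag is off.
// Any module that is never reached would be absent from the build artifact.
function includedModules(
  g: ModuleGraph,
  entry: string,
  flags: Record<string, boolean>,
): string[] {
  const seen = new Set<string>()
  const stack = [entry]
  while (stack.length > 0) {
    const mod = stack.pop()!
    if (seen.has(mod)) continue
    seen.add(mod)
    for (const edge of g[mod] ?? []) {
      if (edge.flag !== undefined && !flags[edge.flag]) continue
      stack.push(edge.to)
    }
  }
  return Array.from(seen).sort()
}

const cliBuild = includedModules(graph, 'main.tsx', {
  COORDINATOR_MODE: false,
  VOICE_MODE: true,
})
console.log(cliBuild)
```

With COORDINATOR_MODE off, neither the coordinator module nor the worker pool it pulls in appears in the result, which is the property a runtime if check cannot give you.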

Why Build a Command-Line Interface with React?

This may be the most counterintuitive choice. React was designed for the browser — but Claude Code's terminal interface is far more complex than a typical CLI: it needs to render streaming AI responses in real time, display tool execution progress bars, present file diffs, and show interactive permission approval dialogs. These interaction patterns are more similar to a web application than a traditional command line.

Ink swaps React's render target from the browser DOM to terminal characters. This means Claude Code can use React's component model, Hook system, and state management to build its UI, while outputting to the terminal. The src/components/ directory contains over 140 React components, from message rendering to permission dialogs, from file diff displays to progress indicators.
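The idea of swapping the render target can be illustrated without React at all. The sketch below describes a UI once as a tree and renders it two ways, as HTML and as terminal text. Everything in it (UiNode, renderToHtml, renderToTerminal) is a hypothetical toy and says nothing about Ink's real API or internals.

```typescript
// A toy UI tree with two render targets (illustrative only; not Ink's API).
type UiNode =
  | { kind: 'text'; value: string; bold?: boolean }
  | { kind: 'box'; children: UiNode[] }

// Browser-style target: emit HTML markup.
function renderToHtml(node: UiNode): string {
  if (node.kind === 'text') {
    return node.bold ? `<b>${node.value}</b>` : `<span>${node.value}</span>`
  }
  return `<div>${node.children.map(renderToHtml).join('')}</div>`
}

// Terminal-style target: one line per child, ANSI escape codes for bold.
function renderToTerminal(node: UiNode): string {
  if (node.kind === 'text') {
    return node.bold ? `\u001b[1m${node.value}\u001b[22m` : node.value
  }
  return node.children.map(renderToTerminal).join('\n')
}

// The same description of a dialog feeds both targets.
const permissionDialog: UiNode = {
  kind: 'box',
  children: [
    { kind: 'text', value: 'Allow tool to run?', bold: true },
    { kind: 'text', value: '[y] yes  [n] no' },
  ],
}

console.log(renderToTerminal(permissionDialog))
```

The component description stays the same; only the function that turns it into output changes. That separation is what lets a React codebase target a terminal.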


Directory Structure and Layered Architecture

Claude Code's src/ directory contains 33 top-level subdirectories. At first glance this looks daunting, but they map cleanly to a 5-layer architecture model:

  • Entry Layer
      • main.tsx: Commander.js CLI parsing
  • Command Layer
      • commands/: 50+ slash commands
      • commands.ts: command registry
  • Tool Layer
      • tools/: 40+ agent tools
      • tools.ts: tool registry
      • Tool.ts: tool type definitions
  • Engine Layer
      • QueryEngine.ts: session management
      • query.ts: streaming query loop
  • Service Layer
      • services/api/: Anthropic API
      • services/mcp/: MCP protocol
      • services/compact/: context compaction
      • services/oauth/: authentication

Let's walk through each layer:

Entry Layer: main.tsx

Everything starts with main.tsx. This file does three key things:

  1. Parses CLI arguments — uses Commander.js to handle commands like claude --model sonnet "fix the bug"
  2. Initializes the runtime — loads configuration, establishes API connections, sets up telemetry
  3. Starts the Ink render loop — mounts the React component tree to the terminal

But the most interesting part is not what it does, but how it does it — the startup sequence is carefully optimized for parallel execution:

src/main.tsx:12-20
TypeScript
    // These calls execute before any heavy imports
    profileCheckpoint('main_tsx_entry')
    startMdmRawRead()        // Read MDM config in parallel
    startKeychainPrefetch()  // Prefetch Keychain credentials in parallel

Before loading the rest of the modules, main.tsx has already kicked off MDM (Mobile Device Management) configuration reading and macOS Keychain credential prefetching. These two I/O operations run in parallel, rather than being called serially when needed. For a CLI tool that needs to start up quickly, this "start early, consume later" pattern is a critical performance optimization.
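The "start early, consume later" pattern comes down to creating a promise as soon as possible and awaiting it only where the value is needed. Here is a minimal sketch, assuming simulated I/O: fakeRead, initRuntime, and the returned values are hypothetical stand-ins, not the real startMdmRawRead or startKeychainPrefetch.

```typescript
// Sketch of "start early, consume later" (all names are hypothetical).
const events: string[] = []

// Simulated slow I/O: resolves on a later tick, like a file or Keychain read.
function fakeRead(label: string, value: string): Promise<string> {
  events.push(`start:${label}`)
  return new Promise(resolve =>
    setTimeout(() => {
      events.push(`done:${label}`)
      resolve(value)
    }, 10),
  )
}

// 1. Kick off both reads immediately and keep the promises around.
const mdmPromise = fakeRead('mdm', 'mdm-config')
const keychainPromise = fakeRead('keychain', 'token-123')

// 2. Other startup work proceeds while the reads are in flight.
events.push('heavy-imports-done')

// 3. Await only at the point where the values are actually needed.
async function initRuntime(): Promise<{ mdm: string; token: string }> {
  const [mdm, token] = await Promise.all([mdmPromise, keychainPromise])
  return { mdm, token }
}

initRuntime().then(runtime => console.log(runtime, events))
```

Both reads are already under way before the "heavy imports" step finishes, so by the time initRuntime awaits them, most or all of the waiting has overlapped with other work.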

Command Layer: commands.ts + commands/

When a user types slash commands like /commit, /review, or /compact, commands.ts routes them to the corresponding implementation. The command registry uses the same Feature Flag pattern as the tool system:

src/commands.ts:62-122
TypeScript
    // Conditional imports: disabled commands are eliminated at build time
    import { feature } from 'bun:bundle'

    // When VOICE_MODE is off, the entire voice command code is absent from the final build
    // When BRIDGE_MODE is off, IDE integration commands are removed

Note an interesting lazy loading pattern — for particularly heavy commands (like insights, a single 113KB file), Claude Code uses runtime dynamic imports to avoid loading them at startup:

src/commands.ts:190-200
TypeScript
    const usageReport: Command = {
      type: 'prompt',
      name: 'insights',
      async getPromptForCommand(args, context) {
        // The 113KB module is only loaded when the user actually runs /insights
        const real = (await import('./commands/insights.js')).default
        return real.getPromptForCommand(args, context)
      }
    }

This is a combination of compile-time elimination and runtime lazy loading: unneeded features are removed at compile time, while features that are needed but infrequently used are loaded lazily at runtime.
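The runtime half of that combination can be reduced to a small wrapper: the registry holds a cheap stub, and the heavy module is fetched on first use. The sketch below is a simplified illustration. PromptCommand, lazyCommand, and loadHeavyModule are hypothetical, and loadHeavyModule merely stands in for a dynamic import() of a large file.

```typescript
// Sketch of a lazily loaded command (all names are hypothetical).
interface PromptCommand {
  name: string
  getPromptForCommand(args: string): Promise<string>
}

let loadCount = 0

// Stand-in for `import('./commands/insights.js')`: pretend this is expensive.
function loadHeavyModule(): Promise<PromptCommand> {
  loadCount += 1
  return Promise.resolve({
    name: 'insights',
    getPromptForCommand: async (args: string) => `Generate insights for: ${args}`,
  })
}

// Wrap a loader so the registry holds only a cheap stub. The promise is
// cached, so the loader runs at most once, on first use.
function lazyCommand(
  name: string,
  loader: () => Promise<PromptCommand>,
): PromptCommand {
  let cached: Promise<PromptCommand> | undefined
  return {
    name,
    async getPromptForCommand(args) {
      if (!cached) cached = loader()
      const real = await cached
      return real.getPromptForCommand(args)
    },
  }
}

// Registering the command costs nothing; no load has happened yet.
const insights = lazyCommand('insights', loadHeavyModule)
```

Calling insights.getPromptForCommand('last week') triggers the load, and later calls reuse the cached promise. With a real import(), the module system caches the module as well, which is why the excerpt above can call import() on every invocation without reloading the file.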

Tool Layer: Tool.ts + tools.ts + tools/

Tools are one of Claude Code's most central concepts. Each tool represents an operation the AI can perform — reading files, writing files, executing shell commands, searching code, visiting web pages, and more. The tool system will be analyzed in depth in Article 03; here you only need to understand its place and responsibilities:

  • Tool.ts (792 lines) — defines the tool type system and permission model
  • tools.ts — the tool registry, the single source of truth for all available tools
  • tools/ — 45 subdirectories, each containing a complete tool implementation
src/tools.ts
TypeScript
    // getAllBaseTools() is the system's complete tool manifest
    // It uses conditional imports and lazy requires to manage dependencies

    // Feature-gated tool example:
    const cronTools = feature('AGENT_TRIGGERS')
      ? [
          require('./tools/ScheduleCronTool/CronCreateTool.js').CronCreateTool,
          require('./tools/ScheduleCronTool/CronDeleteTool.js').CronDeleteTool,
          require('./tools/ScheduleCronTool/CronListTool.js').CronListTool,
        ]
      : []

    // Lazy require to break circular dependencies:
    const getTeamCreateTool = () =>
      require('./tools/TeamCreateTool/TeamCreateTool.js').TeamCreateTool
Engine Layer: QueryEngine.ts + query.ts

This is the heart of Claude Code. QueryEngine.ts (1,295 lines) manages the state of the entire conversation session — message history, file cache, token counts, and permission records. query.ts (1,729 lines) implements the streaming query loop — an async-generator-driven state machine responsible for calling the API, handling tool calls, and executing recovery strategies.

The engine layer will be fully analyzed in Article 02. For now, just know this:

Core data flow (simplified)
User message → QueryEngine.submitMessage()
  → query() async generator
  → API streaming call
  → tool_use detection → tool execution → result injection → continue generation
  → Final response
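The flow above can be sketched as a small async generator. This is a minimal illustration under stated assumptions, not the contents of query.ts: fakeModel is a scripted stand-in for the API call, runTool stands in for the tool layer, and the message and event types are invented for the example.

```typescript
// Minimal agent loop as an async generator. `fakeModel` and `runTool` are
// hypothetical stand-ins for the API client and the tool layer.
type Message = { role: 'user' | 'assistant' | 'tool'; content: string }
type ModelTurn = { text: string; toolUse?: { name: string; input: string } }
type LoopEvent =
  | { type: 'text'; text: string }
  | { type: 'tool_result'; name: string; result: string }

// Scripted model: asks for a tool until it sees a tool result in the history.
async function fakeModel(history: Message[]): Promise<ModelTurn> {
  const sawToolResult = history.some(m => m.role === 'tool')
  return sawToolResult
    ? { text: 'The file has 3 lines.' }
    : { text: 'Let me check.', toolUse: { name: 'count_lines', input: 'a.txt' } }
}

async function runTool(name: string, input: string): Promise<string> {
  return `${name}(${input}) = 3`
}

async function* querySketch(
  prompt: string,
  maxTurns = 5,
): AsyncGenerator<LoopEvent, void> {
  const history: Message[] = [{ role: 'user', content: prompt }]
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await fakeModel(history)
    history.push({ role: 'assistant', content: reply.text })
    yield { type: 'text', text: reply.text }

    // No tool_use in the reply means the model is done.
    if (!reply.toolUse) return

    // Execute the tool and inject the result so the next turn can see it.
    const result = await runTool(reply.toolUse.name, reply.toolUse.input)
    history.push({ role: 'tool', content: result })
    yield { type: 'tool_result', name: reply.toolUse.name, result }
  }
}

// Consumers simply iterate; the UI could render each event as it arrives.
async function collect(prompt: string): Promise<LoopEvent[]> {
  const out: LoopEvent[] = []
  for await (const event of querySketch(prompt)) out.push(event)
  return out
}

collect('How many lines are in a.txt?').then(evts => console.log(evts))
```

The generator shape is what makes streaming natural: each yield hands one event to the caller immediately, while the loop's own state (the history) stays private to the generator. The maxTurns cap is an added safety assumption so the sketch cannot loop forever.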

Service Layer: services/

The service layer provides the infrastructure capabilities that the engine and tools need:

| Service | Path | Responsibility |
| --- | --- | --- |
| API Client | services/api/ | Anthropic API calls, streaming responses, retries |
| MCP Protocol | services/mcp/ | Model Context Protocol server connection management |
| Context Compaction | services/compact/ | Conversation history compaction to prevent exceeding context windows |
| Authentication | services/oauth/ | OAuth 2.0 flow, token refresh |
| Telemetry | services/analytics/ | GrowthBook Feature Flags, user segmentation |
| LSP | services/lsp/ | Language Server Protocol integration |
| Plugins | services/plugins/ | Plugin loading and management |

Core Data Flow: The Journey of a Single Interaction

Now that we understand the layered architecture, let's trace a complete user interaction — from input to output — and see how data flows through each layer:

sequenceDiagram
    participant U as User
    participant M as main.tsx
    participant QE as QueryEngine
    participant Q as query()
    participant API as Anthropic API
    participant T as Tool Execution

    U->>M: Input message or /command
    M->>M: Decide: Slash command or regular message?

    alt Slash Command
        M->>M: commands.ts routes to the corresponding command
        M-->>U: Command execution result
    else Regular Message
        M->>QE: submitMessage(prompt)
        QE->>QE: Build ProcessUserInputContext
        QE->>QE: Compose System Prompt
        QE->>Q: Start query() async generator

        loop Streaming Loop
            Q->>API: Send messages + tool definitions
            API-->>Q: Streaming response (text + tool_use)
            Q-->>U: Stream text output

            opt Contains tool_use
                Q->>T: Execute tool (possibly in parallel)
                T-->>Q: Tool result
                Q->>Q: Inject result into message history
                Note over Q: Continue loop
            end
        end

        Q-->>QE: Final response
        QE-->>U: Conversation complete
    end

There are several key design decisions in this flow:

  1. Streaming output: The AI's text response is streamed to the terminal as it is generated — the user doesn't have to wait for the complete response
  2. Tool call loop: The LLM can invoke multiple tools in a single response; tool results are injected back into the message history, and the LLM continues generating based on the new information
  3. Parallel tool execution: Multiple non-conflicting tools can be executed in parallel (managed by StreamingToolExecutor)
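One way to picture "non-conflicting tools run in parallel" is a batching policy: consecutive read-only calls share a concurrent batch, and any mutating call runs alone. The sketch below implements exactly that assumed policy. It is not the actual StreamingToolExecutor logic, which this article does not show, and ToolCall, planBatches, and executeAll are hypothetical names.

```typescript
// Sketch of batching tool calls under an assumed policy: read-only calls may
// share a concurrent batch, while any mutating call runs in its own batch.
type ToolCall = { id: string; readOnly: boolean; run: () => Promise<string> }

function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = []
  for (const current of calls) {
    const last = batches[batches.length - 1]
    if (last !== undefined && current.readOnly && last.every(c => c.readOnly)) {
      last.push(current) // safe to run alongside the other read-only calls
    } else {
      batches.push([current]) // start a new batch
    }
  }
  return batches
}

// Batches run in order; calls inside one batch run concurrently.
async function executeAll(calls: ToolCall[]): Promise<string[]> {
  const results: string[] = []
  for (const batch of planBatches(calls)) {
    const batchResults = await Promise.all(batch.map(c => c.run()))
    results.push(...batchResults)
  }
  return results
}

const makeCall = (id: string, readOnly: boolean): ToolCall => ({
  id,
  readOnly,
  run: async () => `${id}:ok`,
})

const plan = planBatches([
  makeCall('read-a', true),
  makeCall('read-b', true),
  makeCall('write-c', false),
  makeCall('read-d', true),
])
console.log(plan.map(batch => batch.map(c => c.id)))
```

Here the two reads run together, the write runs alone, and the final read waits for the write to finish. Results still come back in the order the model requested them, because Promise.all preserves input order.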

Key Design Philosophies

Throughout the codebase, several design patterns appear repeatedly. Understanding them will help you grasp the design intent of each subsystem more quickly in subsequent articles:

1. Parallel Prefetch

Don't wait at startup — kick off I/O as early as possible:

Sequential approach (slow):
Start → Read MDM → Read Keychain → Init GrowthBook → Ready
Total ≈ 5 steps

Parallel approach (what Claude Code actually does):
Start → [Read MDM | Read Keychain | Init GrowthBook] → Ready
Total ≈ 3 steps (parallel = faster)
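The saving follows from simple arithmetic: sequential costs add up, while overlapping costs are bounded by the slowest task. Here is a back-of-the-envelope sketch. The durations are made-up numbers chosen for illustration, not measurements from Claude Code.

```typescript
// Back-of-the-envelope cost model; the durations are made-up numbers.
const startupTasksMs: Record<string, number> = {
  readMdm: 30,
  readKeychain: 50,
  initGrowthBook: 40,
}

const durations = Object.keys(startupTasksMs).map(key => startupTasksMs[key])

// Sequential: each task waits for the previous one, so the costs add up.
const sequentialMs = durations.reduce((sum, ms) => sum + ms, 0)

// Parallel: all tasks overlap, so the slowest one sets the total.
const parallelMs = Math.max(...durations)

console.log({ sequentialMs, parallelMs }) // sequentialMs is 120, parallelMs is 50
```

The more independent I/O a startup path has, the wider this gap grows, which is why the pattern pays off most in tools with many configuration sources.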

2. Lazy Loading

Heavy modules are deferred until first use:

  • OpenTelemetry (~400KB) — loaded on the first telemetry event
  • gRPC (~700KB) — loaded when gRPC transport is first needed
  • Large command modules — loaded only when the user actually executes the command

3. Compile-Time Dead Code Elimination

Code paths that aren't needed are completely removed at compile time via feature() flags. Known flags include:

| Flag | Feature It Controls |
| --- | --- |
| COORDINATOR_MODE | Multi-agent coordinator |
| KAIROS | Advanced agent capabilities |
| BRIDGE_MODE | IDE integration |
| VOICE_MODE | Voice input |
| DAEMON | Background daemon mode |
| PROACTIVE | Proactive mode (SleepTool) |
| AGENT_TRIGGERS | Remote triggers and scheduled tasks |
| BUDDY | Companion easter egg |

4. Minimalist State Management

No Redux, no MobX, no Zustand. Claude Code's global state management is built on a custom Store implementation in under 35 lines:

src/state/store.ts
TypeScript
    export function createStore<T>(
      initialState: T,
      onChange?: OnChange<T>,
    ): Store<T> {
      let state = initialState
      const listeners = new Set<Listener>()

      return {
        getState: () => state,
        setState: (updater: (prev: T) => T) => {
          const prev = state
          const next = updater(prev)
          if (Object.is(next, prev)) return
          state = next
          onChange?.({ newState: next, oldState: prev })
          for (const listener of listeners) listener()
        },
        subscribe: (listener: Listener) => {
          listeners.add(listener)
          return () => listeners.delete(listener)
        },
      }
    }

Three methods — getState, setState, subscribe — plus an Object.is() referential equality check. That's all it takes.
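A short usage sketch shows what the Object.is() check buys. To keep it self-contained, the store is restated here with assumed definitions for Listener, OnChange, and Store, since the excerpt does not show those types. The AppState shape is likewise hypothetical.

```typescript
// Assumed type shapes (the excerpt does not show them).
type Listener = () => void
type OnChange<T> = (args: { newState: T; oldState: T }) => void
type Store<T> = {
  getState: () => T
  setState: (updater: (prev: T) => T) => void
  subscribe: (listener: Listener) => () => void
}

function createStore<T>(initialState: T, onChange?: OnChange<T>): Store<T> {
  let state = initialState
  const listeners = new Set<Listener>()
  return {
    getState: () => state,
    setState: updater => {
      const prev = state
      const next = updater(prev)
      if (Object.is(next, prev)) return // same reference: nothing to do
      state = next
      onChange?.({ newState: next, oldState: prev })
      listeners.forEach(listener => listener())
    },
    subscribe: listener => {
      listeners.add(listener)
      return () => {
        listeners.delete(listener)
      }
    },
  }
}

// Usage: count how often subscribers are notified.
type AppState = { model: string; turns: number }
const store = createStore<AppState>({ model: 'sonnet', turns: 0 })

let notifications = 0
const unsubscribe = store.subscribe(() => {
  notifications += 1
})

store.setState(prev => ({ ...prev, turns: prev.turns + 1 })) // new object: notifies
store.setState(prev => prev) // same reference: skipped
unsubscribe()
store.setState(prev => ({ ...prev, turns: prev.turns + 1 })) // no listeners left

console.log(store.getState().turns, notifications)
```

The state ends at two turns with a single notification. The design relies on callers treating state as immutable: an updater that mutates prev in place and returns it would change the data without notifying anyone.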


What Else Is There?

Beyond this overview, Claude Code's codebase contains many more fascinating subsystems, each of which will be analyzed in detail in subsequent articles:

| Subsystem | Path | Summary |
| --- | --- | --- |
| Bridge | src/bridge/ | 34 files, 1MB+ of code implementing bidirectional communication between CLI and IDE |
| Coordinator | src/coordinator/ | Multi-agent orchestration using a dispatcher/worker pattern |
| Memory | src/memdir/ | File-system-based persistent memory: four types, automatic extraction |
| Skills | src/skills/ | Extensible skill system using Markdown frontmatter as configuration |
| Plugins | src/plugins/ | Two-tier registered plugin architecture |
| Ink | src/ink/ | Terminal UI rendering engine spanning 50 files |
| Vim | src/vim/ | Vim modal editing implemented as an exhaustive-type state machine |
| Buddy | src/buddy/ | Virtual companion easter egg driven by deterministic random numbers |

Next Up

Now that we've established a big-picture architectural understanding, Article 02: The Query Engine will dive deep into Claude Code's most critical engine layer — QueryEngine.ts and query.ts — tracing the complete lifecycle of a conversation from user input to final response. We'll see how async generators drive the streaming query loop, and how the engine gracefully recovers when things go wrong (context overflow, API timeouts, model refusals).