Persistent Memory: How AI Remembers You Across Sessions

A deep dive into Claude Code's memory system — MEMORY.md indexing, four memory types, automatic extraction, memory injection and deduplication, and team memory sync

The Problem

Every time you start a new conversation, the AI starts from scratch. It doesn't know who you are, what tech stack your project uses, or what behaviors you corrected last time. You have to repeatedly tell it "don't use mocks in tests," "I'm a backend engineer, don't give me CSS 101," "please submit PRs to the develop branch." This isn't a conversation — it's retraining an amnesiac assistant from scratch every time.

Claude Code's memory system (internally codenamed memdir, short for memory directory) solves this problem at its root. It maintains a structured set of persistent memories on the filesystem, allowing the AI to load your preferences, project context, and historical feedback at the start of every new session. Going further, it can automatically extract content worth remembering during conversations, without you having to manually say "please remember this."

This article takes a deep dive into the source code of src/memdir/ and src/services/extractMemories/, dissecting the design and implementation of this system layer by layer.

memdir Filesystem Design: A Two-Level Structure

Claude Code's memories aren't stored in a database or serialized into some JSON blob. It uses a filesystem-as-database design — each memory is an independent Markdown file, linked together by an index file called MEMORY.md.

Directory Layout

Text
~/.claude/projects/<sanitized-project-root>/memory/
  MEMORY.md                    # Index file, one pointer per line
  user_role.md                 # Individual memory file
  feedback_testing.md          # Individual memory file
  project_auth_rewrite.md      # Individual memory file
  reference_linear_project.md  # Individual memory file
  team/                        # Team shared memories (feature flag controlled)
    MEMORY.md
    feedback_no_mocks.md
    project_merge_freeze.md

This path is computed by getAutoMemPath() in src/memdir/paths.ts:

TypeScript
// src/memdir/paths.ts, lines 223-235
export const getAutoMemPath = memoize(
  (): string => {
    const override = getAutoMemPathOverride() ?? getAutoMemPathSetting()
    if (override) {
      return override
    }
    const projectsDir = join(getMemoryBaseDir(), 'projects')
    return (
      join(projectsDir, sanitizePath(getAutoMemBase()), AUTO_MEM_DIRNAME) + sep
    ).normalize('NFC')
  },
  () => getProjectRoot(),
)

The path resolution priority chain is clear:

  1. CLAUDE_COWORK_MEMORY_PATH_OVERRIDE environment variable — full path override for Cowork scenarios
  2. autoMemoryDirectory in settings.json — user-level configuration (supports ~/ expansion)
  3. Default path ~/.claude/projects/<sanitized-git-root>/memory/

Note that findCanonicalGitRoot is used here to ensure all worktrees of the same repository share a single memory directory — a subtle but important design decision.
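The priority chain can be sketched as a plain function. Everything here is illustrative: `resolveAutoMemPath` is a made-up name, and the env-var and settings plumbing is simplified relative to the real `getAutoMemPath`:

```typescript
import { join } from 'node:path'
import { homedir } from 'node:os'

// Illustrative resolution order: env override > settings value > default path.
function resolveAutoMemPath(opts: {
  envOverride?: string // CLAUDE_COWORK_MEMORY_PATH_OVERRIDE
  settingsDir?: string // autoMemoryDirectory from settings.json
  sanitizedGitRoot: string // canonical git root, sanitized into a dirname
}): string {
  if (opts.envOverride) return opts.envOverride
  if (opts.settingsDir) {
    // settings.json values support ~/ expansion
    return opts.settingsDir.replace(/^~(?=\/|$)/, homedir())
  }
  return join(homedir(), '.claude', 'projects', opts.sanitizedGitRoot, 'memory')
}
```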

MEMORY.md: An Index, Not Content

MEMORY.md is a plain text index file where each line is a link pointing to a specific memory file. The format requirements are strict:

MARKDOWN
- [User Role](user_role.md) — Backend engineer, proficient in Go, React beginner
- [Testing Strategy Feedback](feedback_testing.md) — Don't use mocks in integration tests
- [Auth Rewrite Project](project_auth_rewrite.md) — Compliance-driven, not tech debt
- [Linear Project Tracking](reference_linear_project.md) — Pipeline bugs in INGEST project

Each line is no more than ~150 characters, containing only a title and a one-line hook description. Never write memory content directly in MEMORY.md — content goes in individual files.

The motivation behind this two-level structure is practical: MEMORY.md is loaded entirely into the context at the start of every session, so it must stay lean. If all memory content were crammed in here, the context window would be exhausted quickly.
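To make the pointer format concrete, here is a hypothetical parser for those index lines (the actual loader is not shown in the excerpts):

```typescript
// Hypothetical parser for the `- [Title](file.md) — hook` index-line format.
type IndexEntry = { title: string; file: string; hook: string }

function parseIndexLine(line: string): IndexEntry | null {
  const m = line.match(/^- \[([^\]]+)\]\(([^)]+)\)(?:\s*—\s*(.*))?$/)
  if (!m) return null
  return { title: m[1], file: m[2], hook: m[3] ?? '' }
}
```

Each entry carries just enough (title, file pointer, one-line hook) for the model to decide whether the full file is worth reading.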

...

MEMORY.md Constraints: 200 Lines / 25KB

The index file cannot grow indefinitely. Two hard constraints are defined in src/memdir/memdir.ts:

TypeScript
// src/memdir/memdir.ts, lines 34-38
export const ENTRYPOINT_NAME = 'MEMORY.md'
export const MAX_ENTRYPOINT_LINES = 200
// ~125 chars/line at 200 lines. At p97 today; catches long-line indexes that
// slip past the line cap (p100 observed: 197KB under 200 lines).
export const MAX_ENTRYPOINT_BYTES = 25_000

200 lines is the line count limit, and 25KB is the byte limit. These are independent constraints — even if the line count is under 200, if some lines are exceptionally long causing the total bytes to exceed 25KB, truncation is triggered. The comment on the byte limit specifically explains why: someone wrote an index file under 200 lines that totaled 197KB because each line was extremely long.

The truncation logic is implemented by truncateEntrypointContent():

TypeScript
// src/memdir/memdir.ts, lines 57-103
export function truncateEntrypointContent(raw: string): EntrypointTruncation {
  const trimmed = raw.trim()
  const contentLines = trimmed.split('\n')
  const lineCount = contentLines.length
  const byteCount = trimmed.length

  const wasLineTruncated = lineCount > MAX_ENTRYPOINT_LINES
  const wasByteTruncated = byteCount > MAX_ENTRYPOINT_BYTES

  if (!wasLineTruncated && !wasByteTruncated) {
    return {
      content: trimmed,
      lineCount,
      byteCount,
      wasLineTruncated,
      wasByteTruncated,
    }
  }

  let truncated = wasLineTruncated
    ? contentLines.slice(0, MAX_ENTRYPOINT_LINES).join('\n')
    : trimmed

  if (truncated.length > MAX_ENTRYPOINT_BYTES) {
    const cutAt = truncated.lastIndexOf('\n', MAX_ENTRYPOINT_BYTES)
    truncated = truncated.slice(0, cutAt > 0 ? cutAt : MAX_ENTRYPOINT_BYTES)
  }

  // ...construct WARNING message
  return {
    content:
      truncated +
      `\n\n> WARNING: ${ENTRYPOINT_NAME} is ${reason}. Only part of it was loaded...`,
    // ...
  }
}

The truncation strategy is deliberate: first truncate by lines (natural boundaries), then if bytes still exceed the limit, find the last newline before the limit and cut there, avoiding splitting a line in the middle. After truncation, a WARNING is appended at the end, telling the model the index was truncated and should be kept concise.
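The byte-limit cut can be isolated into a standalone sketch (it inherits the same simplification as the source, treating string length as the byte count):

```typescript
// Standalone version of the cut used above: prefer the last newline at or
// before the limit so no line is split mid-way; fall back to a hard cut
// only when there is no earlier newline.
function cutAtLineBoundary(text: string, maxLen: number): string {
  if (text.length <= maxLen) return text
  const cutAt = text.lastIndexOf('\n', maxLen)
  return text.slice(0, cutAt > 0 ? cutAt : maxLen)
}
```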

Four Memory Types

Memories aren't an undifferentiated pile of text. Claude Code defines a closed four-type taxonomy, where each type has clear guidelines for when to save, how to use, and what content structure to follow:

TypeScript
// src/memdir/memoryTypes.ts, lines 14-19
export const MEMORY_TYPES = [
  'user',
  'feedback',
  'project',
  'reference',
] as const

export type MemoryType = (typeof MEMORY_TYPES)[number]

user: Understanding the User

Records the user's role, goals, responsibilities, and knowledge background. The core purpose is to let the AI adjust its behavior based on a user profile — the collaboration style for a senior backend engineer and a programming beginner should be fundamentally different.

When to save: When you learn about the user's role, preferences, responsibilities, or knowledge domain.

Use case: When work needs to be adapted based on the user's profile. For example, if a user asks you to explain a piece of code, you should choose the depth and angle of explanation based on their background.

Example:

Text
User: I've written Go for ten years, but this is my first time touching React code in this repo
AI: [Save user memory: Deep Go experience, React beginner — use backend analogies to explain frontend concepts]

feedback: Behavioral Guidance

Records user feedback on the AI's working methods — including both corrections and affirmations. This is particularly important: if only corrections are recorded, the AI becomes increasingly conservative, afraid to repeat approaches the user has actually approved.

When to save: When the user corrects your approach ("don't do that") or confirms a non-obvious approach works ("exactly, keep going").

Content structure: Write the rule itself first, then a Why: line (the reason the user gave), followed by a How to apply: line (in what scenarios this rule applies). Knowing "why" enables correct judgment in edge cases.

Example:

Text
User: Don't mock the database in these tests — last quarter the mock tests all passed but the production migration broke
AI: [Save feedback memory: Integration tests must use real databases. Why: Mock/prod divergence caused migration failure. How to apply: All test files involving databases]

project: Project Dynamics

Records information about ongoing work, goals, bugs, and incidents — things that cannot be derived from code or git history.

When to save: When you learn who is doing what, why, and what the deadline is. Note that relative dates should be converted to absolute dates ("next Thursday" -> "2026-04-02") so the memory remains understandable as time passes.

Example:

Text
User: After Thursday, freeze all non-critical merges — the mobile team needs to cut a release branch
AI: [Save project memory: Merge freeze starting 2026-04-02. Why: Mobile release branch cut]
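The relative-to-absolute conversion is done by the model at write time; mechanically it amounts to something like this illustrative helper (not from the source):

```typescript
// Resolve "next <weekday>" to an absolute ISO date, strictly in the future.
// weekday: 0 = Sunday ... 6 = Saturday (all arithmetic in UTC).
function nextWeekday(from: Date, weekday: number): string {
  const d = new Date(from)
  // 0 would mean "today", so map it to a full week ahead.
  const delta = ((weekday - d.getUTCDay() + 7) % 7) || 7
  d.setUTCDate(d.getUTCDate() + delta)
  return d.toISOString().slice(0, 10)
}
```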

reference: External Resource Pointers

Stores pointers to information locations in external systems — letting the AI know where to find the latest information.

Example:

Text
User: Pipeline bugs are all tracked in the "INGEST" project on Linear
AI: [Save reference memory: Pipeline bugs in Linear project "INGEST"]

Type Parsing

Type information is validated through a parser function that gracefully handles legacy files and unknown types:

TypeScript
// src/memdir/memoryTypes.ts, lines 28-31
export function parseMemoryType(raw: unknown): MemoryType | undefined {
  if (typeof raw !== 'string') return undefined
  return MEMORY_TYPES.find(t => t === raw)
}

Invalid or missing types return undefined — old files won't crash, and new files with incorrect types simply degrade gracefully.
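In use (the parser restated from the source so the snippet is self-contained):

```typescript
const MEMORY_TYPES = ['user', 'feedback', 'project', 'reference'] as const
type MemoryType = (typeof MEMORY_TYPES)[number]

function parseMemoryType(raw: unknown): MemoryType | undefined {
  if (typeof raw !== 'string') return undefined
  return MEMORY_TYPES.find(t => t === raw)
}

parseMemoryType('feedback') // 'feedback' — narrowed to MemoryType
parseMemoryType('notes')    // undefined — unknown type degrades
parseMemoryType(42)         // undefined — non-string degrades
```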

Cannot be derived from code (worth saving): user, feedback, project, reference
Can be derived from code (do not save): code patterns, architecture design, git history, CLAUDE.md

What Not to Save

Code patterns, project structure, architecture design, git history, debugging solutions, content already in CLAUDE.md, and temporary task state — all of these are "derivable from the current project state" and should not be stored as memories. Even if a user explicitly asks to save a PR list or activity summary, you should follow up with "what here is surprising or non-obvious?" — only that part is worth saving.

Frontmatter Metadata Format

Each individual memory file uses standard YAML frontmatter:

MARKDOWN
---
name: {{memory name}}
description: {{one-line description — used to determine relevance in future conversations, so be specific}}
type: {{user, feedback, project, reference}}
---

{{memory content — for feedback/project types, recommend structuring as: rule/fact + **Why:** + **How to apply:**}}

The description field is particularly critical — it's not just a human-readable note, but the core basis used by the memory retrieval system (findRelevantMemories) to determine whether a memory is relevant to the current query. A good description should be specific enough to distinguish context, such as "Don't use database mocks in tests — lesson from compliance migration failure" rather than "testing-related feedback."

The frontmatter format example is defined in memoryTypes.ts:

TypeScript
// src/memdir/memoryTypes.ts, lines 261-271
export const MEMORY_FRONTMATTER_EXAMPLE: readonly string[] = [
  '```markdown',
  '---',
  'name: {{memory name}}',
  'description: {{one-line description — used to decide relevance...}}',
  `type: {{${MEMORY_TYPES.join(', ')}}}`,
  '---',
  '',
  '{{memory content — for feedback/project types, structure as: ...}}',
  '```',
]

Memory Scanning and Directory Management

memoryScan: Scanning Memory Files

src/memdir/memoryScan.ts provides directory scanning primitives shared by both the retrieval and extraction paths:

TypeScript
// src/memdir/memoryScan.ts, lines 13-19
export type MemoryHeader = {
  filename: string
  filePath: string
  mtimeMs: number
  description: string | null
  type: MemoryType | undefined
}

scanMemoryFiles() recursively scans all .md files in the directory (excluding MEMORY.md), reads the first 30 lines of frontmatter from each file, then sorts by modification time in descending order, returning at most 200 entries:

TypeScript
// src/memdir/memoryScan.ts, lines 35-77
export async function scanMemoryFiles(
  memoryDir: string,
  signal: AbortSignal,
): Promise<MemoryHeader[]> {
  try {
    const entries = await readdir(memoryDir, { recursive: true })
    const mdFiles = entries.filter(
      f => f.endsWith('.md') && basename(f) !== 'MEMORY.md',
    )

    const headerResults = await Promise.allSettled(
      mdFiles.map(async (relativePath): Promise<MemoryHeader> => {
        const filePath = join(memoryDir, relativePath)
        const { content, mtimeMs } = await readFileInRange(
          filePath, 0, FRONTMATTER_MAX_LINES, undefined, signal,
        )
        const { frontmatter } = parseFrontmatter(content, filePath)
        return {
          filename: relativePath,
          filePath,
          mtimeMs,
          description: frontmatter.description || null,
          type: parseMemoryType(frontmatter.type),
        }
      }),
    )

    return headerResults
      .filter((r): r is PromiseFulfilledResult<MemoryHeader> =>
        r.status === 'fulfilled')
      .map(r => r.value)
      .sort((a, b) => b.mtimeMs - a.mtimeMs)
      .slice(0, MAX_MEMORY_FILES)
  } catch {
    return []
  }
}

A design highlight: readFileInRange reads only the first 30 lines of each file rather than the entire thing, and readFileInRange internally returns mtimeMs, eliminating the need for a separate stat call — in the common case (N <= 200), this cuts the number of system calls in half.
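The excerpt doesn't show readFileInRange itself. Under the assumption that it works roughly like the sketch below (one open file handle serving both the head read and the fstat), the saving is easy to see:

```typescript
import { open } from 'node:fs/promises'

// Sketch of the combined read: a single open handle yields both the leading
// bytes and mtimeMs (fstat on the handle), instead of a separate stat() call
// resolving the path a second time. Names and shape are assumptions, not the
// real readFileInRange.
async function readHeadWithMtime(path: string, maxBytes = 4096) {
  const fh = await open(path, 'r')
  try {
    const { mtimeMs } = await fh.stat()
    const buf = Buffer.alloc(maxBytes)
    const { bytesRead } = await fh.read(buf, 0, maxBytes, 0)
    return { content: buf.toString('utf8', 0, bytesRead), mtimeMs }
  } finally {
    await fh.close()
  }
}
```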

Scan results can also be formatted as a text manifest for use in retrieval and extraction prompts:

TypeScript
// src/memdir/memoryScan.ts, lines 84-94
export function formatMemoryManifest(memories: MemoryHeader[]): string {
  return memories
    .map(m => {
      const tag = m.type ? `[${m.type}] ` : ''
      const ts = new Date(m.mtimeMs).toISOString()
      return m.description
        ? `- ${tag}${m.filename} (${ts}): ${m.description}`
        : `- ${tag}${m.filename} (${ts})`
    })
    .join('\n')
}

ensureMemoryDirExists: Directory Guarantee

Called only once per session (cached via systemPromptSection), this ensures the memory directory exists so the model doesn't need to run mkdir or check for directory existence when writing files:

TypeScript
// src/memdir/memdir.ts, lines 129-147
export async function ensureMemoryDirExists(memoryDir: string): Promise<void> {
  const fs = getFsImplementation()
  try {
    await fs.mkdir(memoryDir)
  } catch (e) {
    const code =
      e instanceof Error && 'code' in e && typeof e.code === 'string'
        ? e.code
        : undefined
    logForDebugging(
      `ensureMemoryDirExists failed for ${memoryDir}: ${code ?? String(e)}`,
      { level: 'debug' },
    )
  }
}

The prompt even explicitly tells the model "the directory already exists — write to it directly with the Write tool, don't run mkdir or check for existence":

TypeScript
// src/memdir/memdir.ts, lines 116-119
export const DIR_EXISTS_GUIDANCE =
  'This directory already exists — write to it directly with the Write tool ' +
  '(do not run mkdir or check for its existence).'

A comment explains why this is necessary: "Claude used to spend several turns running ls and mkdir -p before writing files."

Automatic Memory Extraction: extractMemories

This is the most sophisticated part of the memory system. Claude Code doesn't require you to manually say "remember this" — it has a background agent that automatically analyzes conversation content at the end of each exchange, extracting memories worth persisting.

Trigger Timing

The extraction agent runs at the end of each complete query cycle (when the model produces a final response with no more tool calls), triggered via handleStopHooks:

TypeScript
// src/services/extractMemories/extractMemories.ts, lines 598-603
export async function executeExtractMemories(
  context: REPLHookContext,
  appendSystemMessage?: AppendSystemMessageFn,
): Promise<void> {
  await extractor?.(context, appendSystemMessage)
}

Mutual Exclusion with the Main Agent

A key design is the mutual exclusion between the extraction agent and the main agent: if the main agent has already written memory files during the conversation, the extraction agent skips that range and only advances the cursor:

TypeScript
// src/services/extractMemories/extractMemories.ts, lines 121-148
function hasMemoryWritesSince(
  messages: Message[],
  sinceUuid: string | undefined,
): boolean {
  let foundStart = sinceUuid === undefined
  for (const message of messages) {
    if (!foundStart) {
      if (message.uuid === sinceUuid) {
        foundStart = true
      }
      continue
    }
    if (message.type !== 'assistant') {
      continue
    }
    const content = (message as AssistantMessage).message.content
    if (!Array.isArray(content)) {
      continue
    }
    for (const block of content) {
      const filePath = getWrittenFilePath(block)
      if (filePath !== undefined && isAutoMemPath(filePath)) {
        return true
      }
    }
  }
  return false
}

This mutual exclusion prevents duplicate writes — memories written by the main agent won't be written again by the background agent.

Forked Agent Mode

The extraction agent runs using runForkedAgent — a "perfect fork" of the main conversation that shares the parent's prompt cache. This means the extraction agent doesn't need to resend the entire conversation history, dramatically reducing token costs:

TypeScript
// src/services/extractMemories/extractMemories.ts, lines 415-427
const result = await runForkedAgent({
  promptMessages: [createUserMessage({ content: userPrompt })],
  cacheSafeParams,
  canUseTool,
  querySource: 'extract_memories',
  forkLabel: 'extract_memories',
  skipTranscript: true,
  maxTurns: 5,
})

Note the hard limit of maxTurns: 5 — this prevents the extraction agent from falling into a "verification rabbit hole" (e.g., reading source code to confirm whether a certain pattern actually exists).

Tool Permission Sandbox

The extraction agent has strict tool permission restrictions, defined by createAutoMemCanUseTool:

  • Allowed: FileRead, Grep, Glob (read-only)
  • Allowed: Read-only Bash commands (ls, find, cat, stat, etc.)
  • Allowed: FileEdit, FileWrite — but only for paths within the memory directory
  • Denied: All other tools (MCP, Agent, write-capable Bash, etc.)
TypeScript
// src/services/extractMemories/extractMemories.ts, lines 171-222
export function createAutoMemCanUseTool(memoryDir: string): CanUseToolFn {
  return async (tool: Tool, input: Record<string, unknown>) => {
    // Allow Read/Grep/Glob
    if (tool.name === FILE_READ_TOOL_NAME ||
        tool.name === GREP_TOOL_NAME ||
        tool.name === GLOB_TOOL_NAME) {
      return { behavior: 'allow' as const, updatedInput: input }
    }

    // Bash only allows read-only commands
    if (tool.name === BASH_TOOL_NAME) {
      const parsed = tool.inputSchema.safeParse(input)
      if (parsed.success && tool.isReadOnly(parsed.data)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
      return denyAutoMemTool(tool, 'Only read-only shell commands...')
    }

    // Write/Edit only allowed for paths inside the memory directory
    if ((tool.name === FILE_EDIT_TOOL_NAME ||
         tool.name === FILE_WRITE_TOOL_NAME) &&
        'file_path' in input) {
      const filePath = input.file_path
      if (typeof filePath === 'string' && isAutoMemPath(filePath)) {
        return { behavior: 'allow' as const, updatedInput: input }
      }
    }

    return denyAutoMemTool(tool, `only ... are allowed`)
  }
}

Extraction Prompt Design

The prompt received by the extraction agent is built by src/services/extractMemories/prompts.ts. It includes the complete type taxonomy, saving rules, and a key optimization — pre-injecting the existing memory manifest:

TypeScript
// src/services/extractMemories/prompts.ts, lines 29-44
function opener(newMessageCount: number, existingMemories: string): string {
  const manifest =
    existingMemories.length > 0
      ? `\n\n## Existing memory files\n\n${existingMemories}\n\n` +
        `Check this list before writing — update an existing file ` +
        `rather than creating a duplicate.`
      : ''
  return [
    `You are now acting as the memory extraction subagent. ` +
      `Analyze the most recent ~${newMessageCount} messages above...`,
    '',
    `Available tools: FileRead, Grep, Glob, read-only Bash, ` +
      `and FileEdit/FileWrite for paths inside the memory directory only.`,
    '',
    `You have a limited turn budget. FileEdit requires a prior FileRead, ` +
      `so the efficient strategy is: turn 1 — issue all FileRead calls in ` +
      `parallel; turn 2 — issue all FileWrite/FileEdit calls in parallel.`,
    // ...
  ].join('\n')
}

The extraction prompt also includes a strict constraint: "You may only use content from the most recent ~N messages to update memories. Do not spend turns investigating or verifying this content — don't grep source code, don't read code to confirm patterns, don't run git commands."

Concurrency Control and Message Coalescing

The extraction system has sophisticated concurrency control. When an extraction is already in progress, incoming requests are stashed and a "trailing extraction" runs after the current one completes:

sequenceDiagram
    participant U as User Query
    participant M as Main Agent
    participant E as Extraction Agent
    participant FS as Filesystem

    U->>M: Send request
    M->>M: Process and respond
    M-->>E: handleStopHooks triggers extraction

    Note over E: Check hasMemoryWritesSince
    alt Main Agent already wrote memories
        E->>E: Skip, advance cursor
    else Main Agent did not write memories
        E->>E: runForkedAgent starts
        E->>FS: scanMemoryFiles pre-scan
        E->>E: Analyze most recent N messages
        E->>FS: Write memory files in parallel
        E->>M: appendSystemMessage notification
    end

    U->>M: New request arrives (extraction still running)
    M-->>E: stash context
    Note over E: After current extraction completes
    E->>E: trailing extraction processes stash

Extraction Frequency Throttling

Extraction doesn't run on every turn — the interval is controlled via the feature flag tengu_bramble_lintel (default: every 1 eligible turn):

TypeScript
// src/services/extractMemories/extractMemories.ts, lines 377-385
if (!isTrailingRun) {
  turnsSinceLastExtraction++
  if (
    turnsSinceLastExtraction <
    (getFeatureValue_CACHED_MAY_BE_STALE('tengu_bramble_lintel', null) ?? 1)
  ) {
    return
  }
}
turnsSinceLastExtraction = 0
Memory Injection Timing

Memories are loaded into the conversation context through two paths:

Path One: System Prompt Injection (MEMORY.md Index)

loadMemoryPrompt() is called during system prompt construction, injecting the content of MEMORY.md (after truncation processing) into the system prompt. This is the first layer of memory loading at session startup:

TypeScript
// src/memdir/memdir.ts, lines 419-507
export async function loadMemoryPrompt(): Promise<string | null> {
  const autoEnabled = isAutoMemoryEnabled()

  // KAIROS log mode takes priority
  if (feature('KAIROS') && autoEnabled && getKairosActive()) {
    return buildAssistantDailyLogPrompt(skipIndex)
  }

  // TEAMMEM mode: load both private and team memories
  if (feature('TEAMMEM')) {
    if (teamMemPaths!.isTeamMemoryEnabled()) {
      const autoDir = getAutoMemPath()
      const teamDir = teamMemPaths!.getTeamMemPath()
      await ensureMemoryDirExists(teamDir)
      return teamMemPrompts!.buildCombinedMemoryPrompt(extraGuidelines, skipIndex)
    }
  }

  // Standard mode: load personal memories only
  if (autoEnabled) {
    const autoDir = getAutoMemPath()
    await ensureMemoryDirExists(autoDir)
    return buildMemoryLines('auto memory', autoDir, extraGuidelines, skipIndex)
      .join('\n')
  }

  return null
}

Path Two: Relevant Memory Prefetch (Individual Memory Files)

The MEMORY.md index is always loaded, but individual memory file contents are not all loaded — that would waste context. Instead, the system selectively prefetches the most relevant memories based on the user's current query.

This process is driven by startRelevantMemoryPrefetch():

TypeScript
// src/utils/attachments.ts, lines 2361-2424
export function startRelevantMemoryPrefetch(
  messages: ReadonlyArray<Message>,
  toolUseContext: ToolUseContext,
): MemoryPrefetch | undefined {
  if (!isAutoMemoryEnabled() || !getFeatureValue_CACHED_MAY_BE_STALE(...)) {
    return undefined
  }

  const lastUserMessage = messages.findLast(m => m.type === 'user' && !m.isMeta)
  if (!lastUserMessage) {
    return undefined
  }

  const input = getUserMessageText(lastUserMessage)
  // Single-word queries lack sufficient context
  if (!input || !/\s/.test(input.trim())) {
    return undefined
  }

  const surfaced = collectSurfacedMemories(messages)
  if (surfaced.totalBytes >= RELEVANT_MEMORIES_CONFIG.MAX_SESSION_BYTES) {
    return undefined
  }

  // Async prefetch, non-blocking to the main query
  const promise = getRelevantMemoryAttachments(
    input,
    toolUseContext.options.agentDefinitions.activeAgents,
    toolUseContext.readFileState,
    collectRecentSuccessfulTools(messages, lastUserMessage),
    controller.signal,
    surfaced.paths,
  )
  // ...
}

Key design decisions in the prefetch:

  1. Non-blocking: The prefetch is asynchronous, never blocking the main query loop
  2. Cancellable: Linked to a turn-level AbortController, so the user can cancel immediately by pressing Escape
  3. Disposable pattern: Uses the using keyword binding, automatically cleaning up on all exit paths of the query loop (return, throw, .return())
  4. Session-level byte cap: Prevents unlimited memory injection in long sessions
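The cancellable, non-blocking shape can be sketched like this. The names are illustrative, and the real code binds cleanup to a `using` declaration via Symbol.dispose; a plain dispose() method is shown here for portability:

```typescript
// Sketch of a prefetch handle: starts immediately, never rejects into the
// main loop, and aborts its work when disposed (in the real code, disposal
// fires automatically on every exit path of a `using` block).
function startPrefetch<T>(run: (signal: AbortSignal) => Promise<T[]>) {
  const controller = new AbortController()
  // Errors resolve to [] so the main query loop never has to handle them.
  const promise = run(controller.signal).catch(() => [] as T[])
  return {
    promise,
    dispose() {
      controller.abort() // e.g. the user pressed Escape, or the turn ended
    },
  }
}
```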

findRelevantMemories: AI-Driven Memory Retrieval

Memory file selection doesn't rely on keyword matching — it uses a Sonnet model via sideQuery to determine which memories are most relevant to the current query:

TypeScript
// src/memdir/findRelevantMemories.ts, lines 39-75
export async function findRelevantMemories(
  query: string,
  memoryDir: string,
  signal: AbortSignal,
  recentTools: readonly string[] = [],
  alreadySurfaced: ReadonlySet<string> = new Set(),
): Promise<RelevantMemory[]> {
  const memories = (await scanMemoryFiles(memoryDir, signal)).filter(
    m => !alreadySurfaced.has(m.filePath),
  )
  if (memories.length === 0) {
    return []
  }

  const selectedFilenames = await selectRelevantMemories(
    query, memories, signal, recentTools,
  )
  // ...
  return selected.map(m => ({ path: m.filePath, mtimeMs: m.mtimeMs }))
}

The selector's system prompt is precise:

TypeScript
// src/memdir/findRelevantMemories.ts, lines 18-24
const SELECT_MEMORIES_SYSTEM_PROMPT = `You are selecting memories that will be
useful to Claude Code as it processes a user's query. You will be given the
user's query and a list of available memory files with their filenames and
descriptions.

Return a list of filenames for the memories that will clearly be useful
(up to 5). Only include memories that you are certain will be helpful...`

The selector also receives a list of recently used tools, so it can exclude reference docs for tools the session is already using (those are noise) while still surfacing warnings and known issues about those same tools (those are exactly what's needed mid-use).

Memory Deduplication

There is a deduplication step during memory injection — preventing memories the model has already read from being injected again. This is implemented by filterDuplicateMemoryAttachments():

TypeScript
// src/utils/attachments.ts, lines 2520-2541
export function filterDuplicateMemoryAttachments(
  attachments: Attachment[],
  readFileState: FileStateCache,
): Attachment[] {
  return attachments
    .map(attachment => {
      if (attachment.type !== 'relevant_memories') return attachment
      const filtered = attachment.memories.filter(
        m => !readFileState.has(m.path),
      )
      for (const m of filtered) {
        readFileState.set(m.path, {
          content: m.content,
          timestamp: m.mtimeMs,
          offset: undefined,
          limit: m.limit,
        })
      }
      return filtered.length > 0 ? { ...attachment, memories: filtered } : null
    })
    .filter((a): a is Attachment => a !== null)
}

A source code comment specifically mentions a subtle bug fix here:

The mark-after-filter ordering is load-bearing: readMemoriesForSurfacing used to write to readFileState during the prefetch, which meant the filter saw every prefetch-selected path as "already in context" and dropped them all (self-referential filter).

The previous implementation wrote to readFileState during the prefetch phase, so when the filter checked, it found all prefetched memories were "already in context" — it filtered out itself. The fix was to defer the write until after filtering.
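The fixed ordering boils down to a two-step invariant, distilled here into a minimal standalone form:

```typescript
// Filter against the cache first, then mark. The reverse order makes the
// filter see its own marks and drop every prefetched memory (the
// self-referential filter bug).
function surfaceNew(paths: string[], seen: Set<string>): string[] {
  const fresh = paths.filter(p => !seen.has(p)) // 1) filter
  for (const p of fresh) seen.add(p)            // 2) then mark
  return fresh
}
```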

Memory Expiration and Update Strategy

Time Awareness

src/memdir/memoryAge.ts provides human-readable time annotations:

TypeScript
// src/memdir/memoryAge.ts, lines 6-19
export function memoryAgeDays(mtimeMs: number): number {
  return Math.max(0, Math.floor((Date.now() - mtimeMs) / 86_400_000))
}

export function memoryAge(mtimeMs: number): string {
  const d = memoryAgeDays(mtimeMs)
  if (d === 0) return 'today'
  if (d === 1) return 'yesterday'
  return `${d} days ago`
}

Why convert timestamps to "47 days ago" instead of ISO format? Because models perform poorly at date arithmetic — seeing 2026-02-12T08:33:00Z doesn't automatically trigger the realization "this is from a long time ago," but seeing "47 days ago" immediately triggers staleness reasoning.

Staleness Warning Injection

Memories older than 1 day are annotated with a staleness warning:

TypeScript
// src/memdir/memoryAge.ts, lines 33-42
export function memoryFreshnessText(mtimeMs: number): string {
  const d = memoryAgeDays(mtimeMs)
  if (d <= 1) return ''
  return (
    `This memory is ${d} days old. ` +
    `Memories are point-in-time observations, not live state — ` +
    `claims about code behavior or file:line citations may be outdated. ` +
    `Verify against current code before asserting as fact.`
  )
}

The motivation for this warning came from user reports: stale code state memories (containing file:line references) were being asserted as facts, and the references made stale claims appear more authoritative rather than less reliable.

Verify Before Asserting

The TRUSTING_RECALL_SECTION in the system prompt requires the model to verify before recommending based on memories:

TypeScript
// src/memdir/memoryTypes.ts, lines 240-256
export const TRUSTING_RECALL_SECTION: readonly string[] = [
  '## Before recommending from memory',
  '',
  'A memory that names a specific function, file, or flag is a claim that ' +
    'it existed *when the memory was written*. It may have been renamed, ' +
    'removed, or never merged. Before recommending it:',
  '',
  '- If the memory names a file path: check the file exists.',
  '- If the memory names a function or flag: grep for it.',
  '- If the user is about to act on your recommendation: verify first.',
  '',
  '"The memory says X exists" is not the same as "X exists now."',
]

A comment documents the eval validation results: when this text was renamed from "Trusting what you recall" to "Before recommending from memory," the eval went from 0/3 to 3/3 — title wording affected the model's behavioral triggers.

Team Memory Sync

When the TEAMMEM feature flag is enabled, the memory system expands to a dual-directory structure:

Text
~/.claude/projects/<project>/memory/
  MEMORY.md              # Private index
  user_role.md           # Private memory
  feedback_terse.md      # Private memory
  team/                  # Team shared
    MEMORY.md            # Team index
    feedback_no_mocks.md # Team memory
    project_freeze.md    # Team memory

Team Path

The team memory directory is a subdirectory of the personal memory directory:

TypeScript
// src/memdir/teamMemPaths.ts, lines 84-86
export function getTeamMemPath(): string {
  return (join(getAutoMemPath(), 'team') + sep).normalize('NFC')
}

Dual-Directory Prompt

When team memory is enabled, the prompt includes instructions for both directories, with each memory type annotated with a <scope> tag to guide placement:

  • user type: Always private (your personal profile shouldn't be shared)
  • feedback type: Private by default, unless it's clearly a project-level convention (e.g., testing strategy)
  • project type: Leans toward team
  • reference type: Usually team
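Condensed into code, the defaults would look roughly like this — a sketch only; the `defaultScope` helper is hypothetical, since the real placement decision is made by the model from the `<scope>` guidance in the prompt:

```typescript
type MemoryType = 'user' | 'feedback' | 'project' | 'reference'
type MemoryScope = 'private' | 'team'

// Hypothetical mapping of the default placement described above.
function defaultScope(t: MemoryType): MemoryScope {
  switch (t) {
    case 'user':
      return 'private' // personal profile is never shared
    case 'feedback':
      return 'private' // unless clearly a project-level convention
    case 'project':
    case 'reference':
      return 'team'
  }
}
```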

Sync Mechanism

src/services/teamMemorySync/ implements the full sync mechanism:

Mermaid
sequenceDiagram
    participant L as Local Filesystem
    participant W as Watcher
    participant S as Server API

    Note over W: Session startup
    W->>S: GET /api/claude_code/team_memory
    S-->>W: TeamMemoryData (entries + checksums)
    W->>L: Write local files (server wins)

    Note over W: Start fs.watch (recursive)

    L->>W: File change event
    W->>W: debounce 2s
    W->>L: Read changed files
    W->>S: PUT /api/claude_code/team_memory (delta upload)

    Note over W: 412 Precondition Failed
    W->>S: GET ?view=hashes (lightweight probe)
    W->>W: Recompute delta
    W->>S: Retry PUT

Sync semantics:

  • Pull: Server content overwrites local files by key (server wins)
  • Push: Only uploads keys whose content hashes differ from the server (delta upload). The server uses upsert semantics — keys not present in the PUT are preserved
  • Deletes don't propagate: Deleting a file locally won't delete it from the server; it will be restored on the next pull
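The push-side delta can be sketched as a hash comparison against the server's view; the entry shapes and the `contentHash` algorithm here are assumptions for illustration, not the actual implementation:

```typescript
import { createHash } from 'node:crypto'

// Hypothetical content hash; the real algorithm is unknown.
function contentHash(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex')
}

// Upload only keys whose content differs from the server's view.
// Locally missing keys are simply not sent — upsert semantics on the
// server preserve them, which is why local deletes don't propagate.
function computeDelta(
  local: Map<string, string>, // key -> file content
  serverHashes: Map<string, string>, // key -> content hash
): Map<string, string> {
  const delta = new Map<string, string>()
  for (const [key, content] of local) {
    if (serverHashes.get(key) !== contentHash(content)) {
      delta.set(key, content)
    }
  }
  return delta
}
```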

File Monitoring

The watcher uses fs.watch({ recursive: true }):

TypeScript
// src/services/teamMemorySync/watcher.ts, lines 167-228
async function startFileWatcher(teamDir: string): Promise<void> {
  // ...
  watcher = watch(
    teamDir,
    { persistent: true, recursive: true },
    (_eventType, filename) => {
      if (pushSuppressedReason !== null) {
        // Only unlink can clear suppression
        void stat(join(teamDir, filename)).catch((err) => {
          if (err.code !== 'ENOENT') return
          pushSuppressedReason = null
          schedulePush()
        })
        return
      }
      schedulePush()
    },
  )
}

Why not use chokidar? A comment explains: chokidar 4+ removed fsevents support, and Bun's fs.watch fallback uses kqueue, where each watched file holds a file descriptor — with 500+ team memory files, that's 500+ permanently held descriptors. By contrast, fs.watch with recursive: true uses FSEvents on macOS (O(1) descriptors) and inotify on Linux (O(number of subdirectories)).

Security Safeguards

Team memory involves cross-user sharing, so security is critical. teamMemPaths.ts implements multiple layers of path safety checks:

  1. Path injection protection: sanitizePathKey() checks for null bytes, URL-encoded traversal (%2e%2e%2f), Unicode normalization attacks, backslashes, and absolute paths
  2. Symlink protection: realpathDeepestExisting() resolves symlinks to real paths, preventing escape out of the team directory via symlinks
  3. Dangling symlink detection: Uses lstat to distinguish "truly doesn't exist" from "symlink target doesn't exist"
  4. Secret scanning: scanForSecrets() uses gitleaks rules to detect API keys, credentials, and other sensitive data, blocking pushes

Permanent Failure Suppression

When a push fails for an unrecoverable reason (no OAuth, 404, 413, etc.), the watcher suppresses subsequent retries, avoiding infinite retry loops. There was a case where a device without OAuth generated 167,000 push events over 2.5 days.

TypeScript
// src/services/teamMemorySync/watcher.ts, lines 61-73
export function isPermanentFailure(r: TeamMemorySyncPushResult): boolean {
  if (r.errorType === 'no_oauth' || r.errorType === 'no_repo') return true
  if (
    r.httpStatus !== undefined &&
    r.httpStatus >= 400 &&
    r.httpStatus < 500 &&
    r.httpStatus !== 409 && // 409 is a transient conflict
    r.httpStatus !== 429 // 429 is rate limiting
  ) {
    return true
  }
  return false
}

Complete Memory System Lifecycle

Write Path
User conversation → Main Agent writes, or extractMemories auto extraction → memory/ directory

Read Path
Session startup → loadMemoryPrompt loads MEMORY.md
User query → startRelevantMemoryPrefetch → findRelevantMemories (Sonnet sideQuery) → filterDuplicateMemoryAttachments → inject into context

Sync Path
team memory watcher → pushTeamMemory (delta upload) → Server API

Portable Patterns

The memory system's design has an implicit but important property: portability.

Because all memories are standard Markdown files, stored at deterministic paths on the filesystem, with a uniform frontmatter format:

  1. Cross-device migration: Copy the ~/.claude/projects/ directory to migrate all memories
  2. Version control: The memory directory can be put under git management (though this isn't done by default)
  3. Backup and restore: Standard filesystem backup tools work out of the box
  4. Bulk editing: Any text editor can directly modify memories
  5. Programmatic operations: Scripts can directly read and write files in the frontmatter format
  6. Cross-tool compatibility: Other tools can read and understand this format

No proprietary database format, no encrypted blobs, no storage that requires a specific API to access. This is a deliberate design choice — it trades some query efficiency (compared to SQLite) for transparency and operability.
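As a sketch of point 5, a minimal reader for the memory file format — the `---`-delimited `key: value` frontmatter layout is assumed from the article's description:

```typescript
// Minimal sketch: split a memory file into frontmatter fields and body.
// Assumes simple `key: value` lines between `---` delimiters.
function parseMemoryFile(text: string): { meta: Record<string, string>; body: string } {
  const m = text.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/)
  if (!m) return { meta: {}, body: text }
  const meta: Record<string, string> = {}
  for (const line of m[1].split('\n')) {
    const i = line.indexOf(':')
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim()
  }
  return { meta, body: m[2] }
}

// Hypothetical memory file contents for illustration.
const sample = [
  '---',
  'type: feedback',
  'description: prefers terse replies',
  '---',
  'User prefers terse replies; skip preamble.',
  '',
].join('\n')
const parsed = parseMemoryFile(sample)
```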

Path Safety and Configurability

The configurability of the path system is also worth noting. validateMemoryPath() in paths.ts performs strict security validation on paths:

TypeScript
// src/memdir/paths.ts, lines 109-150
function validateMemoryPath(
  raw: string | undefined,
  expandTilde: boolean,
): string | undefined {
  if (!raw) return undefined
  let candidate = raw
  if (expandTilde && (candidate.startsWith('~/') || candidate.startsWith('~\\'))) {
    const rest = candidate.slice(2)
    const restNorm = normalize(rest || '.')
    if (restNorm === '.' || restNorm === '..') {
      return undefined // Reject expansion to $HOME or its parent
    }
    candidate = join(homedir(), rest)
  }
  const normalized = normalize(candidate).replace(/[/\\]+$/, '')
  if (
    !isAbsolute(normalized) ||
    normalized.length < 3 ||
    /^[A-Za-z]:$/.test(normalized) || // Windows root drive
    normalized.startsWith('\\\\') || // UNC paths
    normalized.startsWith('//') ||
    normalized.includes('\0') // null bytes
  ) {
    return undefined
  }
  return (normalized + sep).normalize('NFC')
}

A security comment specifically notes: projectSettings (committed to the repo in .claude/settings.json) is deliberately excluded — a malicious repository could set autoMemoryDirectory: "~/.ssh" to gain write access to a sensitive directory. Only policySettings, localSettings, and userSettings from trusted sources are accepted.

Special Mode: KAIROS Assistant Logs

When feature('KAIROS') is enabled and running in assistant mode, the memory system switches to a log mode. Assistant sessions are effectively long-running, so instead of maintaining a MEMORY.md index, the agent writes to date-named log files in append mode:

Text
~/.claude/projects/<project>/memory/logs/2026/03/2026-03-31.md

Each log entry is a brief timestamped bullet point. The MEMORY.md index is generated by a separate /dream skill that distills from the logs overnight.

The motivation for this mode: in a long-running session, maintaining an index in real time costs too much, and logs are naturally time-ordered, so they need no index for organization. Distillation can then run during low-load periods and do the deeper organizing.
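The date-based layout can be sketched as a small path helper (`logPathFor` is hypothetical, reconstructing only the layout shown above):

```typescript
// Hypothetical helper producing the logs/<year>/<month>/<date>.md layout.
function logPathFor(memoryDir: string, date: Date): string {
  const y = String(date.getUTCFullYear())
  const m = String(date.getUTCMonth() + 1).padStart(2, '0')
  const d = String(date.getUTCDate()).padStart(2, '0')
  return `${memoryDir}/logs/${y}/${m}/${y}-${m}-${d}.md`
}
```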

Disabling Memory

The memory system can be disabled at multiple levels:

TypeScript
// src/memdir/paths.ts, lines 30-55
export function isAutoMemoryEnabled(): boolean {
  const envVal = process.env.CLAUDE_CODE_DISABLE_AUTO_MEMORY
  if (isEnvTruthy(envVal)) return false // Environment variable disables
  if (isEnvDefinedFalsy(envVal)) return true // Environment variable explicitly enables
  if (isEnvTruthy(process.env.CLAUDE_CODE_SIMPLE)) return false // --bare mode
  if (isEnvTruthy(process.env.CLAUDE_CODE_REMOTE) &&
      !process.env.CLAUDE_CODE_REMOTE_MEMORY_DIR) return false // Remote without storage
  const settings = getInitialSettings()
  if (settings.autoMemoryEnabled !== undefined) {
    return settings.autoMemoryEnabled
  }
  return true // Enabled by default
}

Priority chain: Environment variable > --bare mode > Remote mode detection > settings.json > Enabled by default.

Design Insights

Claude Code's memory system has several design decisions worth reflecting on:

Filesystem as database. No SQLite, LevelDB, or any embedded database — just the filesystem directly. This may seem "primitive," but it delivers debuggability (just cat to inspect), portability (just copy the directory), and operability (any editor can modify it). For a system where memory entries typically don't exceed 200 and each file is no more than a few KB, filesystem performance is more than sufficient.

Closed type taxonomy. Only four types, each with clear guidance on "what to save and what not to save." This prevents the model's tendency to save everything as a memory — particularly the rule "content derivable from code should not be saved as memory" effectively prevents memory bloat.

AI-driven retrieval. Memory retrieval doesn't rely on keyword matching or vector search, but instead directly lets another AI (Sonnet) look at frontmatter descriptions to judge relevance. This is very effective when the memory count is small (< 200) — each memory's description is semantically rich natural language, and the AI can make more accurate judgments than keyword matching.

Eval-driven prompt iteration. Code comments repeatedly reference eval results to explain specific wording choices — for example, changing a section title from "Trusting what you recall" to "Before recommending from memory" improved the eval from 0/3 to 3/3. This demonstrates that the memory system's behavior is largely determined by prompt engineering, and prompt wording choices require quantitative validation.

Mutually exclusive write paths. The mutual exclusion between the main agent and the extraction agent avoids duplicate memories, but it also means that if the main agent writes one memory during a conversation, the extraction agent treats the whole range as already processed and won't backfill anything else the main agent missed. This is an intentional trade-off: between redundancy and omission, it chose omission.

How does this system perform in practice? Judging from the telemetry events and eval references scattered throughout the source code, it has gone through extensive experimentation and iteration. Memory is not just an engineering problem but a product design problem — what to remember, what not to remember, when to surface, and when to stay silent. These decisions directly impact user experience. Claude Code's memory system provides a battle-tested answer.