feat(memory): structured messages + opt-in dedupBySession; add greymemory-cc plugin#8
Merged
…mory-cc plugin
Library (src/):
- Message now accepts a 'tool' role, tool_calls/tool_call_id/name, and content
as string | ContentBlock[]. A normalization chokepoint in add()
(_normalizeContent / _serializeToolCalls / _normalizeMessages) flattens
everything to { role, content:string }: content blocks -> text + [image]
placeholder, assistant tool_calls -> [tool_call name=.. args=..], tool results
-> [tool result name=..]. source_role widened to Role | null.
- dedupBySession: opt-in add() option (requires sessionId). Each round's RAW text
is sha256-hashed before contextualization; rounds already ingested under the
same sessionId in this container are skipped (no chunk, no contextualization,
no extraction). New chunks.content_hash + partial idx_chunks_dedup across
_init / _migrate (rebuild + ALTER fallback) / bin/migrate.js; storage.chunkExists();
AddResult.roundsSkipped / chunksSkipped. Session-mode extraction is rebuilt from
surviving rounds with re-indexed provenance. Off by default -> byte-for-byte
back-compat for existing string / {role,content:string}[] callers.
- index.d.ts updated (Role, ContentBlock, ToolCall, Message, AddOptions, AddResult).
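The normalization chokepoint described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the library's code: the function bodies, the `{ name, arguments }` tool-call shape, the `type: 'text' | 'image'` block shape, and the JSON rendering of `args=` are all assumptions made for the example.

```javascript
// Hypothetical sketch: names echo the commit message, bodies are assumed.

// Flatten `string | ContentBlock[]` into plain text.
function normalizeContent(content) {
  if (typeof content === 'string') return content;
  if (!Array.isArray(content)) return '';
  return content
    .map((block) => {
      if (block.type === 'text') return block.text ?? '';
      if (block.type === 'image') return '[image]'; // placeholder only, never the URL
      return ''; // unknown block types are dropped in this sketch
    })
    .filter((part) => part.length > 0)
    .join('\n');
}

// Render assistant tool calls as bracketed text (assumed { name, arguments } shape).
function serializeToolCalls(toolCalls = []) {
  return toolCalls
    .map((call) => `[tool_call name=${call.name} args=${JSON.stringify(call.arguments ?? {})}]`)
    .join('\n');
}

// Reduce any accepted message to { role, content: string }.
function normalizeMessage(msg) {
  let text = normalizeContent(msg.content);
  if (msg.role === 'assistant' && msg.tool_calls?.length) {
    text = [text, serializeToolCalls(msg.tool_calls)].filter(Boolean).join('\n');
  }
  if (msg.role === 'tool') {
    text = `[tool result name=${msg.name ?? 'unknown'}] ${text}`;
  }
  return { role: msg.role, content: text };
}
```

Funnelling every input shape through one function like this is what keeps plain string callers unaffected: a string passes through unchanged, and only structured input takes the flattening path.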
Tests (test files/): test-task-3-dedup.js (dedup: skip counts, opt-in gating,
contextualRetrieval, session-mode survivor prompt, container isolation),
test-task-4-structured.js (content blocks, image-only, tool_calls, tool source_role,
plain back-compat). tsc --noEmit clean; migration paths verified on simulated old DBs.
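The dedupBySession behaviour exercised by test-task-3-dedup.js (skip counts, opt-in gating, container isolation) can be approximated with the sketch below. It is an assumption-laden illustration: `createSeenStore`, `filterRounds`, and the in-memory `Set` are hypothetical stand-ins for storage.chunkExists() and the chunks.content_hash index, and skipping a round repeated within the same call is a choice made for this example only.

```javascript
const { createHash } = require('node:crypto');

// Hash the RAW round text, before any contextualization is applied.
function hashRound(rawText) {
  return createHash('sha256').update(rawText, 'utf8').digest('hex');
}

// Hypothetical in-memory stand-in for storage.chunkExists().
function createSeenStore() {
  const seen = new Set();
  const key = (container, sessionId, hash) => JSON.stringify([container, sessionId, hash]);
  return {
    chunkExists: (container, sessionId, hash) => seen.has(key(container, sessionId, hash)),
    remember: (container, sessionId, hash) => seen.add(key(container, sessionId, hash)),
  };
}

// Return the rounds that still need ingesting, plus a skip count.
function filterRounds(rounds, { container, sessionId, dedupBySession = false }, store) {
  if (!dedupBySession) return { survivors: rounds.slice(), roundsSkipped: 0 }; // off by default
  if (!sessionId) throw new Error('dedupBySession requires sessionId');
  const survivors = [];
  let roundsSkipped = 0;
  for (const raw of rounds) {
    const hash = hashRound(raw);
    if (store.chunkExists(container, sessionId, hash)) {
      roundsSkipped += 1; // no chunk, no contextualization, no extraction
      continue;
    }
    store.remember(container, sessionId, hash);
    survivors.push(raw);
  }
  return { survivors, roundsSkipped };
}
```

Because the key includes both the container and the sessionId, the same text added under a different container or session is still ingested.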
Plugin (greymemory-cc/): Claude Code plugin that captures sessions into a
self-hosted greymemory DB (Stop hook -> structured transcript mapping ->
add({ sessionId, dedupBySession })) and injects memories (SessionStart +
UserPromptSubmit -> hookSpecificOutput.additionalContext), plus a stdio MCP server
(grey_search / grey_add / grey_profile) and slash commands. DB opened with WAL +
busy_timeout for multi-process safety. Verified live against Ollama
(mxbai-embed-large) + Anthropic (claude-haiku-4-5).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
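For the memory-injection path mentioned above (SessionStart + UserPromptSubmit -> hookSpecificOutput.additionalContext), a hook might assemble its JSON output along these lines. This is a sketch only: `buildHookOutput`, the `{ text }` memory shape, and the "Relevant memories:" wording are hypothetical and are not taken from the plugin.

```javascript
// Hypothetical helper: build the JSON object a retrieval hook would print to stdout.
function buildHookOutput(hookEventName, memories) {
  if (memories.length === 0) return null; // nothing to inject, so emit no output
  const lines = memories.map((memory, i) => `${i + 1}. ${memory.text}`);
  return {
    hookSpecificOutput: {
      hookEventName, // e.g. 'SessionStart' or 'UserPromptSubmit'
      additionalContext: `Relevant memories:\n${lines.join('\n')}`,
    },
  };
}
```

A hook script would then write `JSON.stringify(output)` to stdout when the result is non-null and exit with status 0.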
… seam, CI
- Tier 1 unit (test/plugin.test.mjs): pure-logic tests for transcript mapping, cursor watermark, and container resolution. Zero deps (node built-ins only).
- Tier 2 integration (test/integration.test.mjs): spawns the real capture-worker, retrieve hook, and MCP server with CC-shaped payloads, verifying DB writes, structured tool mapping, retrieval injection, MCP tools, and dedup, all fully offline.
- lib/memory.mjs: add GREYMEMORY_EXTRACTOR=stub and GREYMEMORY_EMBEDDER=stub providers (deterministic, no network), gated behind explicit env; defaults (anthropic / ollama) unchanged. Enables offline tests and fully-local runs.
- package.json: `npm test` (unit) + `npm run test:integration`.
- .github/workflows/ci.yml: tsc --noEmit + library feature tests + plugin unit + offline integration, on push/PR. No secrets required.
- README: Testing section (three tiers + provider knobs).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
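The idea behind the GREYMEMORY_EMBEDDER=stub provider (deterministic, no network, gated behind an explicit env value) can be illustrated as below. Only the environment variable name and the ollama default come from the commit message; `stubEmbed`, `resolveEmbedder`, the 8-dimensional vector, and the hash-based values are assumptions made for this sketch.

```javascript
const { createHash } = require('node:crypto');

// Deterministic pseudo-embedding: the same text always yields the same vector.
function stubEmbed(text, dims = 8) {
  const digest = createHash('sha256').update(text, 'utf8').digest(); // 32 bytes
  const vector = [];
  for (let i = 0; i < dims; i += 1) {
    vector.push((digest[i % digest.length] / 255) * 2 - 1); // scale each byte to [-1, 1]
  }
  return vector;
}

// Pick a provider from the environment; the stub is used only when asked for explicitly.
function resolveEmbedder(env = process.env) {
  const choice = env.GREYMEMORY_EMBEDDER ?? 'ollama';
  if (choice === 'stub') {
    return { name: 'stub', embed: async (text) => stubEmbed(text) };
  }
  // Placeholder: a network-backed provider is out of scope for this sketch.
  return {
    name: choice,
    embed: async () => {
      throw new Error(`provider "${choice}" is not implemented in this sketch`);
    },
  };
}
```

Keeping the stub behind an explicit value means a missing or mistyped variable falls back to the real provider rather than silently producing meaningless vectors.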
Pivot transcript mapping from structured messages to a clean prose round stream. Fix ide-context prompt loss (strip <ide_*> BEFORE the injected-row test, so a prompt typed with IDE context attached is no longer dropped wholesale). Move readJsonl to io.mjs so transcript.mjs is a pure entries->messages transform. Add lib/config.mjs: captureTools is user opt-in (settings.json / GREYMEMORY_CAPTURE_TOOLS), default off (conversational). Update unit + integration suites and README to match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
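The ordering fix described above (strip <ide_*> before the injected-row test) can be illustrated with a small sketch. The regular expression, the helper names, and especially `isInjectedRow` (here: "the text starts with a tag") are hypothetical simplifications, not the plugin's actual checks.

```javascript
// Matches paired tags whose name starts with "ide_", e.g. <ide_x>...</ide_x>.
const IDE_BLOCK = /<(ide_[a-z_]+)>[\s\S]*?<\/\1>/g;

function stripIdeContext(text) {
  return text.replace(IDE_BLOCK, '').trim();
}

// Hypothetical injected-row test: treat rows that begin with a tag as harness-injected.
function isInjectedRow(text) {
  return text.trimStart().startsWith('<');
}

// Strip IDE context FIRST, then decide whether the row is a real user prompt.
function userPromptFromRow(rawText) {
  const text = stripIdeContext(rawText);
  if (text.length === 0) return null; // row carried only IDE context
  if (isInjectedRow(text)) return null; // row was injected, not typed
  return text;
}
```

With the checks in the opposite order, a prompt that begins with an attached IDE block would look injected and be dropped entirely, which is the loss the commit describes.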
Merge the former greymemory-viz and greymemory-diag into a single greymemory-console (one server + one client with viz/ and diag/ surfaces). Add benchmark helper scripts (ingest-single, verify-task-1, test-task-1-1) and CP1/CP2 round/key test files; refresh CLAUDE.md/benchmark/run.js. (Pre-existing working-tree changes, committed as-is per request; greymemory-cc work is in the preceding commit.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Implements the two greymemory library features needed to back a Claude Code memory plugin, and adds that plugin (greymemory-cc/). Both library features are additive and opt-in: existing string / {role,content:string}[] callers behave byte-for-byte as before.
Library, Feature 2: structured messages (src/memory.js, src/index.d.ts)
- Message now accepts a tool role, tool_calls / tool_call_id / name, and content as string | ContentBlock[].
- A normalization chokepoint in add() (_normalizeContent / _serializeToolCalls / _normalizeMessages) flattens everything to { role, content:string }: content blocks → text + [image] placeholder (never the URL), assistant tool_calls → [tool_call name=.. args=..], tool results → [tool result name=..].
- source_role widened to Role | null.
Library, Feature 1: dedupBySession (src/memory.js, src/storage.js, bin/migrate.js, src/index.d.ts)
- Opt-in via add(input, { sessionId, dedupBySession: true }). Each round's raw text is sha256-hashed before contextualization; rounds already ingested under the same sessionId + container are skipped (no chunk, no contextualization, no extraction).
- New chunks.content_hash + partial idx_chunks_dedup, kept in lockstep across _init, _migrate (rebuild + ALTER fallback), and bin/migrate.js; new storage.chunkExists(); AddResult.roundsSkipped / chunksSkipped.
Plugin: greymemory-cc/
A Claude Code plugin that captures sessions into a self-hosted greymemory DB (Stop hook → structured transcript mapping → add({ sessionId, dedupBySession })) and injects memories (SessionStart + UserPromptSubmit → hookSpecificOutput.additionalContext), plus a stdio MCP server (grey_search / grey_add / grey_profile) and slash commands. The DB is opened with journal_mode=WAL + busy_timeout for multi-process safety. Capture runs in a detached worker so the Stop hook never blocks the turn; a UUID-watermark cursor + dedupBySession make re-reading a growing transcript idempotent.
Testing
- tsc --noEmit clean.
- test files/test-task-3-dedup.js (skip counts, opt-in gating, contextualRetrieval, session-mode survivor prompt, container isolation) and test files/test-task-4-structured.js (content blocks, image-only, tool_calls in prompt, tool source_role, plain back-compat).
- … Message[] | string).
Verified live
Ran the plugin end-to-end against Ollama (mxbai-embed-large) + Anthropic (claude-haiku-4-5): capture produced 4 round-chunks + 8 correctly-typed facts (with [tool result name=Bash] / [tool_call name=Bash] mapping), retrieval injected ranked context, the MCP server answered tools/list / grey_search / grey_profile over stdio, and dedupBySession skipped all rounds on re-add (chunks unchanged).
🤖 Generated with Claude Code
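The UUID-watermark cursor mentioned in the plugin description can be sketched as follows: remember the uuid of the last transcript entry already captured, and on the next run process only what follows it. The helper names, the `{ uuid }` entry shape, and the fall-back to re-reading everything when the watermark is missing are assumptions for illustration.

```javascript
// Return the entries that come after the watermark uuid.
function entriesAfterCursor(entries, cursorUuid) {
  if (!cursorUuid) return entries.slice(); // first run: everything is new
  const index = entries.findIndex((entry) => entry.uuid === cursorUuid);
  // Watermark not found (e.g. transcript rewritten): re-read everything and
  // rely on dedupBySession to skip rounds that were already ingested.
  return index === -1 ? entries.slice() : entries.slice(index + 1);
}

// Compute the new entries and the cursor value to persist for the next run.
function advanceCursor(entries, cursorUuid) {
  const fresh = entriesAfterCursor(entries, cursorUuid);
  const cursor = fresh.length > 0 ? fresh[fresh.length - 1].uuid : cursorUuid;
  return { fresh, cursor };
}
```

The cursor keeps repeated runs cheap, while the hash-based dedup covers the cases where the cursor is lost or stale, so running capture twice over the same transcript adds nothing new.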