feat(memory): hygiene tiers + generated MEMORY.md index #90

Closed
shuff57 wants to merge 7 commits into Doorman11991:master from shuff57:feat/memory-hygiene

shuff57 commented Jun 7, 2026

What

smallcode's memory retrieval is already strong (FTS5 + BM25, staleness decay, type boosts, dedup). What it lacks is lifecycle and visibility: pruneStale() exists but nothing calls it, entries accumulate forever, and there is no way to see what the agent knows. This PR adds a hygiene layer on top of the existing store; the store remains the single source of truth.

  1. Tiers: memory objects gain tier: hot|archive + last_used_at (backfilled on first run). Retrieval touches last_used_at so actively-used entries never age out.
  2. Hygiene sweeps (src/memory/hygiene.js): hot + unused >60d → archive; archive >90d → deleted (matches existing pruneStale semantics); hot count >20 → oldest 5 archived per pass.
  3. Auto-run: hygiene fires silently (try-catch, never blocks) at the existing session-save points. Manual: /memory hygiene.
  4. Generated .smallcode/MEMORY.md (/memory index, also written by every hygiene run): human-readable, git-diffable view grouped by type, hot before archive. Generated artifact — never authoritative.
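
The sweep rules in item 2 can be sketched as a pure planning function. This is a minimal illustration, not the code in src/memory/hygiene.js: `planHygiene`, the `created_at` field, and the option names are hypothetical, and measuring archive age from `last_used_at` is an assumption the description does not confirm.

```javascript
// Hypothetical sketch of the sweep rules; names and age basis are assumptions.
const DAY_MS = 24 * 60 * 60 * 1000;

function planHygiene(entries, now = Date.now(), opts = {}) {
  const { hotMaxAgeDays = 60, archiveMaxAgeDays = 90, hotCap = 20, capBatch = 5 } = opts;
  const plan = { archive: [], remove: [] };
  const stillHot = [];

  for (const e of entries) {
    // Legacy entries have no tier or last_used_at yet: treat them as hot and
    // backfill from created_at (assumed field), falling back to "now".
    const tier = e.tier ?? "hot";
    const lastUsed = e.last_used_at ?? e.created_at ?? now;
    const ageDays = (now - lastUsed) / DAY_MS;

    if (tier === "archive") {
      if (ageDays > archiveMaxAgeDays) plan.remove.push(e.id);
    } else if (ageDays > hotMaxAgeDays) {
      plan.archive.push(e.id);
    } else {
      stillHot.push({ id: e.id, lastUsed });
    }
  }

  // Cap rule: when more than hotCap entries remain hot, archive the oldest batch.
  if (stillHot.length > hotCap) {
    stillHot.sort((a, b) => a.lastUsed - b.lastUsed);
    for (const e of stillHot.slice(0, capBatch)) plan.archive.push(e.id);
  }
  return plan;
}
```

Returning a plan instead of mutating the store keeps the rules testable without either backend; applying the plan is left to whatever adapter the store provides.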

Store adapter works against both backends: update() on the budget-aware store, mutate+save on the JSON fallback. No node_modules changes.
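
One way to picture the two-backend adapter is a thin wrapper that picks a write strategy by feature detection. This is a sketch under assumptions: `makeStoreAdapter`, `setFields`, and the `entries` array are hypothetical, and the `update(id, fields)` signature is assumed rather than taken from budget-aware-mcp.

```javascript
// Hypothetical adapter sketch; only update() and "mutate + save" come from the PR text.
function makeStoreAdapter(store) {
  if (typeof store.update === "function") {
    // Budget-aware store: delegate to its update() (signature assumed).
    return {
      setFields: (id, fields) => store.update(id, fields),
    };
  }
  // JSON fallback: mutate the in-memory entry, then persist the whole store.
  return {
    setFields: (id, fields) => {
      const entry = store.entries.find((e) => e.id === id);
      if (!entry) return false;
      Object.assign(entry, fields);
      store.save();
      return true;
    },
  };
}
```

Callers such as a hygiene pass would then write `adapter.setFields(id, { tier: "archive" })` without knowing which backend is active.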

Tests

11 new (test/memory_hygiene.test.js): age/cap sweeps, backfill, recent-use survival, index rendering, empty-store no-op, fallback round-trip. Full suite green.

Field-verified: remember → /clear → recall round-trip; MEMORY.md regenerated with correct grouping.

Stacked on #89 — new commits here: f8598c8 + the last_used_at retrieval touch.

🤖 Generated with Claude Code

shuff57 and others added 7 commits June 5, 2026 12:33
Skills following the Claude Code layout (<skill-dir>/<name>/SKILL.md)
or written as plain .md without YAML frontmatter were silently skipped
in the standard skill dirs (.smallcode/skills, ~/.smallcode/skills,
~/.config/smallcode/skills). Both shapes now load; README-style files
(README/CHANGELOG/LICENSE/CONTRIBUTING) are filtered by name.
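
The two accepted layouts and the name filter could be expressed roughly as follows. This is an illustrative sketch only: `skillNameFromPath` is a hypothetical helper, paths are assumed to be relative to a skill dir with `/` separators, and the case-insensitive name filter is an assumption.

```javascript
// Hypothetical sketch: map a path relative to a skill dir to a skill name, or null.
const IGNORED_BASENAMES = new Set(["README", "CHANGELOG", "LICENSE", "CONTRIBUTING"]);

function skillNameFromPath(relPath) {
  const parts = relPath.split("/");
  const file = parts[parts.length - 1];
  if (!file.toLowerCase().endsWith(".md")) return null;

  // Claude Code layout: <skill-dir>/<name>/SKILL.md, named after the folder.
  if (parts.length === 2 && file === "SKILL.md") return parts[0];

  // Plain .md file directly in the skill dir, unless it is a README-style file.
  const base = file.slice(0, -3);
  if (parts.length === 1 && !IGNORED_BASENAMES.has(base.toUpperCase())) return base;

  return null;
}
```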

Fixes Doorman11991#81

Constraint: no warning channel exists in SkillManager, so silent skips had no user-visible signal
Rejected: warn-on-skip only | users following Claude Code conventions expect these layouts to work
Confidence: high
Scope-risk: narrow
Not-tested: fullscreen TUI /skill list rendering (logic shared with classic mode)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Create-mode evolver: deterministic friction extraction from saved
traces (repeated near-duplicate prompts, consecutive tool-retry
loops), LLM judgment routed to the strong tier, and ONE quarantined
skill draft per run written to .smallcode/skills/drafts/.

Drafts never auto-load; /evolve promote <name> moves them live.
Validation gates every write (name format, no frontmatter injection,
trigger rules); name collisions across live+draft+global dirs abort;
every create appends to .smallcode/evolver-audit.jsonl. The per-run
cap is structural — EvolverRun raises on a second create.
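
A structural per-run cap of this kind might look like the sketch below. Only the class name EvolverRun and the "raises on a second create" behavior come from this message; the validation regex, the frontmatter-delimiter check, and the `writeDraft` callback are hypothetical stand-ins.

```javascript
// Hypothetical sketch of validate-or-abort plus a one-create-per-run cap.
const NAME_RE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/; // assumed name format

function validateDraft(draft) {
  if (!NAME_RE.test(draft.name)) throw new Error(`invalid skill name: ${draft.name}`);
  // Assumed injection guard: a bare "---" line in the body could open new frontmatter.
  if (/^---\s*$/m.test(draft.body)) throw new Error("body must not contain a frontmatter delimiter");
}

class EvolverRun {
  #created = false;

  create(draft, writeDraft) {
    if (this.#created) throw new Error("only one skill draft may be created per run");
    validateDraft(draft); // throws before the slot is consumed
    this.#created = true;
    writeDraft(draft);
  }
}
```

Because the flag lives on the run object, the cap cannot be bypassed by calling code that forgets to check a counter; a second draft requires a second run.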

Constraint: small models produce noisy judgments, so all fuzzy output passes validate-or-abort before any write
Rejected: plugin delivery | needs TraceRecorder + SkillManager internals unreachable from plugin dirs under binary installs
Confidence: high
Scope-risk: narrow
Directive: keep mechanics LLM-free — judgment stays in the command handler so mechanics remain unit-testable
Not-tested: strong-tier routing with a separately configured SMALLCODE_MODEL_STRONG endpoint

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Field regression: rephrased prompts with filler drift (another/please/new) failed to cluster because stopwords diluted the Jaccard similarity below the threshold. Real prompts from a live session are pinned as a test.
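
The dilution effect is easy to reproduce with a small Jaccard helper. The sketch below is illustrative: the stopword list and the two prompts are made up for this example and are not the pinned session prompts, and the function names are hypothetical.

```javascript
// Hypothetical sketch: Jaccard similarity over word sets, with optional stopword removal.
const STOPWORDS = new Set(["a", "an", "the", "please", "another", "new", "to", "for", "of", "me"]);

function tokens(prompt, dropStopwords) {
  const words = prompt.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(dropStopwords ? words.filter((w) => !STOPWORDS.has(w)) : words);
}

function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  return shared / (a.size + b.size - shared);
}

const p1 = "add a unit test for parser";
const p2 = "please add another new unit test for the parser";

const raw = jaccard(tokens(p1, false), tokens(p2, false));      // 5 shared / 10 total = 0.5
const filtered = jaccard(tokens(p1, true), tokens(p2, true));   // identical sets = 1
```

With filler words counted, the two prompts share only half of their vocabulary; once stopwords are dropped they reduce to the same token set, so any reasonable threshold clusters them.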
SkillManager now reads only frontmatter on startup (_index Map) and
loads bodies on demand via _loadBody(), cached in skills Map. This
cuts per-turn skill injection from ~60k chars (all bodies) to ~240
chars (compact index) for a typical 30-skill install.

New surface: getIndex() flat list, formatSkillIndex/formatSkillResult
in skill_index_formatter.js, use_skill tool (executor + tools.js).
getSkillContext() injects the index always; auto-matched bodies append
after, subject to the existing 4000-char cap.
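
The index-first, body-on-demand shape described above can be sketched as follows. This is not SkillManager itself: `LazySkills` and the injected `readBody` callback are hypothetical, and both the index line format and the choice to skip (not truncate) bodies that exceed the remaining budget are assumptions.

```javascript
// Hypothetical sketch of lazy skill loading with a compact always-on index.
class LazySkills {
  constructor(frontmatters, readBody) {
    // frontmatters: [{ name, description }] parsed at startup; no bodies are read here.
    this._index = new Map(frontmatters.map((f) => [f.name, f]));
    this.skills = new Map(); // body cache, filled on demand
    this._readBody = readBody;
  }

  getIndex() {
    return [...this._index.values()];
  }

  _loadBody(name) {
    if (!this._index.has(name)) return null;
    if (!this.skills.has(name)) this.skills.set(name, this._readBody(name));
    return this.skills.get(name);
  }

  getSkillContext(autoMatched = [], bodyCap = 4000) {
    // The compact index is always injected.
    let context = this.getIndex().map((s) => `- ${s.name}: ${s.description}`).join("\n");
    // Auto-matched bodies follow, within the character budget (assumed to cover bodies only).
    let budget = bodyCap;
    for (const name of autoMatched) {
      const body = this._loadBody(name);
      if (!body || body.length > budget) continue;
      context += `\n\n${body}`;
      budget -= body.length;
    }
    return context;
  }
}
```

The per-turn cost then scales with the number of index lines plus whichever bodies were matched, instead of with the total size of every installed skill.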

Public API (get/list/getAutoSkills/formatForPrompt/add/remove/
promoteDraft/listDrafts) is unchanged — all 335 tests pass.

Rejected: inject all bodies always | O(skills) context cost per turn
Constraint: existing tests must pass unmodified
Confidence: high
Scope-risk: moderate
Not-tested: live use_skill call by real model (requires interactive session)

use_skill was defined in TOOLS but absent from both routers' category whitelists, so the model never saw it in routed mode. The skill index is injected every turn, so the tool rides along in every tool-bearing category (~80 tokens).

Memory objects gain tier (hot|archive) and last_used_at fields
(backward-compat: backfilled on first hygiene run). runHygiene()
sweeps: hot+unused>60d→archive, archive>90d→forget, hot>20→archive
oldest 5. Adapter layer handles both SQLite budget-aware-mcp (via
update()) and fallback MemoryStore (mutate+save) without touching
node_modules.

Auto-runs silently (try-catch) at 3 session-save points. /memory
hygiene and /memory index subcommands added to commands.js. Generated
.smallcode/MEMORY.md is human-readable + git-diffable; never authoritative.
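
A renderer for the generated index might look like the sketch below. The exact layout of MEMORY.md is not shown in this PR, so the headings, the bullet format, the `type` and `content` field names, and the `renderMemoryIndex` function are all assumptions; only "grouped by type, hot before archive" comes from the description.

```javascript
// Hypothetical sketch: render memory entries as a Markdown index string.
function renderMemoryIndex(entries) {
  const byType = new Map();
  for (const e of entries) {
    const type = e.type ?? "untyped";
    if (!byType.has(type)) byType.set(type, []);
    byType.get(type).push(e);
  }

  const out = ["# MEMORY", "", "Generated file; the memory store is authoritative.", ""];
  const rank = (e) => ((e.tier ?? "hot") === "hot" ? 0 : 1);

  // Sorted type headings keep diffs stable between runs.
  for (const type of [...byType.keys()].sort()) {
    out.push(`## ${type}`, "");
    // Array.prototype.sort is stable, so original order is kept within each tier.
    const sorted = [...byType.get(type)].sort((a, b) => rank(a) - rank(b));
    for (const e of sorted) out.push(`- [${e.tier ?? "hot"}] ${e.content}`);
    out.push("");
  }
  return out.join("\n");
}
```

Returning a string leaves the file write to the caller, which keeps the renderer deterministic and easy to snapshot-test.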

Rejected: markdown-tier replacement | loses FTS5/BM25
Rejected: hybrid two-source write | inconsistency risk
Constraint: do not modify node_modules/budget-aware-mcp
Confidence: high
Scope-risk: narrow
Not-tested: budget-aware-mcp setMeta path (no setMeta exists; update() used instead)

Without the last_used_at retrieval touch, actively retrieved old entries age out of the hot tier at 60d; hygiene tier sweeps need a real usage signal. Try-catch wrapped; a failed touch never breaks retrieval.
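
The "touch on retrieval, never break retrieval" pattern can be sketched as a small wrapper. `retrieveWithTouch`, `search`, and `touch` are hypothetical names for this illustration and do not reflect the actual store API.

```javascript
// Hypothetical sketch: update last_used_at for each hit, swallowing touch failures.
function retrieveWithTouch(search, touch, query, now = Date.now()) {
  const results = search(query);
  for (const r of results) {
    try {
      touch(r.id, now); // e.g. persist { last_used_at: now } for this entry
    } catch {
      // A failed touch only loses a usage signal; the results are still returned.
    }
  }
  return results;
}
```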
shuff57 closed this Jun 14, 2026
shuff57 deleted the feat/memory-hygiene branch June 14, 2026 19:35