datashaman · datashaman · May 2, 2026 · May 2, 2026 · May 2, 2026 · May 2, 2026
diff --git a/skills/harness/SKILL.md b/skills/harness/SKILL.md
@@ -14,11 +14,15 @@ description: >
   of ~/.claude/ to a private git repo); status (report what's installed,
   modified, or missing); audit (prepare a monthly remote-audit routine
   that PRs deltas against the latest Anthropic releases and Claude Code
-  community patterns). All sub-actions are idempotent. Use when asked to
-  "set up my Claude Code", "install harness", "uninstall harness", "update
-  harness", "diagnose my setup", "adopt harness in this project",
-  "retrofit", "snapshot my setup", "audit my setup", "harden my Claude",
-  or any request matching the sub-actions.
+  community patterns); memoize (deterministic memory hygiene pass —
+  index sync, frontmatter, stale citations, lexical duplicates — emits
+  a stable report; pairs with a weekly /schedule routine). All
+  sub-actions are idempotent. Use when asked to "set up my Claude
+  Code", "install harness", "uninstall harness", "update harness",
+  "diagnose my setup", "adopt harness in this project", "retrofit",
+  "snapshot my setup", "audit my setup", "harden my Claude",
+  "memoize", "consolidate memory", "prune memory", or any request
+  matching the sub-actions.
 user-invocable: true
 ---
 
@@ -42,6 +46,7 @@ The user invokes this skill, optionally with an action word. Detect intent and r
 | "snapshot", "backup", "mirror to git"         | `snapshot`  | `scripts/snapshot.sh`           |
 | "status", "what's installed", "audit local"   | `status`    | `scripts/status.sh`             |
 | "audit", "schedule audit", "monthly check"    | `audit`     | (prep work — see below)         |
+| "memoize", "consolidate memory", "prune memory" | `memoize`   | `scripts/memoize.sh`            |
 
 Substitute the skill's absolute base directory for `$SKILL_DIR` in every command — it's announced at the top of this invocation.
 
@@ -239,6 +244,40 @@ Steps:
 
 The remote agent clones the snapshot repo, researches the last ~30 days of Anthropic releases and canonical Claude Code voices, and PRs `audits/YYYY-MM-DD-setup-audit.md` with prioritised deltas. It never modifies tracked files outside `audits/`.
 
+## memoize
+
+```bash
+bash "$SKILL_DIR/scripts/memoize.sh"
+```
+
+Proactive memory hygiene for `~/.claude/projects/<slug>/memory/`. Memory is reactive — entries get written when the agent notices something worth saving, but nothing prunes or consolidates. `memoize` is the deterministic maintenance pass.
+
+What it checks:
+
+1. **Index sync** — every `memory/*.md` is listed in `MEMORY.md`; every `MEMORY.md` entry points at a file that exists.
+2. **Frontmatter hygiene** — every memory has the required `name`, `description`, `type`.
+3. **Stale citations** — path-shaped tokens (anything starting with `~/`, `/Users/`, `./`, etc., or ending in a known source extension) that resolve nowhere across `~/.claude/projects/` and `~/Projects/`. Conservative on purpose — false positives cost more than misses.
+4. **Possible duplicates** — pairs of memories of the same `type` whose `name` or `description` are lexically similar (Jaccard ≥ 0.5). Flag, don't merge.
+
+Output: a single file at `<memory>/_memoize-report.md`. The leading underscore is the contract — `MEMORY.md` indexing rules and the remote routine both ignore `_*.md`, so the report itself never gets treated as a memory entry. The report is byte-stable on equal runs (two consecutive invocations produce an identical file).
+
+Flags:
+- `--dry-run` — print the plan + report preview, write nothing.
+- `--target=PATH` — explicit memory dir.
+
+Env knobs (mirror `snapshot.sh`):
+- `CLAUDE_DIR` — root of the Claude Code config dir. Search-root defaults track this, so a custom `CLAUDE_DIR` cascades correctly.
+- `USER_PROJECT_KEY` — the slug under `<CLAUDE_DIR>/projects/`.
+- `MEMOIZE_SEARCH_ROOTS` — colon-separated (PATH-style, supports paths with spaces) list of roots to resolve stale citations against. Defaults to `<CLAUDE_DIR>/projects:$HOME/Projects`.
+
+**Scheduled routine.** For the conceptual drift the lexical script can't see (semantic duplicates, outdated facts, conflicting guidance), wire a weekly `/schedule` job using `scripts/memoize-prompt.md`. Suggested config:
+- cron: `0 6 * * 0` (Sunday 06:00 UTC)
+- model: `claude-opus-4-7`
+- tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+- source: the user's snapshot repo (run `harness snapshot` first)
+
+Scope: the snapshot repo doesn't mirror harness scripts or local search roots, so the remote agent does its own in-process structural pass (index sync, frontmatter) and adds the semantic checks. Stale-citation analysis stays local-only — the search roots aren't available remotely. The remote agent PRs `audits/memory/YYYY-MM-DD.md` with proposed edits and never modifies any memory entry.
+
 ## Constraints
 
 - **Never auto-fill stack signals or user_role.** Templates have placeholders; ask the user to fill them.
@@ -267,3 +306,5 @@ The remote agent clones the snapshot repo, researches the last ~30 days of Anthr
 | `scripts/_detect_stack.py`        | Stack-signal detector — auto-fills `## Stack signals` at install time |
 | `assets/harness-check.sh.tmpl`    | Starter project pass/fail gate written by `adopt`                     |
 | `scripts/audit-prompt.md`         | Prompt template for the monthly remote-audit routine                  |
+| `scripts/memoize.sh`              | Memory consolidation pass — deterministic; emits `_memoize-report.md` |
+| `scripts/memoize-prompt.md`       | Prompt template for the weekly remote-memoize routine                 |
diff --git a/skills/harness/scripts/memoize-prompt.md b/skills/harness/scripts/memoize-prompt.md
@@ -0,0 +1,109 @@
+# Weekly memory consolidation — agent prompt
+
+Use this prompt as the `events[].data.message.content` when creating a remote
+routine that consolidates `~/.claude/projects/<slug>/memory/`. The routine
+clones the snapshot repo (e.g. `<you>/claude-setup`) and needs Bash, Read,
+Write, Edit, Glob, Grep, Agent.
+
+Suggested config:
+- cron: `0 6 * * 0` (Sunday 06:00 UTC)
+- model: `claude-opus-4-7`
+- tools: Bash, Read, Write, Edit, Glob, Grep, Agent
+- sources: the user's snapshot repo (must exist — run `harness snapshot` first)
+
+## Scope note
+
+The local `memoize.sh` script does four checks: index sync, frontmatter
+hygiene, stale citations, and lexical duplicates. Of these, **stale
+citations cannot run remotely** — the search roots (`~/.claude/projects`,
+`~/Projects`) are local-machine state that the snapshot repo does not
+mirror. The remote routine focuses on the **conceptual** drift the local
+script can't see (semantic duplicates, outdated facts, conflicts), and
+re-runs the cheap structural checks (index, frontmatter) directly against
+the snapshot.
+
+---
+
+You are doing the weekly memory hygiene pass against a snapshot of my Claude
+Code memory. The repo you are running in is a sanitised mirror of `~/.claude/`
+(see README.md). Memory lives in `memory/`, indexed by `memory/MEMORY.md`.
+The snapshot does **not** contain harness scripts or local search roots.
+
+## Your task
+
+1. **Structural checks (in-process).** For files under `memory/`, applying
+   `memory/MEMORY.md` as the index:
+   - **Index sync** — every `memory/*.md` whose filename does NOT start with
+     `_` and is NOT `MEMORY.md` itself must appear as an entry in
+     `MEMORY.md`. Every `MEMORY.md` entry must point at a real file.
+   - **Frontmatter hygiene** — every memory file must start with a `---`
+     YAML frontmatter block containing non-empty `name`, `description`,
+     and `type` fields.
+
+2. **Read every memory.** For each `memory/*.md` file (skipping `MEMORY.md`
+   and any `_*.md` such as the consolidation report itself), read it and
+   form a one-line gist. Group by `type`.
+
+3. **Conceptual checks the structural pass can't catch:**
+   - **Conceptual duplicates** that don't share vocabulary — two `feedback`
+     memories saying the same thing in different words.
+   - **Outdated facts** — `project` memories citing deadlines, milestones,
+     stakeholders that have moved on. Cross-check against the snapshot's
+     `CLAUDE.md` and other context for currency.
+   - **Conflicting guidance** — two memories that pull in opposite
+     directions.
+   - **Index drift** — entries in `MEMORY.md` whose one-line hook no
+     longer reflects the file's body.
+
+4. **Write the report** to `audits/memory/YYYY-MM-DD.md`. Structure:
+
+   ```markdown
+   # Memory consolidation — {{date}}
+
+   ## TL;DR
+   3-5 bullets — the highest-leverage merges, deletes, or rewrites.
+
+   ## Structural findings
+   Index sync + frontmatter hygiene issues. Per-file bullets.
+
+   ## Conceptual duplicates
+   Pairs of memories that say the same thing differently. Propose a merge
+   target and the surviving content.
+
+   ## Outdated facts
+   Memories whose body cites stale state. Quote the line, suggest the edit.
+
+   ## Conflicts
+   Memories pulling in opposite directions. Surface the contradiction; do
+   not resolve it unilaterally.
+
+   ## Proposed edits (ordered by leverage)
+   - **What:** one-line summary
+   - **Where:** `memory/<file>.md`
+   - **Diff:** old → new (or "merge X into Y, delete X")
+   - **Why:** the signal in the body that motivates this
+
+   ## Skip-list
+   Things that looked relevant but aren't worth doing this week.
+   ```
+
+5. **Open a PR.** Branch `memoize/YYYY-MM-DD`, commit
+   `memoize: weekly memory consolidation YYYY-MM-DD`, PR title
+   `Memory consolidation — {{date}}`, PR body = TL;DR. Use `gh pr create`.
+
+## Constraints
+
+- DO NOT modify any tracked file outside `audits/memory/`. The user reviews
+  proposed edits and applies them locally — the routine never edits memory.
+- Skip files whose names start with `_` (e.g. `_memoize-report.md` if it
+  ever lands in the snapshot) and `MEMORY.md` when iterating "every memory".
+- Stale-citation analysis is local-only; do not attempt a snapshot
+  equivalent — false positives drown the signal.
+- Conservative on outdated facts. False positives cost more than misses;
+  the user reads every line of this report.
+- Prefer specific over comprehensive. 3 merges I'll do > 30 I won't.
+- If nothing material changed since last week, write a short report saying
+  so. Don't pad.
+- Concise output — the user reads in feedforward / sensors / GC vocabulary.
+
+Report length target: 400-1000 words.