diff --git a/README.md b/README.md index a531752..5029cb0 100644 --- a/README.md +++ b/README.md @@ -104,19 +104,19 @@ Trello board health audit. Queries five dimensions of card data to identify thro /audit-trello board="My Board" since=2024-07-01 ``` -### `/compile-task` +### `/cscript` Compile one-off task descriptions into reusable, self-contained executables so the LLM doesn't redo deterministic work each time. Picks bash for POSIX shell ops, PowerShell for Windows-native work, or single-file `uv` Python (PEP 723 inline deps) for anything needing libraries. Scripts are registered in an appdata-backed catalogue and dispatched through a `cscript` binary the skill installs to a user-writable directory on `PATH`. From the second invocation onward the agent finds the script via `cscript which`, confirms with you, and runs it directly — no regeneration. -Ships with a `cscript` dispatcher (`list`, `which`, `run`, `show`, `edit`, `rm`, `state-dir`, `where`, `register`), a Windows `cscript.cmd` wrapper, and a smoke-test script. Verified on macOS and Linux; Windows support is implemented (subprocess dispatch + `cscript.cmd` wrapper + PowerShell language option) and awaits validation by a Windows user. +Ships with a `cscript` dispatcher (`list`, `which`, `run`, `show`, `edit`, `rm`, `state-dir`, `where`, `version`, `mine`, `register`), a Windows `cscript.cmd` wrapper, and a smoke-test script. `cscript mine` ranks repeated catalogue misses from a local `which` invocation log so you can spot tasks worth compiling without paying per-prompt hook latency. Verified on macOS and Linux; Windows support is implemented (subprocess dispatch + `cscript.cmd` wrapper + PowerShell language option) and awaits validation by a Windows user. **Arguments:** None — describe the task in natural language and the skill decides whether to compile it, match an existing script, or skip. **Usage:** ``` -/compile-task rename JPGs in this directory by the EXIF date they were shot -/compile-task pull all comments from GitHub PR https://github.com/foo/bar/pull/42 as markdown -/compile-task strip EXIF from every image under a folder +/cscript rename JPGs in this directory by the EXIF date they were shot +/cscript pull all comments from GitHub PR https://github.com/foo/bar/pull/42 as markdown +/cscript strip EXIF from every image under a folder ``` ### `/workflow-advisor` diff --git a/skills/compile-task/README.md b/skills/cscript/README.md similarity index 70% rename from skills/compile-task/README.md rename to skills/cscript/README.md index 83f5007..1fd96fa 100644 --- a/skills/compile-task/README.md +++ b/skills/cscript/README.md @@ -1,4 +1,4 @@ -# compile-task +# cscript Compile one-off LLM tasks into reusable, self-contained executables so you stop paying tokens to redo deterministic work. @@ -29,8 +29,23 @@ cscript edit # open in $EDITOR cscript rm # archive and unregister cscript state-dir # print per-script state dir cscript where # print the data directory +cscript version # print the dispatcher version +cscript mine # surface repeated `which` misses worth compiling ``` +## Surfacing candidates from your history + +Per-prompt hooks are expensive (latency on every turn) and can only catch already-registered scripts — they can never propose new ones. Instead, `cscript` keeps a small local log of every `which` query and lets you mine it on demand: + +``` +cscript mine # tasks you've asked for 2+ times that the catalogue never matched +cscript mine --min 1 # every miss +``` + +The signal is direct: you (or an agent on your behalf) ran `cscript which ""`, the catalogue returned nothing, and that happened more than once. Those are exactly the tasks worth compiling next. + +A smarter miner that reads Claude Code session transcripts and shell history is planned but deliberately deferred until the invocation log has enough real data to validate the algorithm. + ## Design choices - **Self-contained scripts.** No project-local imports, no relative paths. A script written for one repo can run from anywhere. @@ -46,9 +61,9 @@ It won't compile one-off explorations, judgement-laden tasks (refactoring, PR de ## Usage ``` -/compile-task rename JPGs in this directory by the EXIF date they were shot -/compile-task pull all comments from GitHub PR https://github.com/foo/bar/pull/42 as markdown -/compile-task strip EXIF from every image under a folder +/cscript rename JPGs in this directory by the EXIF date they were shot +/cscript pull all comments from GitHub PR https://github.com/foo/bar/pull/42 as markdown +/cscript strip EXIF from every image under a folder ``` Or just describe a task and let the skill decide whether to compile it. diff --git a/skills/compile-task/SKILL.md b/skills/cscript/SKILL.md similarity index 84% rename from skills/compile-task/SKILL.md rename to skills/cscript/SKILL.md index 36817b8..9576a47 100644 --- a/skills/compile-task/SKILL.md +++ b/skills/cscript/SKILL.md @@ -1,20 +1,20 @@ --- -name: compile-task +name: cscript description: | Compile a one-off task description into a reusable, self-contained executable script so the LLM doesn't have to do the same work again. Picks bash for trivial file/shell ops, single-file uv Python (PEP 723) - for anything needing libraries. Registers scripts in an OS-correct - appdata directory and exposes them through a `cscript` dispatcher - installed on PATH. Future invocations match the description against - the index and re-run the existing script instead of regenerating. - Use when the user says "compile this", "make a script for X", - "stash this as a script", "save this so I don't have to ask again", - or describes a task that sounds like it will recur. + for anything needing libraries, or PowerShell for Windows-native work. + Registers scripts in an OS-correct appdata directory and exposes them + through a `cscript` dispatcher installed on PATH. Future invocations + match the description against the index and re-run the existing script + instead of regenerating. Use when the user says "compile this", "make + a script for X", "stash this as a script", "save this so I don't have + to ask again", or describes a task that sounds like it will recur. user-invocable: true --- -# compile-task — turn prompts into reusable executables +# cscript — turn prompts into reusable executables The point of this skill is to **stop paying an LLM to redo deterministic work**. Each time the user describes a task that could be a script, compile it once, register it in the catalogue, and run the registered script from then on. Generation is the exception; execution is the rule. @@ -38,7 +38,9 @@ In order, before anything else: 2. **Check whether the dispatcher is installed and on PATH.** Run `command -v cscript >/dev/null 2>&1`. -3. **If `cscript` is missing, install it.** The source lives at `scripts/cscript` relative to this `SKILL.md`. Resolve the path from wherever this file was loaded; if you cannot find it, ask the user for the skill directory rather than guessing. + If installed, also check it is not stale: compare `cscript version` against the source's version. The source's version is the `VERSION = "..."` line near the top of `scripts/cscript`. If they differ (or the installed binary predates `version` and errors), treat it as missing and re-install it the same way as a fresh install — copying over the existing file. Tell the user you are upgrading their dispatcher. + +3. **If `cscript` is missing (or stale), install it.** The source lives at `scripts/cscript` relative to this `SKILL.md`. Resolve the path from wherever this file was loaded; if you cannot find it, ask the user for the skill directory rather than guessing. Pick a destination directory that is **user-writable and already on `PATH`**: @@ -160,8 +162,23 @@ Subcommands: | `cscript rm ` | Archive the file to `scripts/.archive/` and drop its index entry and state directory. | | `cscript state-dir ` | Print (creating if missing) the per-script state directory under the appdata dir. | | `cscript where` | Print the data directory path. | +| `cscript version` | Print the dispatcher version. | +| `cscript mine` | Rank repeated catalogue misses from the `which` log to surface things worth compiling. | | `cscript register …` | Used by this skill at compile time; not normally hand-invoked. | +## Surfacing candidates from history + +`cscript which` appends every query to a local invocation log (`which.log` under the data dir). `cscript mine` reads it back and ranks queries that have been asked 2+ times but never matched anything in the catalogue — direct "I tried to reuse but couldn't" signal. + +Run it on-demand: + +```sh +cscript mine # repeated misses, default threshold 2 +cscript mine --min 1 # every miss +``` + +When you (the agent) see repeated patterns in your conversation that the user hasn't asked to compile yet, suggest `cscript mine` so they can review what's worth stashing. + ## Regeneration When the user asks to "redo", "rewrite", or "regenerate" an existing script (or when running it reveals a bug): diff --git a/skills/compile-task/references/bash-template.sh b/skills/cscript/references/bash-template.sh similarity index 100% rename from skills/compile-task/references/bash-template.sh rename to skills/cscript/references/bash-template.sh diff --git a/skills/compile-task/references/powershell-template.ps1 b/skills/cscript/references/powershell-template.ps1 similarity index 100% rename from skills/compile-task/references/powershell-template.ps1 rename to skills/cscript/references/powershell-template.ps1 diff --git a/skills/compile-task/references/uv-python-template.py b/skills/cscript/references/uv-python-template.py similarity index 100% rename from skills/compile-task/references/uv-python-template.py rename to skills/cscript/references/uv-python-template.py diff --git a/skills/compile-task/scripts/cscript b/skills/cscript/scripts/cscript similarity index 66% rename from skills/compile-task/scripts/cscript rename to skills/cscript/scripts/cscript index 38558b4..9fa1d35 100755 --- a/skills/compile-task/scripts/cscript +++ b/skills/cscript/scripts/cscript @@ -3,7 +3,7 @@ # requires-python = ">=3.11" # dependencies = ["platformdirs>=4.2"] # /// -"""cscript — dispatcher for scripts created by the compile-task skill. +"""cscript — dispatcher for scripts created by the /cscript skill. Stores generated scripts in an OS-correct appdata directory and exposes them through a small set of subcommands. See `cscript --help` for usage. @@ -17,17 +17,21 @@ import json import os from pathlib import Path import shutil +import string import subprocess import sys from platformdirs import user_data_dir +VERSION = "0.3.0" + APP = "cscript" DATA = Path(os.environ.get("CSCRIPT_DATA_DIR") or user_data_dir(APP)) SCRIPTS = DATA / "scripts" ARCHIVE = SCRIPTS / ".archive" STATE = DATA / "state" INDEX = DATA / "index.json" +WHICH_LOG = DATA / "which.log" EXT = {"bash": ".sh", "python": ".py", "powershell": ".ps1"} @@ -75,7 +79,7 @@ def resolve(name: str) -> dict | None: def cmd_list(_args: argparse.Namespace) -> int: items = load_index() if not items: - print("No scripts registered yet. Use the /compile-task skill to create one.") + print("No scripts registered yet. Use the /cscript skill to create one.") return 0 width = max(len(i["name"]) for i in items) for item in sorted(items, key=lambda x: x["name"]): @@ -84,10 +88,50 @@ def cmd_list(_args: argparse.Namespace) -> int: return 0 +_STOPWORDS = frozenset( + """ + this that what with from have your into onto over under about above below + when where which while would should could does done been being just like + such than then them they those well very also even more most some many + much both either neither + """.split() +) +_MIN_QUERY_TOKEN_LEN = 4 +_PUNCT_TRANS = str.maketrans({c: " " for c in string.punctuation}) + + +def _query_tokens(query: str) -> list[str]: + """Split the query into matchable tokens. + + Drops tokens shorter than `_MIN_QUERY_TOKEN_LEN` and common stopwords so + short noise words ("is", "the", "what") don't substring-match into longer + haystack words ("isolate", "there", "whatever"). + """ + cleaned = query.lower().translate(_PUNCT_TRANS) + return [t for t in cleaned.split() if len(t) >= _MIN_QUERY_TOKEN_LEN and t not in _STOPWORDS] + + +def _log_which(query: str, top_hit: str | None) -> None: + """Append a query to the invocation log. Best-effort; never raises.""" + try: + ensure_dirs() + entry = { + "ts": datetime.now(timezone.utc).isoformat(), + "query": query, + "hit": top_hit, + } + with WHICH_LOG.open("a", encoding="utf-8") as fh: + fh.write(json.dumps(entry) + "\n") + except OSError: + pass + + def cmd_which(args: argparse.Namespace) -> int: items = load_index() - query = args.query.lower() - tokens = [t for t in query.split() if t] + tokens = _query_tokens(args.query) + if not tokens: + _log_which(args.query, None) + return 1 scored: list[tuple[int, dict]] = [] for item in items: score = 0 @@ -101,8 +145,10 @@ def cmd_which(args: argparse.Namespace) -> int: if score: scored.append((score, item)) if not scored: + _log_which(args.query, None) return 1 scored.sort(reverse=True, key=lambda x: x[0]) + _log_which(args.query, scored[0][1]["name"]) width = max(len(s[1]["name"]) for s in scored[:5]) for _, item in scored[:5]: print(f" {item['name']:<{width}} {item['description']}") @@ -260,6 +306,91 @@ def cmd_where(_args: argparse.Namespace) -> int: return 0 +def cmd_version(_args: argparse.Namespace) -> int: + print(VERSION) + return 0 + + +def _load_which_log() -> list[dict]: + if not WHICH_LOG.exists(): + return [] + entries: list[dict] = [] + for line in WHICH_LOG.read_text(encoding="utf-8").splitlines(): + line = line.strip() + if not line: + continue + try: + entries.append(json.loads(line)) + except json.JSONDecodeError: + continue + return entries + + +def cmd_mine(args: argparse.Namespace) -> int: + """Surface recurring `which` queries that found nothing — script candidates. + + The signal is: the user (or the skill on their behalf) asked the catalogue + for a script N times with effectively the same prompt, and the catalogue + had nothing. That's a direct intent-to-reuse signal — exactly the thing + we want to compile. + """ + entries = _load_which_log() + if not entries: + print( + "No `cscript which` history yet. Mining looks for repeated catalogue\n" + "misses — keep using cscript and try again later.", + file=sys.stderr, + ) + return 0 + + misses = [e for e in entries if not e.get("hit")] if not args.include_hits else entries + + clusters: dict[str, list[str]] = {} + for entry in misses: + query = entry.get("query") or "" + tokens = _query_tokens(query) + if not tokens: + continue + fingerprint = " ".join(sorted(set(tokens))) + clusters.setdefault(fingerprint, []).append(query) + + ranked = sorted( + ((len(qs), fp, qs) for fp, qs in clusters.items() if len(qs) >= args.min), + reverse=True, + ) + + if not ranked: + msg = ( + f"No patterns repeated {args.min}+ times in the `which` log " + f"({len(misses)} entries scanned).\n" + "Keep using cscript and try again in a week, or pass --min 1 to see every miss." + ) + print(msg, file=sys.stderr) + return 0 + + print(f"Found {len(ranked)} repeated catalogue miss(es):\n") + for count, _fp, samples in ranked[: args.limit]: + unique: list[str] = [] + for s in samples: + if s not in unique: + unique.append(s) + print(f" [{count}x] {unique[0]}") + for sample in unique[1:3]: + print(f" ↳ {sample}") + if len(unique) > 3: + print(f" ↳ ...and {len(unique) - 3} more variant(s)") + print() + sys.stdout.flush() + + print( + "These are tasks you've asked for that the catalogue couldn't fulfil.\n" + "Use `/cscript ` (or describe one to an agent that knows this skill)\n" + "to compile any of them.", + file=sys.stderr, + ) + return 0 + + def build_parser() -> argparse.ArgumentParser: summary = (__doc__ or "cscript").splitlines()[0] p = argparse.ArgumentParser(prog="cscript", description=summary) @@ -267,6 +398,20 @@ def build_parser() -> argparse.ArgumentParser: sub.add_parser("list", help="list registered scripts") sub.add_parser("where", help="print the data directory") + sub.add_parser("version", help="print the dispatcher version") + + pm = sub.add_parser( + "mine", + help="surface recurring `which` misses worth compiling", + description="Aggregate the `cscript which` invocation log into ranked " + "candidate tasks. The signal is: same effective query asked N+ times, " + "catalogue had no match.", + ) + pm.add_argument("--min", type=int, default=2, help="minimum repeats to surface (default: 2)") + pm.add_argument("--limit", type=int, default=20, help="max candidates to print (default: 20)") + pm.add_argument( + "--include-hits", action="store_true", help="also include queries that did match something" + ) pw = sub.add_parser("which", help="fuzzy match against names/descriptions") pw.add_argument("query") @@ -297,7 +442,7 @@ def build_parser() -> argparse.ArgumentParser: ) psd.add_argument("name") - preg = sub.add_parser("register", help="(used by the compile-task skill)") + preg = sub.add_parser("register", help="(used by the /cscript skill)") preg.add_argument("--source", required=True, help="path to the freshly-written script file") preg.add_argument("--name", required=True) preg.add_argument("--description", required=True) @@ -324,6 +469,8 @@ def main(argv: list[str] | None = None) -> int: "state-dir": cmd_state_dir, "register": cmd_register, "where": cmd_where, + "version": cmd_version, + "mine": cmd_mine, } return handlers[args.cmd](args) diff --git a/skills/compile-task/scripts/cscript.cmd b/skills/cscript/scripts/cscript.cmd similarity index 100% rename from skills/compile-task/scripts/cscript.cmd rename to skills/cscript/scripts/cscript.cmd diff --git a/skills/compile-task/scripts/smoke-test.sh b/skills/cscript/scripts/smoke-test.sh similarity index 71% rename from skills/compile-task/scripts/smoke-test.sh rename to skills/cscript/scripts/smoke-test.sh index 119b457..4690a1a 100755 --- a/skills/compile-task/scripts/smoke-test.sh +++ b/skills/cscript/scripts/smoke-test.sh @@ -111,5 +111,35 @@ out=$("$CS" list) [[ "$out" == *"hello-again"* ]] || fail "list should still contain hello-again" pass "rm archives script and removes state dir" +# 9. version subcommand +out=$("$CS" version) +[[ -n "$out" ]] || fail "version should print something" +pass "version prints" + +# 10. mine: which calls are logged and mined +# Generate three misses (same effective query) and one hit, then mine. +"$CS" which "convert pdf to html for archiving" >/dev/null 2>&1 || true +"$CS" which "convert pdf to html for archiving" >/dev/null 2>&1 || true +"$CS" which "convert pdf to html for archiving" >/dev/null 2>&1 || true +"$CS" which "hello-again" >/dev/null 2>&1 || true + +[[ -f "$CSCRIPT_DATA_DIR/which.log" ]] || fail "which.log should exist after which calls" +log_lines=$(wc -l < "$CSCRIPT_DATA_DIR/which.log" | tr -d ' ') +[[ "$log_lines" -ge 4 ]] || fail "which.log should have at least 4 entries, got $log_lines" +pass "which appends to invocation log" + +out=$("$CS" mine) +[[ "$out" == *"[3x]"* ]] || fail "mine should show 3x repeat for pdf-to-html query: '$out'" +[[ "$out" == *"convert pdf to html"* ]] || fail "mine should surface the repeated query" +[[ "$out" != *"hello-again"* ]] || fail "mine should exclude hits" +pass "mine ranks repeated misses" + +# 11. mine: empty case message +empty_dir="$(mktemp -d)" +out=$(CSCRIPT_DATA_DIR="$empty_dir" "$CS" mine 2>&1 || true) +[[ "$out" == *"No \`cscript which\` history yet"* ]] || fail "mine empty message missing: '$out'" +rm -rf "$empty_dir" +pass "mine handles empty log" + echo echo "All checks passed."