
feat(analyze): actor-tagged turns + user-reaction & skill-adoption analysts #1

Merged: drewstone merged 1 commit into main from traces-agent-upgrades on Jun 21, 2026

@drewstone

Contributor

Shifts trace analysis from "how efficient was the agent" to "how is the agent serving its user, and are our skills used", the questions a session-trace meta-audit actually needs. Everything here is deterministic and costs $0; no API key is required.

What

  1. Actor tag on every user.prompt span (tangle.actor: human / subagent-spawn / injected / tool-result), derived from the isSidechain/userType the Claude adapter already parsed but discarded, with synthetic-prompt-marker precedence. Foundational — every user-facing analysis is polluted without it (subagent spawn-prompts look identical to human turns otherwise).
  2. User-reaction analyst (analyzeReactions) — classifies each actor==='human' turn that follows an assistant turn (correction / frustration / praise / jargon-complaint / structure-complaint) and surfaces the top assistant-prose → human-reaction trigger pairs.
  3. Skill + subagent adoption (analyzeAdoption) — skill penetration %, plus explicit invocations and loop-dispatched runs (read from .evolve/skill-runs.jsonl) reported separately, because flat counts undercount loop-dispatched skills (e.g. governor) by ~370×.
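
To make the precedence in item 1 concrete, here is a minimal sketch of how an actor tag could be derived. It is not the code in src/adapters/actor.ts: the `RawUserEntry` shape, the `isToolResult` flag, the marker strings, and the rule that a non-"external" `userType` means injected are all assumptions made for illustration. Only `isSidechain`, `userType`, and the four actor values come from the description above.

```typescript
// The four actor values named in the PR description.
type Actor = "human" | "subagent-spawn" | "injected" | "tool-result";

// Hypothetical entry shape; only isSidechain and userType are named in the PR.
interface RawUserEntry {
  text: string;
  isSidechain?: boolean;
  userType?: string;
  isToolResult?: boolean; // assumed flag, not from the PR
}

// Placeholder markers; the real synthetic-prompt markers are not listed in the PR.
const SYNTHETIC_MARKERS: string[] = ["[synthetic-prompt]", "[wrapper]"];

function deriveActor(entry: RawUserEntry): Actor {
  // Marker precedence: synthetic text wins over every structural flag,
  // so wrapper prompts never leak into the human bucket.
  if (SYNTHETIC_MARKERS.some((m) => entry.text.includes(m))) return "injected";
  if (entry.isToolResult) return "tool-result";
  // A sidechain entry is treated as the prompt that spawned a subagent.
  if (entry.isSidechain) return "subagent-spawn";
  // Assumption: any userType other than "external" was not typed by a person.
  if (entry.userType !== undefined && entry.userType !== "external") return "injected";
  return "human";
}
```

Checking markers first is the design choice that matters here: a wrapper prompt that also looks like an ordinary user entry is still classified as injected.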

Why

Built while meta-auditing 14,541 real Claude/Codex sessions; these three gaps were the difference between a clean corpus and a polluted one. New attributes only — src/attributes.ts wire-contract values untouched.

Verification

  • pnpm typecheck clean; pnpm test 58 passed (+12 new).
  • Real-session smoke: reactions render genuine human corrections (0 wrapper leakage after a marker-precedence fix); adoption shows explicit + loop-dispatched counts side by side.
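
For readers who want a feel for the reaction pass, the sketch below shows one deterministic way to classify human turns and pair them with the assistant prose that preceded them. It is not the actual `analyzeReactions` implementation: the `Turn` shape, the `classifyReactions` name, and every regular expression are hypothetical. Only the five category names and the "human turn following an assistant turn" rule come from the PR.

```typescript
type Reaction =
  | "correction"
  | "frustration"
  | "praise"
  | "jargon-complaint"
  | "structure-complaint";

// Hypothetical turn shape for illustration.
interface Turn {
  role: "user" | "assistant";
  actor?: string;
  text: string;
}

// Illustrative rules only; the first matching rule wins.
const RULES: Array<[Reaction, RegExp]> = [
  ["jargon-complaint", /\b(jargon|plain english)\b/i],
  ["structure-complaint", /\b(too long|too many bullets|wall of text)\b/i],
  ["frustration", /\b(still broken|i already told you)\b/i],
  ["correction", /^(no\b|that's wrong|not what i)/i],
  ["praise", /\b(thanks|perfect|great)\b/i],
];

interface TriggerPair {
  trigger: string; // the assistant prose that preceded the reaction
  reaction: Reaction;
}

function classifyReactions(turns: Turn[]): TriggerPair[] {
  const pairs: TriggerPair[] = [];
  for (let i = 1; i < turns.length; i++) {
    const prev = turns[i - 1];
    const cur = turns[i];
    // Only human-authored turns that directly follow an assistant turn count.
    if (cur.role !== "user" || cur.actor !== "human" || prev.role !== "assistant") continue;
    const text = cur.text.trim();
    const hit = RULES.find(([, re]) => re.test(text));
    if (hit) pairs.push({ trigger: prev.text, reaction: hit[0] });
  }
  return pairs;
}
```

Because the rules are plain pattern matches, the same input always yields the same pairs, which is what keeps a pass like this free of API calls.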

🤖 Generated with Claude Code

…alysts

- actor tag (human/subagent-spawn/injected) on user.prompt spans from isSidechain/userType + synthetic-marker precedence (src/adapters/actor.ts)
- analyzeReactions: deterministic human-turn reaction classifier (correction/frustration/praise/jargon/structure) + trigger pairs
- analyzeAdoption: skill penetration + explicit invocations AND loop-dispatched runs (.evolve/skill-runs.jsonl) reported separately — fixes the ~370x loop-dispatch undercount
- +12 tests (58 total green), typecheck clean, no wire-contract attrs changed
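
The separate reporting of explicit and loop-dispatched counts can be sketched as follows. This is an illustration under stated assumptions, not the real `analyzeAdoption`: the `summarizeAdoption` name, the input shapes, and the `{"skill": "..."}` record format for the run log are hypothetical, since the PR does not show the schema of .evolve/skill-runs.jsonl.

```typescript
interface AdoptionRow {
  skill: string;
  explicit: number; // explicit invocations seen in sessions
  loopDispatched: number; // runs recorded in the loop's run log
  penetrationPct: number; // share of sessions that invoked the skill at least once
}

function summarizeAdoption(
  sessions: string[][], // assumed: per session, the skill names explicitly invoked
  skillRunsJsonl: string, // assumed: one JSON object per line with a "skill" field
): AdoptionRow[] {
  const explicit = new Map<string, number>();
  const sessionsUsing = new Map<string, number>();
  for (const invoked of sessions) {
    for (const s of invoked) explicit.set(s, (explicit.get(s) ?? 0) + 1);
    // Count each skill once per session for the penetration figure.
    Array.from(new Set(invoked)).forEach((s) => {
      sessionsUsing.set(s, (sessionsUsing.get(s) ?? 0) + 1);
    });
  }

  const loop = new Map<string, number>();
  for (const line of skillRunsJsonl.split("\n")) {
    if (!line.trim()) continue;
    let rec: unknown;
    try {
      rec = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than fail the whole report
    }
    if (typeof rec !== "object" || rec === null) continue;
    const skill = (rec as { skill?: unknown }).skill;
    if (typeof skill === "string") loop.set(skill, (loop.get(skill) ?? 0) + 1);
  }

  const names = new Set<string>();
  explicit.forEach((_count, name) => names.add(name));
  loop.forEach((_count, name) => names.add(name));

  return Array.from(names)
    .sort()
    .map((skill) => ({
      skill,
      explicit: explicit.get(skill) ?? 0,
      loopDispatched: loop.get(skill) ?? 0,
      penetrationPct:
        sessions.length > 0 ? (100 * (sessionsUsing.get(skill) ?? 0)) / sessions.length : 0,
    }));
}
```

Keeping the two counters in separate columns means a skill that is only ever dispatched by a loop still shows up with its real run count instead of appearing unused.
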
@drewstone merged commit ef30a2c into main on Jun 21, 2026
1 check passed