Designing how agents think — and how humans watch them think.
Most agent demos show you the output. Mind Window is a study in the other layer: the mind (how an agent decides, recalls, and refuses) and the window (how a person watches that decision happen and chooses to trust it).
Each entry is one agent, built end-to-end, documented with its real traces, its design rationale, and an honest account of where it broke.
🔗 Live: https://shivam93.github.io/mind-window/
An agent that reads what you read, builds a memory of it, and answers questions against the whole corpus — citing the specific source behind every claim, and flagging the claims it can't ground.
- 3 LangGraph graphs over one SQLite memory: article ingestion, corpus query, autonomous digest
- 71-article corpus (235 runs deduplicated), 3072-dim embeddings via sqlite-vec
- Every claim carries a `source_type`: `article` / `past_brief` / `web_search` / `model_knowledge`. `model_knowledge` is the honest one: "verify me." 18/18 digest citations traceable, 0 orphans.
- Built across 28 sessions, model: Claude Sonnet 4.6
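The `source_type` contract can be sketched as a small audit over claims. This is a minimal illustration, not the project's actual schema: the `Claim` class, its `source_id` field, and the `audit_citations` function are hypothetical names. The rule that `model_knowledge` claims are flagged for verification instead of cited is an assumption drawn from the "verify me" wording.

```python
from dataclasses import dataclass
from typing import Optional

# The four source types named in the README.
SOURCE_TYPES = {"article", "past_brief", "web_search", "model_knowledge"}


@dataclass(frozen=True)
class Claim:
    text: str
    source_type: str
    source_id: Optional[str] = None  # hypothetical field: id of the cited source

    def __post_init__(self):
        # Reject anything outside the four declared source types.
        if self.source_type not in SOURCE_TYPES:
            raise ValueError(f"unknown source_type: {self.source_type!r}")


def audit_citations(claims, known_source_ids):
    """Split claims into traceable, orphaned, and verify-me buckets."""
    traceable, orphans, verify_me = [], [], []
    for claim in claims:
        if claim.source_type == "model_knowledge":
            # Assumption: ungrounded claims are flagged, not cited.
            verify_me.append(claim)
        elif claim.source_id in known_source_ids:
            traceable.append(claim)
        else:
            # Cited a source that is not in memory: an orphan.
            orphans.append(claim)
    return traceable, orphans, verify_me
```

Under this sketch, a digest with "0 orphans" is one where the second bucket comes back empty, and the third bucket is what the reader is asked to verify.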
The moment that matters: mid-run, the agent detects its own web search drifting off-topic, issues a RETRY, and recovers — captured on a real trace (run-019e6027). Resilience caught live is more convincing than a demo that never stumbles.
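The drift-and-RETRY behavior can be sketched as a relevance gate around a search call. The README does not say how the agent detects drift, and in the real agent that judgment may well be made by the model, not by code. Everything below is therefore an assumption: the token-overlap score, the 0.5 threshold, and the `search` and `refine` callables are hypothetical stand-ins.

```python
import re


def _tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def on_topic_score(topic, results):
    """Fraction of results sharing at least one token with the topic."""
    if not results:
        return 0.0
    topic_tokens = _tokens(topic)
    hits = sum(1 for r in results if topic_tokens & _tokens(r))
    return hits / len(results)


def search_with_retry(topic, search, refine, threshold=0.5, max_retries=2):
    """Run search; if results drift off-topic, refine the query and retry.

    Returns (results, trace), where trace records one entry per attempt.
    """
    query, trace, results = topic, [], []
    for attempt in range(max_retries + 1):
        results = search(query)
        score = on_topic_score(topic, results)
        if score >= threshold:
            action = "ACCEPT"
        elif attempt < max_retries:
            action = "RETRY"
        else:
            action = "GIVE_UP"
        trace.append({"attempt": attempt, "query": query,
                      "score": score, "action": action})
        if action != "RETRY":
            break
        # Hypothetical hook: rewrite the query before the next attempt.
        query = refine(topic, query)
    return results, trace
```

Recording every attempt in `trace`, including the failed one, is the point of the sketch: the recovery is only visible to a reader if the stumble is kept.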
- Case study & interactive trace explorer — the full story, with one real run walked across 7 phases (System State / Agent Rationale, side by side)
- Trace as Thinking Map — one real run, drawn as the agent's cognition
- Schema Mind — how the output contract shapes the thinking
- Decision Architecture — when the model, not the code, picks what happens next
- Librarian & Analyst — separating mechanical work from cognitive work
- Trust Discipline — "verify me, not trust me" as trust architecture
- Empirical Method — how testing, not planning, revealed the design
- Design Principles — the extracted laws of agent architecture
- Judgment is the seat. When generation is free, the value is in the decision the agent makes over messy input — not the text it produces.
- The window is the design. An agent's behavior under uncertainty — how it hedges, grounds, and refuses — is the whole interface. Making that legible is the work.
- The failures are the proof. Where an agent breaks, and the tradeoff made when there was no perfect answer, is what separates designed work from a template.
An ongoing agent design study by Shivam Bhatnagar. Built, not generated. All trace data is real — no fabricated outputs.