String-sg · santosral · Jun 9, 2026 · Jun 8, 2026 · Jun 8, 2026 · Jun 8, 2026
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -0,0 +1,107 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+Stack: SvelteKit (adapter-node) + Svelte 5 (runes) + Prisma (pg adapter) + Vitest, pnpm.
+
+## Commands
+
+Full command table in `README.md`; environment setup and code-style/hooks in `CONTRIBUTING.md`. Most-used:
+
+- `pnpm dev` — dev server
+- `pnpm test` — Vitest (watch). Single run: `pnpm test run <file>`; by name: `pnpm test run -t "<name>"`.
+- `pnpm check` — `svelte-check` + type check (fails on warnings)
+- `pnpm db:generate` after editing `prisma/schema.prisma`; `pnpm db:migrate` to create/apply a migration
+
+## Architecture
+
+Two apps in one SvelteKit project, split by route group, each with its own auth realm:
+
+- `src/routes/(main)/` — learner-facing app (`learner.session` cookie)
+- `src/routes/admin/` — admin app (`admin.session` cookie; requires an active `UserAdmin`)
+
+**Hook dispatch.** Root `src/hooks.server.ts` routes each request to the matching group hook (`(main)/hooks.server.ts` or `admin/hooks.server.ts`) by `/admin` prefix — group hooks are _not_ auto-run by SvelteKit. Each is a `sequence()` of request logging (scoped pino logger + `X-Request-Id` on `event.locals.logger`) then auth/route-protection. **Auth is enforced centrally in these hooks**; endpoint-level `if (!user) 401` is defense-in-depth only.
+
+**Auth.** Custom `Auth(valkey, …)` factory in `src/lib/server/auth/` (Google OAuth, sessions in Valkey). Exposes `learnerAuth` and `adminAuth`.
+
+**Data.** Prisma Client is generated to `src/generated/prisma/` and re-exported (client + enums + model types) from `src/lib/server/db.ts`, which owns the single `db` instance over `@prisma/adapter-pg`. Import Prisma types from `$lib/server/db`, not the generated path.
+
+**Server integrations** (`src/lib/server/`): `s3.ts` + `cloudfront.ts` (media + signed URLs), `openai.ts` (AI chat), `weaviate.ts` (vector search grounding the chat), `valkey.ts` (sessions + cache), `logger.ts` (pino). Feature logic under `auth/`, `chat/`, `unit/`, `cache/`.
+
+**Domain (Prisma).** `Collection` → `LearningUnit` (content, sources, sentiments, tags, status) → `LearningJourney` (+ checkpoints, `QuestionAnswer` quizzes); AI chat via `Thread`/`Message`; onboarding via `UserProfile`/`UserInterest`.
+
+**UI.** Components in `src/lib/components/<Name>/`; shared rune-based state in `src/lib/states/*.svelte.ts`.
+
+## Gotchas
+
+- `src/generated/prisma/` is machine-generated — never hand-edit; regenerate with
+  `pnpm db:generate` after changing `prisma/schema.prisma`. (It's the only place
+  using `type X = {}`; hand-written code uses `interface`.)
+- `vendor/xlsx-0.20.3.tgz` is a vendored dependency, consumed only by the admin
+  quiz export.
+- **Prisma query args** order keys to follow **SQL clause order** — `select`,
+  `where`, `orderBy`, `take`, `skip`, `cursor`. Type the args object with the
+  generated `*FindManyArgs` / `*FindUniqueArgs` via `satisfies`, and derive row
+  types with `*GetPayload<typeof args>`; never `as const`. Keep Prisma type
+  annotations when simplifying code.
+
+## Specs and decisions
+
+Two linked conventions govern design docs:
+
+- **Specs** live in `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`. Start
+  from `docs/superpowers/specs/TEMPLATE.md` and follow
+  `docs/superpowers/specs/README.md`. A spec carries the WHAT/WHY plus contracts
+  and boundaries (the unit triple: does / uses / depends, with guarantees and
+  requires) and records only the **chosen** solution. It must NOT contain
+  implementation — no function bodies or loop internals; that belongs in the
+  plan. Cover architecture, components, data flow, error handling, and testing.
+  Write contract signatures as declaration-level TypeScript (no bodies), using
+  `interface` for object shapes (the repo lints `consistent-type-definitions:
+interface`) and reserving `type` for unions, function, and mapped/utility types.
+
+- **Decisions** live in `docs/decisions/NNNN-<title>.md` as
+  [MADR 4.0](https://adr.github.io/madr/) ADRs (start from
+  `docs/decisions/TEMPLATE.md` — the full template; keep its optional
+  sections, including Pros and Cons of the Options). When a spec involves a real
+  architectural choice, the alternatives and rationale go in an ADR; the spec
+  states the chosen outcome and links the ADR. See `docs/decisions/README.md`.
+  **Author order: brainstorm → ADR → spec → plan.** The decision and its
+  rationale are settled in the ADR first; the spec then derives its contracts from
+  the chosen outcome. (The superpowers brainstorming skill writes the spec — pause
+  after the design is approved to write the ADR, then return to the spec.)
+  **ADRs stay prose** — decision, rationale, alternatives, consequences, described
+  conceptually. An ADR may name external/framework symbols it builds on
+  (`RequestEvent`, `Readable.toWeb`) but must NOT reference identifiers the spec
+  defines — no backticked contract names like `fetchBatch`/`onError`, no function
+  signature or `interface` block. Describe the role ("a batch-fetch closure", "an
+  error callback"), not the name, so a rename in the spec can never make an
+  accepted ADR stale; the names live in the spec.
+
+- **Contracts flow down; never back-fill.** The signature is owned by the spec's
+  Contracts & boundaries (declaration-level TS); the ADR holds the rationale in
+  prose; the plan is _derived_ — it may mirror a contract in code for buildability
+  but is never its source of truth. If plan or implementation work reveals a needed
+  change to a signature, parameter, return type, error semantics, or a unit's
+  dependency, **change the source doc first, then regenerate the affected plan
+  section** — never edit the plan and back-fill the spec/ADR. A pure signature
+  change (rename, reorder, add a field) is **spec-only**; an approach change with
+  trade-offs **supersedes the ADR** (new ADR) and updates the spec signature. Plan
+  self-review adds one check: do the signatures/guarantees in the plan match the
+  spec's Contracts & boundaries? Grep the contract name across spec, plan, and ADR.
+
+- **Diagrams:** use Mermaid fenced code blocks in both specs and ADRs.
+
+## GitHub workflow
+
+- Use the `gh` CLI for all GitHub operations (issues, PRs) — never raw git for PR
+  actions and never the web UI.
+- **gh-first, documented up front.** Every implementation plan documents the gh
+  steps at its very start — read the issue (`gh issue view <#>`, when one exists),
+  create the feature branch, open a **draft** PR (`gh pr create --draft`, body per
+  `.github/PULL_REQUEST_TEMPLATE.md`) — and documents marking it ready
+  (`gh pr ready`) as its final step. The gh workflow is established before the
+  first task, not appended at the end.
+- **Document-only; run on approval.** Writing these commands into the plan is not
+  running them. The agent creates no branch, commit, push, or PR until you
+  explicitly ask — the "don't commit/PR unless asked" rule stands.
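Review note on the "Hook dispatch" paragraph of `CLAUDE.md`: the prefix-based routing it describes can be sketched as below. This is a minimal illustration under stated assumptions, not the repository's actual `src/hooks.server.ts`. The names `realmFor`, `dispatch`, and `MinimalEvent` are hypothetical; the real file would use SvelteKit's `Handle` type and `sequence()`. Matching `/admin` on a path-segment boundary (so `/administrator` stays in the learner app) is also an assumption, since the doc only says "by `/admin` prefix".

```typescript
// Hypothetical stand-ins for SvelteKit's RequestEvent and Handle types.
type Realm = 'admin' | 'main';

interface MinimalEvent {
  url: URL;
}

type Hook<Result> = (event: MinimalEvent) => Promise<Result>;

// Decide which auth realm owns a request path.
// Assumption: "/admin" must be a whole path segment to count as the admin app.
function realmFor(pathname: string): Realm {
  return pathname === '/admin' || pathname.startsWith('/admin/') ? 'admin' : 'main';
}

// Build a root hook that forwards each request to the matching group hook.
// Group hooks are not run automatically, so the root hook must call them.
function dispatch<Result>(hooks: Record<Realm, Hook<Result>>): Hook<Result> {
  return (event) => hooks[realmFor(event.url.pathname)](event);
}
```

Keeping the realm decision in one pure function makes the routing rule easy to unit-test without a running server.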
diff --git a/Dockerfile b/Dockerfile
@@ -27,7 +27,6 @@ ENV PRISMA_SKIP_POSTINSTALL_GENERATE=true
 # Fetch all dependencies into the virtual store.
 COPY pnpm-lock.yaml pnpm-workspace.yaml ./
 COPY patches ./patches
-COPY vendor ./vendor
 
 RUN pnpm fetch
 

diff --git a/docs/decisions/0001-stream-report-exports-with-exceljs.md b/docs/decisions/0001-stream-report-exports-with-exceljs.md
@@ -0,0 +1,69 @@
+---
+status: 'accepted'
+date: 2026-06-08
+decision-makers: santosral
+consulted:
+informed:
+---
+
+# Stream report exports end-to-end with ExcelJS via a generic helper
+
+## Context and Problem Statement
+
+The admin quiz export (`admin/api/download`) loads every matching row into memory, builds the whole workbook, and serializes it to a single buffer before sending, so memory scales linearly with row count. As data grows a single export can spike memory and destabilize the server. We also have a second report (onboarding) coming that would otherwise copy the same buffered pattern. How should report exports be produced so memory stays bounded and the logic is reusable?
+
+## Decision Drivers
+
+- Bounded memory regardless of dataset size — neither the full result set nor the full file should be held in memory.
+- Reusable across reports so additional exports plug in without re-implementing streaming.
+- Keep Prisma's typed query API.
+- Avoid duplicating the error-prone streaming loop per endpoint.
+
+## Considered Options
+
+- Stream end-to-end with `ExcelJS.stream.xlsx.WorkbookWriter` piped to the HTTP response, encapsulated in a single generic streaming helper
+- Buffer-then-send (the current approach) — build the whole workbook in memory, send one buffer
+- Thin streaming utilities only — expose helpers but let each endpoint own its streaming loop
+- A config/registry-driven export framework that endpoints register against
+
+## Decision Outcome
+
+Chosen option: "Stream end-to-end with ExcelJS via a generic streaming helper", because it is the only option that bounds memory on both the DB read and the file write while keeping each endpoint tiny and declarative.
+
+The helper writes through a Node `stream.PassThrough`, whose readable side is converted with `Readable.toWeb()` and returned as the SvelteKit `Response` body. The write loop runs un-awaited so the response starts streaming immediately; because it is not awaited it carries its own `.catch`. Once streaming starts the status and headers are already sent, so a mid-export error cannot become a clean 500 — the helper hands the failure back to the caller through an injected error callback (which logs server-side) and destroys the stream, leaving the browser with a failed/incomplete download to retry. Keeping the helper logger-agnostic and free of any SvelteKit request object decouples it from the framework and makes it trivially unit-testable. (Cursor batching of the DB read is a separate decision — see [ADR-0002](./0002-keyset-cursor-pagination-on-primary-key.md).)
+
+### Consequences
+
+- Good, because memory is bounded regardless of dataset size, removing the DoS vector.
+- Good, because one tested helper owns all the streaming complexity and endpoints only declare their columns, a batch-fetch closure, and an error callback.
+- Good, because migrating the quiz export off the vendored `xlsx` tarball lets us remove that dependency entirely.
+- Bad, because a mid-stream error after headers are sent cannot be a clean error response — the admin sees a broken download and retries.
+- Bad, because the un-awaited write loop must carry its own `.catch` or a failure surfaces as an unhandled rejection.
+- Bad, because it adds `exceljs` as a dependency.
+
+### Confirmation
+
+The quiz endpoint is rewritten to call the streaming helper and the vendored `xlsx` dependency is removed; tests assert the helper drives the batch fetcher to exhaustion, produces a readable stream, and aborts + reports the failure on a fetch error.
+
+## Pros and Cons of the Options
+
+### Stream end-to-end via a generic streaming helper
+
+- Good, because it bounds memory on both read and write.
+- Good, because the streaming loop is written and tested once.
+- Bad, because the abstraction must be generic enough for every report (a columns + batch-fetch contract).
+
+### Buffer-then-send (current approach)
+
+- Good, because errors can still become a clean 500 (nothing is sent yet).
+- Bad, because memory scales with row count — the problem we are removing.
+
+### Thin streaming utilities only
+
+- Good, because no large abstraction to design.
+- Bad, because every endpoint re-implements the error-prone streaming/abort loop.
+
+### Config/registry-driven framework
+
+- Good, because exports become pure data.
+- Bad, because it is over-engineering for the current two reports (YAGNI).
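Review note on ADR-0001: the streaming pattern in its Decision Outcome (a `PassThrough` converted with `Readable.toWeb()`, an un-awaited write loop with its own `.catch`, and an injected error callback) can be sketched as below. This is a hedged illustration, not the helper from this PR. The names `streamRows`, `fetchBatch`, `formatRow`, and `onError` are hypothetical, and plain text lines stand in for the ExcelJS workbook writer so the sketch needs only Node's standard library. Client-abort handling is left out.

```typescript
import { once } from 'node:events';
import { PassThrough, Readable } from 'node:stream';
import type { ReadableStream } from 'node:stream/web';

// Hypothetical contract shapes, named for illustration only.
interface Batch<Row> {
  rows: Row[];
  nextCursor?: number;
}

interface StreamOptions<Row> {
  fetchBatch: (cursor?: number) => Promise<Batch<Row>>;
  formatRow: (row: Row) => string;
  onError: (error: unknown) => void;
}

function streamRows<Row>(options: StreamOptions<Row>): ReadableStream<Uint8Array> {
  const sink = new PassThrough();

  const writeAll = async (): Promise<void> => {
    let cursor: number | undefined;
    do {
      const batch = await options.fetchBatch(cursor);
      for (const row of batch.rows) {
        // Respect backpressure: pause until 'drain' when the buffer is full.
        if (!sink.write(options.formatRow(row) + '\n')) {
          await once(sink, 'drain');
        }
      }
      cursor = batch.nextCursor;
    } while (cursor !== undefined);
    sink.end();
  };

  // Not awaited, so the body can be returned before any row is written.
  // The loop therefore carries its own catch: report, then abort the stream.
  writeAll().catch((error: unknown) => {
    options.onError(error);
    sink.destroy(error instanceof Error ? error : new Error(String(error)));
  });

  return Readable.toWeb(sink);
}
```

An endpoint would pass the returned stream as the `Response` body. Because the helper takes only closures, it has no dependency on a logger or a request object.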
diff --git a/docs/decisions/0002-keyset-cursor-pagination-on-primary-key.md b/docs/decisions/0002-keyset-cursor-pagination-on-primary-key.md
@@ -0,0 +1,61 @@
+---
+status: 'accepted'
+date: 2026-06-08
+decision-makers: santosral
+consulted:
+informed:
+---
+
+# Read streaming exports with keyset cursor pagination on the primary key
+
+## Context and Problem Statement
+
+The streaming export ([ADR-0001](./0001-stream-report-exports-with-exceljs.md)) reads rows from the database in batches inside a loop that runs until the data is exhausted. How should each batch be fetched so the per-batch cost stays constant at any depth and rows are neither skipped nor duplicated under concurrent writes?
+
+## Decision Drivers
+
+- Constant per-batch cost regardless of how deep into the result set the loop has read.
+- Stability under concurrent writes — no skipped or duplicated rows.
+- Keep Prisma's typed query API.
+
+## Considered Options
+
+- Keyset (cursor) pagination ordered by the model's primary key
+- Offset pagination (`skip` / `take`)
+- A true DB server-side cursor via the raw `pg` driver (`pg-query-stream`)
+
+## Decision Outcome
+
+Chosen option: "Keyset pagination ordered by the primary key", because each batch is an index-backed seek (constant cost per batch, so total cost is linear in the row count) and the read is stable under concurrent writes, while staying within Prisma's typed cursor API.
+
+A consequence is that the export is **no longer ordered by `user.name`** (today's behavior). Name lives on the related `User` and is non-unique, so it cannot back a clean keyset cursor; since an admin can sort any column in Excel, server-side name ordering is dropped in favor of primary-key ordering.
+
+### Consequences
+
+- Good, because batch cost is constant and index-backed at any depth.
+- Good, because the read is stable under concurrent inserts/deletes (no skip/duplicate).
+- Good, because it stays within Prisma's typed API.
+- Bad, because the exported rows are ordered by primary key rather than `user.name`; mitigated because the admin can sort the downloaded file in Excel.
+- Neutral, because a true server-side cursor (`pg-query-stream`) would remove repeated queries entirely but bypasses Prisma's typed API — deferred as an escalation path if scale ever demands it.
+
+### Confirmation
+
+Endpoint tests assert the cursor advances across multiple batches and the `where` filter is honored; the batch-fetch contract returns a batch of rows together with the next cursor, and the loop terminates when no next cursor is returned.
+
+## Pros and Cons of the Options
+
+### Keyset cursor on the primary key
+
+- Good, because cost per batch is constant and index-backed.
+- Good, because it is stable under concurrent writes.
+- Bad, because ordering is tied to the key, not a human-friendly column.
+
+### Offset pagination (`skip` / `take`)
+
+- Good, because it can order by any column, including `user.name`.
+- Bad, because cost grows with depth and rows can be skipped or duplicated when the underlying set changes mid-export.
+
+### Raw `pg` server-side cursor (`pg-query-stream`)
+
+- Good, because it removes repeated queries entirely.
+- Bad, because it bypasses Prisma's typed API — too much for current scale (deferred).
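Review note on ADR-0002: the keyset loop it chooses can be sketched as below, with an in-memory array standing in for the table so the example is self-contained. The names `fetchBatch`, `readAll`, and `Row` are hypothetical. With Prisma, the equivalent batch read would presumably use `orderBy` on the primary key plus `take`, `skip: 1`, and `cursor`; the filter-and-sort here only imitates that seek.

```typescript
// Hypothetical row and batch shapes for illustration.
interface Row {
  id: number;
  name: string;
}

interface Batch {
  rows: Row[];
  nextCursor?: number;
}

// Return up to `take` rows whose id is strictly greater than the cursor,
// ordered by id. A real query would be an index seek, not a filter and sort.
function fetchBatch(table: readonly Row[], take: number, cursor?: number): Batch {
  const rows = table
    .filter((row) => cursor === undefined || row.id > cursor)
    .sort((a, b) => a.id - b.id)
    .slice(0, take);
  const last = rows[rows.length - 1];
  // A full batch may have more rows behind it; a short batch is the end.
  return { rows, nextCursor: last !== undefined && rows.length === take ? last.id : undefined };
}

// Drive the batch fetcher until no next cursor is returned.
function readAll(table: readonly Row[], take: number): Row[] {
  const out: Row[] = [];
  let cursor: number | undefined;
  do {
    const batch = fetchBatch(table, take, cursor);
    out.push(...batch.rows);
    cursor = batch.nextCursor;
  } while (cursor !== undefined);
  return out;
}
```

Because each batch resumes from the last key seen rather than from a row count, a row inserted or deleted behind the cursor cannot shift later batches, which is the stability property the ADR relies on.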
diff --git a/docs/decisions/README.md b/docs/decisions/README.md
@@ -0,0 +1,40 @@
+# Architecture Decision Records
+
+This directory holds Architecture Decision Records (ADRs) following the
+[MADR 4.0](https://adr.github.io/madr/) standard. Each ADR captures one
+architectural decision: the problem, the options considered, the chosen option,
+and its consequences (good _and_ bad).
+
+## Relationship to specs
+
+Design specs live in [`../superpowers/specs/`](../superpowers/specs/). A spec
+records only the **chosen** solution and links out to the relevant ADR here for
+the full rationale and the alternatives that were rejected. The decision lives in
+the ADR; the spec consumes its outcome.
+
+> Rule of thumb: if you're writing "we considered X but chose Y because…", that
+> belongs in an ADR, not the spec. The spec just states Y and links the ADR.
+
+## Conventions
+
+- **One decision per file**, named `NNNN-kebab-title.md` (zero-padded, e.g.
+  `0001-stream-report-exports-with-exceljs.md`). Numbers are sequential and never
+  reused.
+- **Start from the template:** copy [`TEMPLATE.md`](./TEMPLATE.md) (the
+  official MADR 4.0 full template). Keep its optional sections — including Pros
+  and Cons of the Options — so every decision records its alternatives and
+  rationale.
+- **Frontmatter** carries `status`, `date`, `decision-makers`, `consulted`,
+  `informed`. For a solo/small-team change, `status` + `date` is enough; leave
+  the rest blank.
+- **Status lifecycle:** `proposed` → `accepted` → (later) `deprecated` or
+  `superseded by ADR-NNNN`. A `rejected` option is recorded too if it was
+  seriously considered.
+- **Immutable once accepted:** don't rewrite an accepted ADR. If the decision
+  changes, write a new ADR that supersedes it and update the old one's status.
+
+## Index
+
+- [0001 — Stream report exports end-to-end with ExcelJS via a generic helper](./0001-stream-report-exports-with-exceljs.md) — accepted
+- [0002 — Read streaming exports with keyset cursor pagination on the primary key](./0002-keyset-cursor-pagination-on-primary-key.md) — accepted
+- [0003 — Nested per-report download routes and inline query-param tabs](./0003-report-export-routing-and-tabs.md) — accepted
diff --git a/docs/decisions/TEMPLATE.md b/docs/decisions/TEMPLATE.md
@@ -0,0 +1,85 @@
+---
+# These are optional metadata elements. Feel free to remove any of them.
+status: '{proposed | rejected | accepted | deprecated | … | superseded by ADR-0123}'
+date: { YYYY-MM-DD when the decision was last updated }
+decision-makers: { list everyone involved in the decision }
+consulted:
+  {
+    list everyone whose opinions are sought (typically subject-matter experts); and with whom there is a two-way communication,
+  }
+informed:
+  {
+    list everyone who is kept up-to-date on progress; and with whom there is a one-way communication,
+  }
+---
+
+# {short title, representative of solved problem and found solution}
+
+## Context and Problem Statement
+
+{Describe the context and problem statement, e.g., in free form using two to three sentences or in the form of an illustrative story. You may want to articulate the problem in form of a question and add links to collaboration boards or issue management systems.}
+
+<!-- This is an optional element. Feel free to remove. -->
+
+## Decision Drivers
+
+- {decision driver 1, e.g., a force, facing concern, …}
+- {decision driver 2, e.g., a force, facing concern, …}
+- … <!-- numbers of drivers can vary -->
+
+## Considered Options
+
+- {title of option 1}
+- {title of option 2}
+- {title of option 3}
+- … <!-- numbers of options can vary -->
+
+## Decision Outcome
+
+Chosen option: "{title of option 1}", because {justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.
+
+<!-- This is an optional element. Feel free to remove. -->
+
+### Consequences
+
+- Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
+- Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
+- … <!-- numbers of consequences can vary -->
+
+<!-- This is an optional element. Feel free to remove. -->
+
+### Confirmation
+
+{Describe how the implementation of/compliance with the ADR can/will be confirmed. Are the design that was decided for and its implementation in line with the decision made? E.g., a design/code review or a test with a library such as ArchUnit can help validate this. Note that although we classify this element as optional, it is included in many ADRs.}
+
+<!-- This is an optional element. Feel free to remove. -->
+
+## Pros and Cons of the Options
+
+### {title of option 1}
+
+<!-- This is an optional element. Feel free to remove. -->
+
+{example | description | pointer to more information | …}
+
+- Good, because {argument a}
+- Good, because {argument b}
+- Neutral, because {argument c}
+- Bad, because {argument d}
+- … <!-- numbers of pros and cons can vary -->
+
+### {title of other option}
+
+{example | description | pointer to more information | …}
+
+- Good, because {argument a}
+- Good, because {argument b}
+- Neutral, because {argument c}
+- Bad, because {argument d}
+- …
+
+<!-- This is an optional element. Feel free to remove. -->
+
+## More Information
+
+{You might want to provide additional evidence/confidence for the decision outcome here and/or document the team agreement on the decision and/or define when/how this decision should be realized and if/when it should be re-visited. Links to other decisions and resources might appear here as well.}