feat(api): playground messages dispatch + prompts.post accepts template_messages (Plan B) #12
Merged
Adds a structured rendered form alongside the existing rendered_prompt
text. Trace replay and re-dispatch read the structured column so the
message list round-trips byte-for-byte; the human-readable text stays
for the trace UI's existing display path.
Backfills old rows as [{role: human, content: <rendered_prompt>}] so
the column can be NOT NULL.
Companion to docs/superpowers/specs/2026-06-07-playground-messages-redesign-design.md.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
Sister to _render_template - applies {{ var }} substitution per message
body. Missing variables render as empty string (spec decision 9) so the
user can iterate without the renderer fighting them; non-strings
serialize via json.dumps for parity with the legacy single-string path.
Returns a fresh list; never mutates input.
Used by the next-task playground POST handler when the body carries
raw_messages.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
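A minimal sketch of the per-message rendering described in the commit above, not the PR's actual code. It assumes messages are plain dicts with `role` and `content` keys and uses a regular expression as a stand-in for the real template engine; `render_messages`, `coerce_var_value`, and `_VAR_PATTERN` are illustrative names.

```python
from __future__ import annotations

import json
import re
from typing import Any

# Hypothetical pattern: matches {{ name }} with optional inner whitespace.
_VAR_PATTERN = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def coerce_var_value(value: Any) -> str:
    """Strings pass through unchanged; anything else is serialized as JSON."""
    return value if isinstance(value, str) else json.dumps(value)


def render_messages(
    messages: list[dict[str, str]], variables: dict[str, Any]
) -> list[dict[str, str]]:
    """Return a new list with {{ var }} placeholders substituted per message."""

    def substitute(match: re.Match[str]) -> str:
        name = match.group(1)
        if name not in variables:
            return ""  # missing variables render as the empty string
        return coerce_var_value(variables[name])

    # Build fresh dicts so the caller's list and its messages are never mutated.
    return [
        {**message, "content": _VAR_PATTERN.sub(substitute, message["content"])}
        for message in messages
    ]
```

Calling `render_messages(template, {"persona": "terse"})` leaves `template` untouched, so the same template list can be re-rendered with different variables.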
The two surfaces use different role vocabularies (LangSmith: system /
human vs. provider: system / user / assistant / tool). One small mapper
bridges them; system passes through. Used in the next task by the
playground POST handler.
The role translation lives in a single dict so adding ai / tool support
later (spec decision 2 deferral) is a one-line change.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
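The role bridge above could look roughly like the following. This is a sketch under assumptions: messages are plain dicts, the names are illustrative, and raising `KeyError` for an unmapped role is this sketch's choice, since the commit does not say how unknown roles are handled.

```python
from __future__ import annotations

from types import MappingProxyType
from typing import Mapping

# Single lookup table: prompt-side role -> dispatch-side role.
# Supporting another role later would mean adding one entry here.
PROMPT_TO_DISPATCH_ROLE: Mapping[str, str] = MappingProxyType(
    {"system": "system", "human": "user"}
)


def to_dispatch_messages(messages: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return copies of the messages with roles translated for dispatch."""
    # Assumption: an unmapped role raises KeyError rather than passing through.
    return [
        {**message, "role": PROMPT_TO_DISPATCH_ROLE[message["role"]]}
        for message in messages
    ]
```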
Adds the structured request shape; the legacy raw_template field stays
for one release of back-compat. A strict xor validator across
prompt_version_id / raw_template / raw_messages enforces "exactly one
template source per request" - zero or more than one is now a 422
instead of a 400 from the handler. Empty raw_messages is rejected too so
the renderer never sees a zero-length list.
The handler-level "at least one" check is removed; the model validator
is the single source of truth for this contract.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
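To illustrate the "exactly one template source" contract, here is a standard-library sketch. It deliberately swaps the request model's validator for a dataclass `__post_init__` check so it runs without third-party packages; the class name and error wording are hypothetical, and in the real API the failure surfaces as a 422 rather than a bare `ValueError`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

_SOURCES = ("prompt_version_id", "raw_template", "raw_messages")


@dataclass
class PlaygroundCreateSketch:
    """Stand-in for the request model; field names follow the commit message."""

    prompt_version_id: Optional[str] = None
    raw_template: Optional[str] = None
    raw_messages: Optional[list[dict[str, str]]] = None

    def __post_init__(self) -> None:
        # An empty list counts as a rejected value, not as "absent".
        if self.raw_messages is not None and len(self.raw_messages) == 0:
            raise ValueError("raw_messages must not be empty")
        provided = [name for name in _SOURCES if getattr(self, name) is not None]
        # Separate messages for the zero case and the too-many case.
        if not provided:
            raise ValueError("exactly one template source is required; got none")
        if len(provided) > 1:
            raise ValueError(
                "exactly one template source is allowed; got: " + ", ".join(provided)
            )
```

Keeping the check on the model means a handler that receives a constructed instance can assume exactly one source is set.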
The handler now reads raw_messages (or template_messages from a saved
prompt, or wraps a legacy raw_template), Jinja-renders each turn's
content against the variables dict, persists rendered_messages jsonb
alongside the human-readable rendered_prompt, and dispatches the mapped
(human -> user) message list to the LiteLLM gateway verbatim.
The legacy raw_template path is preserved by wrapping it as a single
human message before render - no behavior change for old web clients.
The xor validator on PlaygroundCreate (Plan B Task 4) guarantees exactly
one template source per request, so the resolver is a straight-line
three-branch dispatch with no precedence rules to remember.
The old _resolve_template / _render_template single-string pipeline is
deleted - rendered_messages and rendered_prompt (newline-joined view)
are the canonical pair from here on.
The integration test deferred from this commit is a wire-level e2e
against the local docker stack; the unit tests on _resolve_messages,
_render_messages, _to_dispatch_messages, and the xor validator cover the
request-path logic without depending on container health.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
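The three-branch resolver and the newline-joined text view described above might be sketched as follows. The function names and keyword arguments are hypothetical, the saved prompt's messages are passed in directly instead of being loaded by id, and joining only the `content` fields is an assumption about what "newline-joined view" means.

```python
from __future__ import annotations

from typing import Optional

Message = dict[str, str]


def resolve_messages(
    *,
    raw_messages: Optional[list[Message]] = None,
    saved_template_messages: Optional[list[Message]] = None,
    raw_template: Optional[str] = None,
) -> list[Message]:
    """Pick the single template source; upstream validation ensures only one is set."""
    if raw_messages is not None:
        return [dict(m) for m in raw_messages]
    if saved_template_messages is not None:
        return [dict(m) for m in saved_template_messages]
    if raw_template is not None:
        # Legacy path: wrap the single string as one human message.
        return [{"role": "human", "content": raw_template}]
    raise ValueError("no template source provided")


def rendered_prompt_view(rendered_messages: list[Message]) -> str:
    """Human-readable text derived from the structured list (assumed: contents only)."""
    return "\n".join(m["content"] for m in rendered_messages)
```

Because exactly one source is guaranteed, the order of the branches does not encode any precedence.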
The structured request shape (template_messages: list[Message]) is the
preferred field; the legacy 'template: str' stays for one release of
back-compat and wraps to a single human message internally. A strict xor
validator across the two fields enforces "exactly one source per
request" - zero or both is now a 422.
The handler now writes both columns (template_messages jsonb + template
text, derived) so the row satisfies migration 0026's NOT NULL constraint
on the new column. The latent gap where create_version omitted
template_messages is closed as a side effect.
Adds the no-op short-circuit: if the new messages match the most recent
version byte-for-byte (compared via model_dump), return that existing
row with HTTP 200 instead of creating a duplicate. Saves a row per
accidental save and matches the spec's rule.
Signed-off-by: gaurav0107 <gauravdubey0107@gmail.com>
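The no-op short-circuit can be pictured with an in-memory list standing in for the versions table. This is only a sketch: `save_version` and the row shape are hypothetical, plain dicts stand in for the dumped message models, and the boolean return value replaces the HTTP status the real handler would choose.

```python
from __future__ import annotations

Message = dict[str, str]


def save_version(versions: list[dict], new_messages: list[Message]) -> tuple[dict, bool]:
    """Append a new version row unless it equals the latest one.

    Returns (row, created). created is False when the latest row was reused.
    """
    if versions and versions[-1]["template_messages"] == new_messages:
        # Identical to the most recent version: reuse it, write nothing.
        return versions[-1], False
    row = {
        "version": len(versions) + 1,
        "template_messages": [dict(m) for m in new_messages],
    }
    versions.append(row)
    return row, True
```

Only the most recent version is compared, so saving an older variant again still creates a new row.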
Summary
Plan B of the playground messages redesign. The api can now accept,
render, persist, and dispatch a list of typed messages. Prompts can be
saved as the structured shape too.
Both legacy fields (`raw_template` on POST /v1/playground/runs,
`template` on POST /v1/prompts/{id}/versions) keep working for one
release of back-compat; cleanup PR after Plan C lands.
What changed
- `playground_session.rendered_messages jsonb NOT NULL` with non-empty array CHECK; backfill old rows as `[{role: human, content: <rendered_prompt>}]`.
- `_render_messages`: sister to the now-removed `_render_template`; per-message `{{ var }}` substitution, missing vars render as `""` per spec decision 9, returns a fresh list. Shares the `_coerce_var_value` helper for json/string coercion.
- `_to_dispatch_messages`: bridges prompt-side `human` to dispatch-side `user`; `_PROMPT_TO_DISPATCH_ROLE` is the single source of truth (typed `Mapping[Literal, Literal]`).
- `PlaygroundCreate.raw_messages`: new structured field with a strict xor validator across `prompt_version_id` / `raw_template` / `raw_messages`. Empty list rejected via `Field(min_length=1)`. Zero-vs-many error messages split.
- Playground handler: persists `rendered_messages jsonb` AND `rendered_prompt` (newline-joined view), then dispatches via the role-mapped list. Old `_resolve_template` and `_render_template` deleted.
- `PromptVersionCreate.template_messages`: same xor + `to_messages()` resolver. Handler writes BOTH `template` (legacy, derived) AND `template_messages` jsonb so rows satisfy migration 0026's NOT NULL constraint. Latent gap closed as a side effect.
- `_derive_legacy_template` helper: single source of truth for the `template` text derivation rule, called by both create-write and read-hydrate paths so the response shape can never diverge.
Files
- `schemas/postgres/migrations/0027_playground_session_rendered_messages.sql`
- `services/api/tracebility_api/routers/playground.py`: render helpers, role mapping, request model, handler
- `services/api/tracebility_api/routers/prompts.py`: request model, no-op short-circuit, derived-template helper
- `test_playground_render_messages.py`, `test_playground_role_mapping.py`, `test_playground_create_validation.py`, `test_playground_resolve_messages.py`, `test_prompt_version_create_validation.py`
Test plan
- `uv run pytest services/api/tests/unit`: 73 passed (60 prior + 13 new)
- `uv run pytest services/api/tests/integration/test_prompts_template_messages.py`: clean
Deferred
- Wire-level e2e tests (`test_playground_messages_e2e.py`, `test_prompts_post_template_messages.py`): these depend on container health + dev-login auth and are brittle in CI. The comparison contract is pinned by unit tests; the e2e tests can land in a follow-up if the unit coverage proves insufficient.
Companion docs
- `docs/superpowers/specs/2026-06-07-playground-messages-redesign-design.md`
- `docs/superpowers/plans/2026-06-07-playground-B-api.md` (gitignored)

🤖 Generated with Claude Code