feat(telemetry): read indexed/array/tool-call trace content + fix reply-drop #16
Merged
…ly-drop
The content reader (`extractContent`/`hasContent`/`resolveDeclaredIntent`) matched
a flat table of COARSE alias keys, so the INDEXED / ARRAY / nested shapes every
push-OTLP provider emits were invisible — `extractContent({"gen_ai.prompt.0.content":"hi"})`
returned `{}`, blanking all downstream analysis for the push population
(OpenInference: Phoenix/LangGraph/CrewAI; OTel-GenAI: LiteLLM/OpenAI-Agents/Pydantic/
Vercel/OpenLLMetry).
Add `normalizeContentAttributes` — a pure, non-destructive pre-pass that
reconstructs indexed/array/tool-call keys into the canonical aliases the reader
already understands, run inside every read path so a new provider's flattening is
learned in ONE place. Input prompt → `llm.input_messages` array (`messages`);
output reply → `gen_ai.completion` string (the SEPARATE `completion` field);
tool calls → `tool.args`/`tool.name`. Returns the same reference on the
metadata-only hot path (single key scan, no allocation).
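
The pre-pass described above can be sketched roughly as follows. This is a minimal illustration, not the shipped `normalizeContentAttributes`: the function name `normalizeSketch`, the regex, and the choice to join multiple completion parts with a newline are all assumptions, and only the `gen_ai.prompt.{i}.*` / `gen_ai.completion.{i}.*` dialect is handled.

```typescript
type Attributes = Record<string, unknown>;
type Message = { role?: string; content?: string };

// Assumed key shape for this sketch only: gen_ai.{prompt|completion}.{i}.{role|content}
const INDEXED_KEY = /^gen_ai\.(prompt|completion)\.(\d+)\.(role|content)$/;

// Hypothetical stand-in for the real pre-pass.
function normalizeSketch(attrs: Attributes): Attributes {
  let prompt: Message[] | undefined;
  let completion: Message[] | undefined;

  // Single scan over the keys; nothing is built unless an indexed key matches.
  for (const key in attrs) {
    const value = attrs[key];
    const m = INDEXED_KEY.exec(key);
    if (m === null || typeof value !== "string") continue;

    let list: Message[];
    if (m[1] === "prompt") {
      list = prompt ?? (prompt = []);
    } else {
      list = completion ?? (completion = []);
    }
    const index = Number(m[2]);
    const message = list[index] ?? (list[index] = {});
    if (m[3] === "role") message.role = value;
    else message.content = value;
  }

  // Metadata-only input: hand back the same reference.
  if (prompt === undefined && completion === undefined) return attrs;

  // Non-destructive: copy, keep the original keys, never overwrite an existing alias.
  const out: Attributes = { ...attrs };
  if (prompt !== undefined && out["llm.input_messages"] === undefined) {
    out["llm.input_messages"] = prompt.filter(() => true); // drops holes left by sparse indices
  }
  if (completion !== undefined && out["gen_ai.completion"] === undefined) {
    out["gen_ai.completion"] = completion
      .filter(() => true)
      .map((msg) => msg.content ?? "")
      .join("\n"); // joining with "\n" is an assumption of this sketch
  }
  return out;
}
```

With this shape, an input carrying only metadata keys comes back as the identical object, while an input with indexed prompt/completion keys comes back as a copy that additionally carries the canonical aliases.
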
Also fixes a latent reply-drop: both message arrays aliased the single `messages`
field (resolved once), so with a prompt present the reply was silently lost.
`llm.output_messages` is no longer a `messages` alias — the reply is reconstructed
into `completion`, so a full turn keeps BOTH prompt and reply.
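
To make the reply-drop concrete, here is a small sketch of a first-match-wins alias resolver. The alias tables and `resolveFields` are hypothetical and reduced to two fields; the real table in agent-core is not shown in this PR. The last step assumes the pre-pass has already placed the reply under `gen_ai.completion`.

```typescript
type Attributes = Record<string, unknown>;
type AliasTable = Record<string, string[]>;

// Hypothetical "before" table: both message arrays alias the single `messages` field.
const ALIASES_BEFORE: AliasTable = {
  messages: ["llm.input_messages", "llm.output_messages"],
  completion: ["gen_ai.completion"],
};

// Hypothetical "after" table: `llm.output_messages` is no longer a `messages` alias.
const ALIASES_AFTER: AliasTable = {
  messages: ["llm.input_messages"],
  completion: ["gen_ai.completion"],
};

// Each field is resolved once: the first alias present wins, later aliases are ignored.
function resolveFields(attrs: Attributes, aliases: AliasTable): Attributes {
  const out: Attributes = {};
  for (const field of Object.keys(aliases)) {
    const hit = aliases[field].find((key) => attrs[key] !== undefined);
    if (hit !== undefined) out[field] = attrs[hit];
  }
  return out;
}

const rawTurn: Attributes = {
  "llm.input_messages": [{ role: "user", content: "hi" }],
  "llm.output_messages": [{ role: "assistant", content: "hello" }],
};

// Before: the prompt claims `messages`, so the assistant reply never surfaces.
const before = resolveFields(rawTurn, ALIASES_BEFORE);

// After: stand-in for the pre-pass output, with the reply carried in `gen_ai.completion`.
const normalizedTurn: Attributes = { ...rawTurn, "gen_ai.completion": "hello" };
const after = resolveFields(normalizedTurn, ALIASES_AFTER);
```

In this sketch `before` contains only `messages` (the prompt), while `after` contains both `messages` and `completion`.
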
New exports: `normalizeContentAttributes`, `INDEXED_CONTENT_KEY_LIKE_PATTERNS`
(SQL LIKE patterns so a DB-side content check reuses the reader's vocabulary).
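
As a rough illustration of sharing one vocabulary between the reader and a DB-side check, the sketch below applies SQL `LIKE` semantics to attribute keys in TypeScript. The pattern values and helper names are hypothetical; the actual contents of `INDEXED_CONTENT_KEY_LIKE_PATTERNS` are not shown in this PR. Note that in `LIKE`, `_` matches any single character, so an unescaped key such as `llm.input_messages` is slightly more permissive than it looks.

```typescript
// Hypothetical pattern values, for illustration only.
const LIKE_PATTERNS_SKETCH: string[] = [
  "gen_ai.prompt.%.content",
  "llm.input_messages.%.message.content",
];

// Translate a LIKE pattern to a RegExp: `%` = any run of characters, `_` = any single character.
function likeToRegExp(pattern: string): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "%") source += "[\\s\\S]*";
    else if (ch === "_") source += "[\\s\\S]";
    else source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters
  }
  return new RegExp("^" + source + "$");
}

// True when the key matches at least one of the sketch patterns.
function matchesAnyPattern(key: string): boolean {
  return LIKE_PATTERNS_SKETCH.some((p) => likeToRegExp(p).test(key));
}
```

The same pattern strings could be passed to a `LIKE` clause on the database side, which is the reuse the export is described as enabling.
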
+12 tests; 414 green; check-types clean.
Problem
agent-core's content reader (`extractContent`/`hasContent`/`resolveDeclaredIntent`) matched a flat table of coarse alias keys. Every serious push-OTLP provider emits a turn's prompt / messages / tool-args under indexed / array / nested keys the flat table can't match, so `extractContent({"gen_ai.prompt.0.content":"hi"})` returned `{}`, blanking every downstream analysis for the entire push population (OpenInference: Phoenix/LangGraph/CrewAI/LlamaIndex; OTel-GenAI: LiteLLM/OpenAI-Agents/Pydantic/Vercel/OpenLLMetry).

Separately, a latent reply-drop: both `llm.input_messages` and `llm.output_messages` aliased the single `messages` field (resolved once), so with a prompt present the assistant reply was silently lost.

Solution
`normalizeContentAttributes` is a pure, non-destructive pre-pass that reconstructs indexed/array/tool-call keys into the canonical aliases the reader already understands, run inside every read path so a new provider's flattening is learned in ONE place:

- `llm.input_messages.{i}.message.content`, nested `…tool_calls.{j}.tool_call.function.arguments`
- `gen_ai.prompt.{i}.content` / `gen_ai.completion.{i}.content`
- `gen_ai.input.messages` / `gen_ai.output.messages`
- `tool_call.function.arguments`

Input prompt → `llm.input_messages` array (`messages`); output reply → `gen_ai.completion` string (the separate `completion` field) so a full turn keeps both; tool calls → `tool.args`/`tool.name`. Returns the same reference on the metadata-only hot path (single key scan, no allocation). `llm.output_messages` is no longer a `messages` alias: the reply lives in `completion`.

New exports: `normalizeContentAttributes`, `INDEXED_CONTENT_KEY_LIKE_PATTERNS` (SQL `LIKE` patterns so a DB-side "content seen" check reuses the reader's vocabulary; used by intelligence-api's onboarding gate).

Why here (not a downstream shim)
`extractContent` is the one reader every consumer uses (eval substrate, coding-agent telemetry, intelligence ingest). Fixing it here means every consumer reads indexed content by construction, instead of each re-implementing a normalizer. This replaces the interim shim shipped in agent-dev-container (`content-normalize.ts`), which will be deleted and re-pointed at this once published.

Verification
+12 tests (dialect coverage + the input/output reply-drop regression); 414 passing; `check-types` clean. Changeset: minor.