feat(telemetry): read indexed/array/tool-call trace content + fix reply-drop#16

Merged: drewstone merged 1 commit into main from feat/indexed-content-reader on Jul 1, 2026
@drewstone (Contributor)

Problem

agent-core's content reader (extractContent / hasContent / resolveDeclaredIntent) matched a flat table of coarse alias keys. The major push-OTLP providers emit a turn's prompt, messages, and tool args under indexed, array, or nested keys that the flat table cannot match. As a result, extractContent({"gen_ai.prompt.0.content":"hi"}) returned {}, blanking every downstream analysis for the entire push population (OpenInference: Phoenix/LangGraph/CrewAI/LlamaIndex; OTel-GenAI: LiteLLM/OpenAI-Agents/Pydantic/Vercel/OpenLLMetry).

Separately, there was a latent reply-drop bug: both llm.input_messages and llm.output_messages aliased the single messages field, which is resolved once, so whenever a prompt was present the assistant reply was silently lost.
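
To make the reply-drop concrete, here is a minimal sketch of a "first alias wins" resolver. The tables `BEFORE` and `AFTER` and the helper `resolveFields` are hypothetical stand-ins, not agent-core's actual alias table or reader; they only mirror the shape described above.

```typescript
type Attrs = Record<string, unknown>;
type AliasTable = Record<string, string[]>;

// Hypothetical tables for illustration; the real alias table is larger.
const BEFORE: AliasTable = {
  messages: ["llm.input_messages", "llm.output_messages"],
  completion: ["gen_ai.completion"],
};
const AFTER: AliasTable = {
  messages: ["llm.input_messages"],
  completion: ["gen_ai.completion"],
};

// Each field is resolved once: the first alias that has a value wins.
function resolveFields(attrs: Attrs, table: AliasTable): Attrs {
  const out: Attrs = {};
  for (const field of Object.keys(table)) {
    for (const alias of table[field] ?? []) {
      if (attrs[alias] !== undefined) {
        out[field] = attrs[alias];
        break;
      }
    }
  }
  return out;
}

const prompt = [{ role: "user", content: "hi" }];
const reply = [{ role: "assistant", content: "hello" }];
const turn: Attrs = {
  "llm.input_messages": prompt,
  "llm.output_messages": reply,
};

// With both arrays aliased to `messages`, only the prompt is read.
const before = resolveFields(turn, BEFORE);

// Assumed post-fix state: a pre-pass has already written the reply
// text under gen_ai.completion, so it resolves into its own field.
const normalizedTurn: Attrs = { ...turn, "gen_ai.completion": "hello" };
const after = resolveFields(normalizedTurn, AFTER);
```

In this sketch `before` carries the prompt under `messages` and nothing else, while `after` carries the prompt under `messages` and the reply under `completion`.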

Solution

normalizeContentAttributes — a pure, non-destructive pre-pass that reconstructs indexed/array/tool-call keys into the canonical aliases the reader already understands, run inside every read path so a new provider's flattening is learned in ONE place:

  • OpenInference llm.input_messages.{i}.message.content, nested …tool_calls.{j}.tool_call.function.arguments
  • OTel-GenAI flattened gen_ai.prompt.{i}.content / gen_ai.completion.{i}.content
  • OTel-GenAI v1.28+ arrays gen_ai.input.messages / gen_ai.output.messages
  • bare tool_call.function.arguments

Input prompt → llm.input_messages array (messages); output reply → gen_ai.completion string (the separate completion field) so a full turn keeps both; tool calls → tool.args/tool.name. Returns the same reference on the metadata-only hot path (single key scan, no allocation).
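
The pre-pass described above can be sketched for one dialect. This is an illustrative stand-in, not the shipped normalizeContentAttributes: `normalizeSketch`, the `INDEXED` regex, and the rule of never overwriting an existing canonical key are assumptions, and only the OTel-GenAI flattened keys are handled.

```typescript
type Attrs = Record<string, unknown>;
type Message = { role?: string; content?: string };

// Assumed key shape for the flattened dialect: gen_ai.<side>.<index>.<field>
const INDEXED = /^gen_ai\.(prompt|completion)\.(\d+)\.(role|content)$/;

// Hypothetical stand-in for normalizeContentAttributes (one dialect only).
function normalizeSketch(attrs: Attrs): Attrs {
  // Hot path: a single key scan; metadata-only input is returned as-is.
  let hasIndexed = false;
  for (const key in attrs) {
    if (INDEXED.test(key)) {
      hasIndexed = true;
      break;
    }
  }
  if (!hasIndexed) return attrs;

  // Rebuild per-index messages for each side of the turn.
  const prompt: Message[] = [];
  const reply: Message[] = [];
  for (const key of Object.keys(attrs)) {
    const match = INDEXED.exec(key);
    const value = attrs[key];
    if (match === null || typeof value !== "string") continue;
    const side = match[1] === "prompt" ? prompt : reply;
    const index = Number(match[2]);
    let message = side[index];
    if (message === undefined) {
      message = {};
      side[index] = message;
    }
    if (match[3] === "role") message.role = value;
    else message.content = value;
  }

  // Non-destructive: copy, keep the original keys, and (by assumption)
  // never overwrite a canonical key that is already present.
  const out: Attrs = { ...attrs };
  if (prompt.length > 0 && out["llm.input_messages"] === undefined) {
    // filter() skips holes left by non-contiguous indices.
    out["llm.input_messages"] = prompt.filter((m) => m !== undefined);
  }
  const replyText = reply
    .filter((m) => m !== undefined && m.content !== undefined)
    .map((m) => m.content)
    .join("\n");
  if (replyText !== "" && out["gen_ai.completion"] === undefined) {
    out["gen_ai.completion"] = replyText;
  }
  return out;
}
```

For clarity the sketch scans the keys a second time once indexed content is found; only the metadata-only path is a single scan with no allocation.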

llm.output_messages is no longer a messages alias — the reply lives in completion.

New exports: normalizeContentAttributes, INDEXED_CONTENT_KEY_LIKE_PATTERNS (SQL LIKE patterns so a DB-side "content seen" check reuses the reader's vocabulary — used by intelligence-api's onboarding gate).
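
As a rough illustration of how LIKE patterns let a database check and an in-process check share one vocabulary, consider the sketch below. The pattern values, `buildLikePredicate`, `likeToRegExp`, and the `attr_key` column name used in the usage note are all hypothetical; the actual contents of INDEXED_CONTENT_KEY_LIKE_PATTERNS are not shown in this PR.

```typescript
// Hypothetical patterns for illustration; the exported constant may differ.
const LIKE_PATTERNS_SKETCH: string[] = [
  "llm.input_messages.%.message.content",
  "gen_ai.prompt.%.content",
  "gen_ai.completion.%.content",
];

// Build a parameterized predicate ($1, $2, ... placeholders) so the
// patterns are passed as bind parameters rather than inlined into SQL.
function buildLikePredicate(
  column: string,
  patterns: string[],
): { sql: string; params: string[] } {
  const clauses = patterns.map((_, i) => `${column} LIKE $${i + 1}`);
  return { sql: `(${clauses.join(" OR ")})`, params: patterns.slice() };
}

// In-process equivalent of LIKE: % matches any run of characters,
// _ matches exactly one character, everything else is literal.
function likeToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "%") return "[\\s\\S]*";
      if (ch === "_") return "[\\s\\S]";
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp(`^${source}$`);
}
```

Calling `buildLikePredicate("attr_key", LIKE_PATTERNS_SKETCH)` yields a three-clause OR predicate with the patterns as parameters. One caveat with this approach: `_` is itself a LIKE wildcard, so a pattern containing `gen_ai` also matches keys with any single character in that position unless the underscore is escaped.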

Why here (not a downstream shim)

extractContent is the one reader every consumer uses (eval substrate, coding-agent telemetry, intelligence ingest). Fixing it here means every consumer reads indexed content by construction, instead of each re-implementing a normalizer. This replaces the interim shim shipped in agent-dev-container (content-normalize.ts); once this is published, the shim will be deleted and its callers re-pointed at this reader.

Verification

+12 tests (dialect coverage + the input/output reply-drop regression); 414 passing; check-types clean. Changeset: minor.

…ly-drop

The content reader (`extractContent`/`hasContent`/`resolveDeclaredIntent`) matched
a flat table of COARSE alias keys, so the INDEXED / ARRAY / nested shapes every
push-OTLP provider emits were invisible — `extractContent({"gen_ai.prompt.0.content":"hi"})`
returned `{}`, blanking all downstream analysis for the push population
(OpenInference: Phoenix/LangGraph/CrewAI; OTel-GenAI: LiteLLM/OpenAI-Agents/Pydantic/
Vercel/OpenLLMetry).

Add `normalizeContentAttributes` — a pure, non-destructive pre-pass that
reconstructs indexed/array/tool-call keys into the canonical aliases the reader
already understands, run inside every read path so a new provider's flattening is
learned in ONE place. Input prompt → `llm.input_messages` array (`messages`);
output reply → `gen_ai.completion` string (the SEPARATE `completion` field);
tool calls → `tool.args`/`tool.name`. Returns the same reference on the
metadata-only hot path (single key scan, no allocation).

Also fixes a latent reply-drop: both message arrays aliased the single `messages`
field (resolved once), so with a prompt present the reply was silently lost.
`llm.output_messages` is no longer a `messages` alias — the reply is reconstructed
into `completion`, so a full turn keeps BOTH prompt and reply.

New exports: `normalizeContentAttributes`, `INDEXED_CONTENT_KEY_LIKE_PATTERNS`
(SQL LIKE patterns so a DB-side content check reuses the reader's vocabulary).
+12 tests; 414 green; check-types clean.
@drewstone drewstone merged commit 3a8f557 into main Jul 1, 2026
1 check passed
@drewstone drewstone deleted the feat/indexed-content-reader branch July 1, 2026 14:57