
feat: add embedding tracer support for Bedrock, LiteLLM, OpenAI#631

Open
viniciusdsmello wants to merge 17 commits into main from vini/open-10480-embedding-tracer

Conversation

@viniciusdsmello
Contributor

Summary

Adds native embedding tracing support across the Python SDK so that embedding API calls (Titan via Bedrock, litellm.embedding, OpenAI.embeddings.create) generate proper traces in Openlayer with correct model, tokens, dimensions, and output.

  • Data model: new StepType.EMBEDDING + add_embedding_step_to_trace helper (src/openlayer/lib/tracing/).
  • Bedrock: detects "embed" in modelId and routes to a dedicated handler with parsers for Titan v1/v2 and Cohere v3 (single + batch). Existing chat path is untouched and locked in by a backfilled regression test.
  • LiteLLM: patches litellm.embedding alongside the existing litellm.completion patch. Reuses detect_provider_from_response, extract_usage_from_response, and extract_litellm_metadata.
  • OpenAI: patches client.embeddings.create for both sync (trace_openai) and async (trace_async_openai) clients via a small shared helper module (_openai_embedding_common.py).
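As a rough illustration of what this enables, an embedding step's payload might carry fields like the following (hypothetical shape; the values match the locally verified trace described in the review below, but the exact schema is defined by the new step type):

```python
# Hypothetical example of the metadata a traced embedding call records,
# per the fields this PR describes (model, tokens, dimensions, provider).
example_embedding_step = {
    "type": "embedding",
    "provider": "OpenAI",
    "model": "text-embedding-3-small",
    "promptTokens": 10,
    "tokens": 10,
    "embeddingDimensions": 1536,
    "embeddingCount": 1,
}
```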

Linear

OPEN-10480

Verification

  • 34 new unit tests covering all four integration paths (single input, batch, failure isolation, body replay, regression).
  • Full local test suite green (448 tests).
  • ruff check clean on all touched files.
  • pyright clean on all touched source files.

Test plan

  • CI passes on this branch.
  • Manual smoke test with a real Titan embedding call (amazon.titan-embed-text-v2:0) — confirm trace appears with model name and prompt tokens populated.
  • Manual smoke test with litellm.embedding(model="text-embedding-3-small", input="x").
  • Confirm with the ingestion / UI team that step_type=embedding is rendered correctly (out of scope for this PR but required for end-to-end value).

Out of scope

  • Mistral, Gemini, OCI, Portkey embedding tracers — follow-ups using the same pattern.
  • Backend / UI changes to render the new step type.

🤖 Generated with Claude Code

viniciusdsmello and others added 17 commits April 28, 2026 12:47
Used by superpowers workflows to host isolated git worktrees during
implementation, never meant to be tracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…EN-10480)

…480)

…N-10480)

…OPEN-10480)

…gression (OPEN-10480)

Adds the same file-level pragma already used by test_portkey_integration.py
to suppress reportUnknown* and reportMissingParameterType — these come from
openlayer.lib.integrations being in pyright's ignore list, which causes
imports from there to be typed as Unknown.

Per-line pyright ignores added on direct imports of botocore.response and
openai, which are not present in the lint job's environment.

Code review findings addressed:
- Move per-call imports of _openai_embedding_common to module-level (was in
  hot path of every embedding call).
- Extract build_embedding_step_kwargs into _openai_embedding_common so that
  sync and async OpenAI handlers each become ~10 lines instead of ~50, and
  LiteLLM reuses the same kwargs assembly.
- Drop LiteLLM's local _parse_embedding_response and
  _get_embedding_model_parameters; both now delegate to the shared helpers
  (LiteLLM-specific timeout/api_base/api_version/cost/metadata are layered
  on top of the common kwargs).
- Type Bedrock _parse_embedding_output return as
  Tuple[Union[List[float], List[List[float]]], int, int] instead of bare
  tuple.

Net: -34 lines across the 5 touched source files. Tests unchanged, all
77 embedding tests + 448 lib tests still green.

@viniciusdsmello
Contributor Author

Code review

Found 4 issues. Issue #1 is a release blocker — verified by running an embedding call against the PR branch: the trace was silently dropped, with the integration handler logging `Failed to trace the OpenAI embedding request with Openlayer. <StepType.EMBEDDING: 'embedding'>`.

  1. CRITICAL — `StepType.EMBEDDING` is added to the enum but never registered in `step_factory`, and there is no `EmbeddingStep` class. Every call into `add_embedding_step_to_trace` raises `KeyError(<StepType.EMBEDDING: 'embedding'>)` from `step_type_mapping[step_type]`, which the integration handlers swallow via `except Exception as e: logger.error(...)`. Embedding traces never reach the platform. The unit tests don't catch this because they all mock `tracer.add_embedding_step_to_trace`, bypassing `create_step`/`step_factory` entirely (see e.g. tests/test_openai_embedding_integration.py:28). Fix: add an `EmbeddingStep(Step)` class with typed fields (provider, prompt_tokens, tokens, cost, model, model_parameters, raw_output, embedding_dimensions, embedding_count) and a `to_dict()` override mirroring `ChatCompletionStep` (lines 169–206 of the same file), then register it in the mapping. Without an `EmbeddingStep`, even the embedding-specific fields would be dropped by `Step.log()` because the base class's `hasattr(self, key)` check returns False.

https://github.com/openlayer-ai/openlayer-python/blob/5c4765b8dbc78a3929a41b04ecbb41e666e56dd3/src/openlayer/lib/tracing/steps.py#L355-L367
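A minimal sketch of the missing class and registration, using a simplified stand-in for the SDK's `Step` base class (field names are the ones listed above; the real base class and mapping live in `src/openlayer/lib/tracing/steps.py`):

```python
from enum import Enum
from typing import Any, Dict, Optional


class StepType(Enum):
    EMBEDDING = "embedding"


class Step:
    """Simplified stand-in for the SDK's Step base class."""

    def __init__(self, name: str, **kwargs: Any) -> None:
        self.name = name
        self.step_type: Optional[StepType] = None

    def to_dict(self) -> Dict[str, Any]:
        return {"name": self.name, "type": self.step_type.value if self.step_type else None}


class EmbeddingStep(Step):
    """Embedding step carrying the typed fields the review lists."""

    def __init__(self, name: str, **kwargs: Any) -> None:
        super().__init__(name, **kwargs)
        self.step_type = StepType.EMBEDDING
        self.provider: Optional[str] = kwargs.get("provider")
        self.prompt_tokens: Optional[int] = kwargs.get("prompt_tokens")
        self.tokens: Optional[int] = kwargs.get("tokens")
        self.cost: Optional[float] = kwargs.get("cost")
        self.model: Optional[str] = kwargs.get("model")
        self.model_parameters: Optional[Dict[str, Any]] = kwargs.get("model_parameters")
        self.raw_output: Optional[str] = kwargs.get("raw_output")
        self.embedding_dimensions: Optional[int] = kwargs.get("embedding_dimensions")
        self.embedding_count: Optional[int] = kwargs.get("embedding_count")

    def to_dict(self) -> Dict[str, Any]:
        # Mirror ChatCompletionStep's serialization of step-specific fields.
        d = super().to_dict()
        d.update(
            {
                "provider": self.provider,
                "promptTokens": self.prompt_tokens,
                "tokens": self.tokens,
                "model": self.model,
                "embeddingDimensions": self.embedding_dimensions,
                "embeddingCount": self.embedding_count,
            }
        )
        return d


# The registration step_factory needs; without it,
# step_type_mapping[StepType.EMBEDDING] raises KeyError.
step_type_mapping: Dict[StepType, type] = {StepType.EMBEDDING: EmbeddingStep}
```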

  2. `inference_id=None` overwrites the auto-generated UUID with the literal string "None". `build_embedding_step_kwargs` unconditionally puts `"id": inference_id` in the returned dict; `Step.log()` then runs `setattr(self, 'id', None)` (since `hasattr(step, 'id')` is true — it was just set to a fresh UUID by `Step.__init__`). `to_dict` later serializes `str(self.id) == "None"`. Empirically verified after applying a local fix for issue #1: the captured trace contained `"id": "None"` for the embedding step while the parent step had a proper UUID. The chat path guards against this — see `create_trace_args` at openai_tracer.py:364-365 (`if id: trace_args["id"] = id`). Apply the same guard here.

```python
    if hasattr(response, "model_dump")
    else str(response)
),
"provider": provider,
"id": inference_id,
"metadata": {"provider": provider},
}
```

  3. Azure OpenAI embedding calls are mislabeled as `provider="OpenAI"`. The chat/parse/responses patches all detect `is_azure_openai = isinstance(client, openai.AzureOpenAI)` at openai_tracer.py:73 and thread it through to the handlers, which switch on it to set `provider="Azure"`. The embeddings patch at openai_tracer.py:162-169 does not pass `is_azure_openai` to `handle_embedding`, and `handle_embedding` hard-codes `provider="OpenAI"`. Same in `async_openai_tracer.py`. An Azure customer's embeddings will be attributed to the wrong provider in the platform.

The hard-coded provider in the sync handler:

```python
    tracer.add_embedding_step_to_trace(
        **build_embedding_step_kwargs(
            response,
            kwargs,
            start_time,
            end_time,
            name="OpenAI Embedding",
            provider="OpenAI",
            inference_id=inference_id,
        )
    )
except Exception as e:
    logger.error(
        "Failed to trace the OpenAI embedding request with Openlayer. %s", e
    )
return response
```

And the identical block in the async handler:

```python
    tracer.add_embedding_step_to_trace(
        **build_embedding_step_kwargs(
            response,
            kwargs,
            start_time,
            end_time,
            name="OpenAI Embedding",
            provider="OpenAI",
            inference_id=inference_id,
        )
    )
except Exception as e:
    logger.error(
        "Failed to trace the OpenAI embedding request with Openlayer. %s", e
    )
return response
```
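Threading the flag through could look like this (a sketch; `resolve_openai_provider` and this `handle_embedding` signature are illustrative, not the SDK's actual code — the real patch would detect `isinstance(client, openai.AzureOpenAI)` at patch time and pass the result in):

```python
from typing import Any, Dict


def resolve_openai_provider(is_azure_openai: bool) -> str:
    # Same switch the chat/parse/responses handlers already use.
    return "Azure" if is_azure_openai else "OpenAI"


def handle_embedding(response: Any, *, is_azure_openai: bool = False) -> Dict[str, Any]:
    """Simplified handler: accept the flag instead of hard-coding the provider."""
    return {
        "provider": resolve_openai_provider(is_azure_openai),
        "response": response,
    }
```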

  4. LiteLLM embedding traces will record `provider="openai"` (lowercase) instead of `"OpenAI"`. `detect_provider_from_response` returns `litellm.get_llm_provider(model_name)[1]` unmodified, which is the lowercase form (e.g. `"openai"`, `"cohere"`). The backend expects the canonical capitalized names (`Anthropic`, `Azure`, `Cohere`, `OpenAI`, `Google`, `Mistral`, `Groq`, `Bedrock`). The completion path at line 273 has the same issue (pre-existing); this PR extends it to embeddings. Worth normalizing in `detect_provider_from_response` to fix both at once.

```python
try:
    model_name = kwargs.get("model", getattr(response, "model", "unknown"))
    provider = detect_provider_from_response(response, model_name)
    extra_metadata = extract_litellm_metadata(response, model_name)
    usage_data = extract_usage_from_response(response)
    step_kwargs = build_embedding_step_kwargs(
        response,
        kwargs,
        start_time,
        end_time,
        name="LiteLLM Embedding",
        provider=provider,
        inference_id=inference_id,
    )
```
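One way to normalize inside `detect_provider_from_response`, assuming the canonical names listed above (the mapping table below is illustrative, not the SDK's actual one):

```python
# Illustrative mapping from litellm's lowercase provider names to the
# capitalized names the backend expects.
_CANONICAL_PROVIDERS = {
    "anthropic": "Anthropic",
    "azure": "Azure",
    "cohere": "Cohere",
    "openai": "OpenAI",
    "google": "Google",
    "mistral": "Mistral",
    "groq": "Groq",
    "bedrock": "Bedrock",
}


def normalize_provider(raw: str) -> str:
    """Map a lowercase provider name to its canonical capitalized form;
    pass unknown providers through unchanged."""
    return _CANONICAL_PROVIDERS.get(raw.lower(), raw)
```

Because the completion path would call the same helper, the pre-existing lowercase-provider issue at line 273 would be fixed at the same time.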

Once #1 is fixed, the rest of the implementation produces a correct payload — verified locally with a one-off EmbeddingStep class: token usage (`promptTokens: 10`, `tokens: 10`), `embeddingDimensions: 1536`, `model: text-embedding-3-small`, and the full output vector all serialize correctly.

🤖 Generated with Claude Code

