
feat: add embedding tracer support for Bedrock, LiteLLM, OpenAI#631

Open
viniciusdsmello wants to merge 17 commits into main from vini/open-10480-embedding-tracer

Conversation

@viniciusdsmello
Contributor

Summary

Adds native embedding tracing support across the Python SDK so that embedding API calls (Titan via Bedrock, litellm.embedding, OpenAI.embeddings.create) generate proper traces in Openlayer with correct model, tokens, dimensions, and output.

  • Data model: new StepType.EMBEDDING + add_embedding_step_to_trace helper (src/openlayer/lib/tracing/).
  • Bedrock: detects "embed" in modelId and routes to a dedicated handler with parsers for Titan v1/v2 and Cohere v3 (single + batch). Existing chat path is untouched and locked in by a backfilled regression test.
  • LiteLLM: patches litellm.embedding alongside the existing litellm.completion patch. Reuses detect_provider_from_response, extract_usage_from_response, and extract_litellm_metadata.
  • OpenAI: patches client.embeddings.create for both sync (trace_openai) and async (trace_async_openai) clients via a small shared helper module (_openai_embedding_common.py).
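As a rough illustration of what this enables, an embedding step's payload might carry fields like the following (hypothetical shape; the values match the locally verified trace described in the review below, but the exact schema is defined by the new step type):

```python
# Hypothetical example of the metadata a traced embedding call records,
# per the fields this PR describes (model, tokens, dimensions, provider).
example_embedding_step = {
    "type": "embedding",
    "provider": "OpenAI",
    "model": "text-embedding-3-small",
    "promptTokens": 10,
    "tokens": 10,
    "embeddingDimensions": 1536,
    "embeddingCount": 1,
}
```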

Linear

OPEN-10480

Verification

  • 34 new unit tests covering all four integration paths (single input, batch, failure isolation, body replay, regression).
  • Full local test suite green (448 tests).
  • ruff check clean on all touched files.
  • pyright clean on all touched source files.

Test plan

  • CI passes on this branch.
  • Manual smoke test with a real Titan embedding call (amazon.titan-embed-text-v2:0) — confirm trace appears with model name and prompt tokens populated.
  • Manual smoke test with litellm.embedding(model="text-embedding-3-small", input="x").
  • Confirm with the ingestion / UI team that step_type=embedding is rendered correctly (out of scope for this PR but required for end-to-end value).

Out of scope

  • Mistral, Gemini, OCI, Portkey embedding tracers — follow-ups using the same pattern.
  • Backend / UI changes to render the new step type.

🤖 Generated with Claude Code

viniciusdsmello and others added 17 commits April 28, 2026 12:47
Used by superpowers workflows to host isolated git worktrees during
implementation, never meant to be tracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…EN-10480)

…480)

…N-10480)

…OPEN-10480)

…gression (OPEN-10480)

Adds the same file-level pragma already used by test_portkey_integration.py
to suppress reportUnknown* and reportMissingParameterType — these come from
openlayer.lib.integrations being in pyright's ignore list, which causes
imports from there to be typed as Unknown.

Per-line pyright ignores added on direct imports of botocore.response and
openai, which are not present in the lint job's environment.

Code review findings addressed:
- Move per-call imports of _openai_embedding_common to module-level (was in
  hot path of every embedding call).
- Extract build_embedding_step_kwargs into _openai_embedding_common so that
  sync and async OpenAI handlers each become ~10 lines instead of ~50, and
  LiteLLM reuses the same kwargs assembly.
- Drop LiteLLM's local _parse_embedding_response and
  _get_embedding_model_parameters; both now delegate to the shared helpers
  (LiteLLM-specific timeout/api_base/api_version/cost/metadata are layered
  on top of the common kwargs).
- Type Bedrock _parse_embedding_output return as
  Tuple[Union[List[float], List[List[float]]], int, int] instead of bare
  tuple.

Net: -34 lines across the 5 touched source files. Tests unchanged, all
77 embedding tests + 448 lib tests still green.

@viniciusdsmello
Contributor Author

Code review

Found 4 issues. Issue #1 is a release blocker — verified by running an embedding call against the PR branch: the trace was silently dropped, with the integration handler logging `Failed to trace the OpenAI embedding request with Openlayer. <StepType.EMBEDDING: 'embedding'>`.

  1. CRITICAL — `StepType.EMBEDDING` is added to the enum but never registered in `step_factory`, and there is no `EmbeddingStep` class. Every call into `add_embedding_step_to_trace` raises `KeyError(<StepType.EMBEDDING: 'embedding'>)` from `step_type_mapping[step_type]`, which the integration handlers swallow via `except Exception as e: logger.error(...)`. Embedding traces never reach the platform. The unit tests don't catch this because they all mock `tracer.add_embedding_step_to_trace`, bypassing `create_step`/`step_factory` entirely (see e.g. tests/test_openai_embedding_integration.py:28). Fix: add an `EmbeddingStep(Step)` class with typed fields (provider, prompt_tokens, tokens, cost, model, model_parameters, raw_output, embedding_dimensions, embedding_count) and a `to_dict()` override mirroring `ChatCompletionStep` (lines 169–206 of the same file), then register it in the mapping. Without an `EmbeddingStep`, even the embedding-specific fields would be dropped by `Step.log()` because the base class's `hasattr(self, key)` check returns False.

https://github.com/openlayer-ai/openlayer-python/blob/5c4765b8dbc78a3929a41b04ecbb41e666e56dd3/src/openlayer/lib/tracing/steps.py#L355-L367
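A minimal sketch of the missing class and registration, using a simplified stand-in for the SDK's `Step` base class (field names are the ones listed above; the real base class and mapping live in `src/openlayer/lib/tracing/steps.py`):

```python
from enum import Enum
from typing import Any, Dict, Optional


class StepType(Enum):
    EMBEDDING = "embedding"


class Step:
    """Simplified stand-in for the SDK's Step base class."""

    def __init__(self, name: str, **kwargs: Any) -> None:
        self.name = name
        self.step_type: Optional[StepType] = None

    def to_dict(self) -> Dict[str, Any]:
        return {"name": self.name, "type": self.step_type.value if self.step_type else None}


class EmbeddingStep(Step):
    """Embedding step carrying the typed fields the review lists."""

    def __init__(self, name: str, **kwargs: Any) -> None:
        super().__init__(name, **kwargs)
        self.step_type = StepType.EMBEDDING
        self.provider: Optional[str] = kwargs.get("provider")
        self.prompt_tokens: Optional[int] = kwargs.get("prompt_tokens")
        self.tokens: Optional[int] = kwargs.get("tokens")
        self.cost: Optional[float] = kwargs.get("cost")
        self.model: Optional[str] = kwargs.get("model")
        self.model_parameters: Optional[Dict[str, Any]] = kwargs.get("model_parameters")
        self.raw_output: Optional[str] = kwargs.get("raw_output")
        self.embedding_dimensions: Optional[int] = kwargs.get("embedding_dimensions")
        self.embedding_count: Optional[int] = kwargs.get("embedding_count")

    def to_dict(self) -> Dict[str, Any]:
        # Mirror ChatCompletionStep's serialization of step-specific fields.
        d = super().to_dict()
        d.update(
            {
                "provider": self.provider,
                "promptTokens": self.prompt_tokens,
                "tokens": self.tokens,
                "model": self.model,
                "embeddingDimensions": self.embedding_dimensions,
                "embeddingCount": self.embedding_count,
            }
        )
        return d


# The registration step_factory needs; without it,
# step_type_mapping[StepType.EMBEDDING] raises KeyError.
step_type_mapping: Dict[StepType, type] = {StepType.EMBEDDING: EmbeddingStep}
```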

  2. `inference_id=None` overwrites the auto-generated UUID with the literal string "None". `build_embedding_step_kwargs` unconditionally puts `"id": inference_id` in the returned dict; `Step.log()` then runs `setattr(self, 'id', None)` (since `hasattr(step, 'id')` is true — it was just set to a fresh UUID by `Step.__init__`). `to_dict` later serializes `str(self.id) == "None"`. Empirically verified after applying a local fix for issue #1: the captured trace contained `"id": "None"` for the embedding step while the parent step had a proper UUID. The chat path guards against this — see `create_trace_args` at openai_tracer.py:364-365 (`if id: trace_args["id"] = id`). Apply the same guard here.

```python
    if hasattr(response, "model_dump")
    else str(response)
),
"provider": provider,
"id": inference_id,
"metadata": {"provider": provider},
}
```

  3. Azure OpenAI embedding calls are mislabeled as `provider="OpenAI"`. The chat/parse/responses patches all detect `is_azure_openai = isinstance(client, openai.AzureOpenAI)` at openai_tracer.py:73 and thread it through to the handlers, which switch on it to set `provider="Azure"`. The embeddings patch at openai_tracer.py:162-169 does not pass `is_azure_openai` to `handle_embedding`, and `handle_embedding` hard-codes `provider="OpenAI"`. Same in `async_openai_tracer.py`. An Azure customer's embeddings will be attributed to the wrong provider in the platform.

The hard-coded provider in the sync handler:

```python
    tracer.add_embedding_step_to_trace(
        **build_embedding_step_kwargs(
            response,
            kwargs,
            start_time,
            end_time,
            name="OpenAI Embedding",
            provider="OpenAI",
            inference_id=inference_id,
        )
    )
except Exception as e:
    logger.error(
        "Failed to trace the OpenAI embedding request with Openlayer. %s", e
    )
return response
```

And the identical block in the async handler:

```python
    tracer.add_embedding_step_to_trace(
        **build_embedding_step_kwargs(
            response,
            kwargs,
            start_time,
            end_time,
            name="OpenAI Embedding",
            provider="OpenAI",
            inference_id=inference_id,
        )
    )
except Exception as e:
    logger.error(
        "Failed to trace the OpenAI embedding request with Openlayer. %s", e
    )
return response
```
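Threading the flag through could look like this (a sketch; `resolve_openai_provider` and this `handle_embedding` signature are illustrative, not the SDK's actual code — the real patch would detect `isinstance(client, openai.AzureOpenAI)` at patch time and pass the result in):

```python
from typing import Any, Dict


def resolve_openai_provider(is_azure_openai: bool) -> str:
    # Same switch the chat/parse/responses handlers already use.
    return "Azure" if is_azure_openai else "OpenAI"


def handle_embedding(response: Any, *, is_azure_openai: bool = False) -> Dict[str, Any]:
    """Simplified handler: accept the flag instead of hard-coding the provider."""
    return {
        "provider": resolve_openai_provider(is_azure_openai),
        "response": response,
    }
```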

  4. LiteLLM embedding traces will record `provider="openai"` (lowercase) instead of `"OpenAI"`. `detect_provider_from_response` returns `litellm.get_llm_provider(model_name)[1]` unmodified, which is the lowercase form (e.g. `"openai"`, `"cohere"`). The backend expects the canonical capitalized names (`Anthropic`, `Azure`, `Cohere`, `OpenAI`, `Google`, `Mistral`, `Groq`, `Bedrock`). The completion path at line 273 has the same issue (pre-existing); this PR extends it to embeddings. Worth normalizing in `detect_provider_from_response` to fix both at once.

```python
try:
    model_name = kwargs.get("model", getattr(response, "model", "unknown"))
    provider = detect_provider_from_response(response, model_name)
    extra_metadata = extract_litellm_metadata(response, model_name)
    usage_data = extract_usage_from_response(response)
    step_kwargs = build_embedding_step_kwargs(
        response,
        kwargs,
        start_time,
        end_time,
        name="LiteLLM Embedding",
        provider=provider,
        inference_id=inference_id,
    )
```
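One way to normalize inside `detect_provider_from_response`, assuming the canonical names listed above (the mapping table below is illustrative, not the SDK's actual one):

```python
# Illustrative mapping from litellm's lowercase provider names to the
# capitalized names the backend expects.
_CANONICAL_PROVIDERS = {
    "anthropic": "Anthropic",
    "azure": "Azure",
    "cohere": "Cohere",
    "openai": "OpenAI",
    "google": "Google",
    "mistral": "Mistral",
    "groq": "Groq",
    "bedrock": "Bedrock",
}


def normalize_provider(raw: str) -> str:
    """Map a lowercase provider name to its canonical capitalized form;
    pass unknown providers through unchanged."""
    return _CANONICAL_PROVIDERS.get(raw.lower(), raw)
```

Because the completion path would call the same helper, the pre-existing lowercase-provider issue at line 273 would be fixed at the same time.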

Once #1 is fixed, the rest of the implementation produces a correct payload — verified locally with a one-off EmbeddingStep class: token usage (`promptTokens: 10`, `tokens: 10`), `embeddingDimensions: 1536`, `model: text-embedding-3-small`, and the full output vector all serialize correctly.

🤖 Generated with Claude Code

