diff --git a/CHANGELOG.md b/CHANGELOG.md index 87d3154..38b2806 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -15,6 +15,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The - **`RetryMiddleware` now takes a `RetryConfig` record** instead of individual constructor kwargs (proposal 0050 prep). The four retry settings (`max_attempts` / `classifier` / `backoff` / `on_retry`, each optional) move onto a frozen `RetryConfig`; construct as `RetryMiddleware(RetryConfig(max_attempts=...))`, while bare `RetryMiddleware()` still applies the defaults. This is a breaking change to the `RetryMiddleware` constructor. The record is the same shape the upcoming call-level `complete(retry=...)` parameter will accept, so one retry config serves both the per-node and per-call layers. `None` fields resolve to the canonical defaults (`default_classifier` / `exponential_jitter_backoff`) at use, preserving the prior behavior. - **Failure-isolation events report the originating cause's category at non-node placements** (proposal 0065, pipeline-utilities §6.3). When `FailureIsolationMiddleware` runs as instance middleware (§9.7), branch middleware (§11.7), or parent-node middleware on a fan-out / parallel-branches node, the graph engine has already wrapped the originating error as a `node_exception` carrier before the middleware catches it. `FailureIsolatedEvent.caught_exception.category` now resolves through that carrier (and any nested carriers) to the nearest categorized originating cause and reports its category instead of the masking `node_exception`, so the reported category agrees with what the §6.1 retry classifier acted on. For example, an instance whose retries exhaust on `provider_unavailable` now surfaces `provider_unavailable` rather than `node_exception`. The `message` tracks the resolved cause for category/message coherence. 
Node-level placement was already faithful and is unchanged, and catch/degrade behavior is unchanged at every site (only the event's reported cause changes). The wrapped-instance/branch lineage SHOULD (`fan_out_index` / `branch_name`) is deferred to a follow-up, since it needs the engine to surface per-instance identity to the wrapping-site middleware. +- **Observer privacy flag `disable_llm_payload` renamed to `disable_provider_payload`** (proposal 0059, observability §5.5.4, spec v0.54.0). The observer-level flag on both bundled observers (`OTelObserver` and `LangfuseObserver`) is renamed, and its scope broadens from LLM-completion payload to any provider-call payload (LLM completion today; embedding and rerank when those land). This is a breaking change to both observer constructors: config passing `disable_llm_payload=True` (or `False`) updates to `disable_provider_payload=...` with no other change. The default stays `True` (payload suppressed), and the gating behavior for `LlmCompletionEvent` / `LlmFailedEvent` rendering is unchanged at every existing site. The rename is the only part of proposal 0059 adopted this cycle: the retrieval-provider capability itself (the `EmbeddingProvider` protocol, the `EmbeddingEvent` / `EmbeddingFailedEvent` typed variants, and the embedding span / observation mapping) is not yet implemented and rides as `not-yet` in `conformance.toml`. The §5.5.4 rename touches existing LLM-payload gating, so it lands with the pin. Pinned spec advances v0.53.0 → v0.54.0. 
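A minimal sketch of the migration the rename entry above describes, for callers that build observer constructor kwargs from stored config. `migrate_observer_kwargs` is a hypothetical helper written for illustration and is not part of openarmature; it assumes the config is a plain dict of kwargs and only renames the key, leaving the value untouched.

```python
from typing import Any, Mapping

OLD_FLAG = "disable_llm_payload"       # pre-rename observer flag
NEW_FLAG = "disable_provider_payload"  # flag name after the rename


def migrate_observer_kwargs(kwargs: Mapping[str, Any]) -> dict[str, Any]:
    """Return a copy of ``kwargs`` with the old flag name replaced by the new one.

    Hypothetical helper (an assumption for this sketch), not an openarmature API.
    """
    migrated = dict(kwargs)  # copy so the caller's mapping is not mutated
    if OLD_FLAG in migrated:
        if NEW_FLAG in migrated:
            # Ambiguous config: refuse rather than let one spelling silently win.
            raise ValueError(f"pass only {NEW_FLAG!r}, not both flag names")
        migrated[NEW_FLAG] = migrated.pop(OLD_FLAG)
    return migrated


old_config = {"disable_llm_payload": False, "payload_byte_cap": 65536}
new_config = migrate_observer_kwargs(old_config)
```

The resulting dict could then be passed to either bundled observer, for example `LangfuseObserver(client=client, **new_config)`. Raising when both names are present is a choice made for this sketch so that a stale key cannot silently override the new one.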
## [0.13.0] — 2026-06-09 diff --git a/conformance.toml b/conformance.toml index 4cdd9ac..35e8103 100644 --- a/conformance.toml +++ b/conformance.toml @@ -32,7 +32,7 @@ [manifest] implementation = "openarmature-python" -spec_pin = "v0.53.0" +spec_pin = "v0.54.0" # Status values: # implemented — shipped behavior matches the proposal's contract @@ -454,9 +454,9 @@ since = "0.12.0" # not: # ``tests/unit/test_observability_otel.py:: # test_invocation_span_carries_implementation_attribution_attributes``; -# - OTel always-emit invariant under ``disable_llm_payload``, +# - OTel always-emit invariant under ``disable_provider_payload``, # ``disable_genai_semconv``, ``disable_llm_spans``: -# ``::test_invocation_span_attribution_emits_under_disable_llm_payload``; +# ``::test_invocation_span_attribution_emits_under_disable_provider_payload``; # - OTel attributes emit on every invocation span across a # reused observer (3 sequential invocations): # ``::test_invocation_span_attribution_emits_on_every_invocation``; @@ -593,3 +593,15 @@ since = "0.13.0" [proposals."0058"] status = "implemented" since = "0.13.0" + +# Spec v0.54.0 (proposal 0059). Retrieval-provider capability — +# the ``EmbeddingProvider`` protocol + ``EmbeddingEvent`` / +# ``EmbeddingFailedEvent`` typed variants + OTel/Langfuse embedding +# mapping. Python has not yet shipped the embedding surface, so the +# capability is not-yet. The one piece adopted at this pin is the +# proposal's cross-spec consequence: the observer-level privacy flag +# ``disable_llm_payload`` is renamed ``disable_provider_payload`` (the +# §5.5.4 rename touches existing LLM-payload gating, so it lands with +# the pin even though the embedding capability does not). 
+[proposals."0059"] +status = "not-yet" diff --git a/docs/agent/non-obvious-shapes.md b/docs/agent/non-obvious-shapes.md index b2bd6ec..21a4aaa 100644 --- a/docs/agent/non-obvious-shapes.md +++ b/docs/agent/non-obvious-shapes.md @@ -48,9 +48,9 @@ else: The discriminator is one branch; missing it gives you empty data on tool-call responses and silently wrong behavior on truncations. -### `disable_llm_payload` defaults to `True`: flip it for LLM-aware observability backends +### `disable_provider_payload` defaults to `True`: flip it for LLM-aware observability backends -The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_llm_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras). +The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_provider_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras). That's the right default for general OpenArmature use: payloads may contain PII the user hasn't audited, and storage cost grows with prompt size. But it's the WRONG default if you're wiring up an LLM-aware observability backend (Langfuse, Phoenix, Honeycomb's LLM lens) that renders the message stream as part of its generation view. Backends will show "empty" generations and you'll wonder why. 
@@ -61,7 +61,7 @@ from openarmature.observability import OTelObserver observer = OTelObserver( span_processor=your_exporter, - disable_llm_payload=False, # opt in to message-payload attributes + disable_provider_payload=False, # opt in to message-payload attributes ) graph.attach_observer(observer) ``` diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md index 1974993..2dad1ce 100644 --- a/docs/concepts/observability.md +++ b/docs/concepts/observability.md @@ -702,12 +702,12 @@ source on your stack. ### LLM payload attributes By default, LLM spans do **not** carry the messages sent or the -response content. Opt in with `disable_llm_payload=False`: +response content. Opt in with `disable_provider_payload=False`: ```python observer = OTelObserver( span_processor=SimpleSpanProcessor(exporter), - disable_llm_payload=False, + disable_provider_payload=False, ) ``` @@ -764,7 +764,7 @@ level (per llm-provider §3.1.2); only `source` is replaced. URL-form images pass through unchanged: the URL is a short string and is informative for trace readers. -Redaction is **not** gated by `disable_llm_payload` and is **not** +Redaction is **not** gated by `disable_provider_payload` and is **not** configurable. Inline image bytes never leave the provider in event form, so custom observers consuming [`LlmCompletionEvent` / `LlmFailedEvent`](#consuming-llm-events-in-custom-observers) @@ -967,7 +967,7 @@ langfuse_client = Langfuse( ) observer = LangfuseObserver( client=LangfuseSDKAdapter(langfuse_client), - disable_llm_payload=False, + disable_provider_payload=False, ) ``` @@ -1014,7 +1014,7 @@ for a runnable demo. ### Payload + truncation -`disable_llm_payload` mirrors the OTel observer's flag and defaults +`disable_provider_payload` mirrors the OTel observer's flag and defaults to `True` for the same privacy reason. Flip to `False` to populate `generation.input` / `output` / `metadata.request_extras` from the LLM event payload. 
@@ -1022,7 +1022,7 @@ LLM event payload. ```python observer = LangfuseObserver( client=client, - disable_llm_payload=False, + disable_provider_payload=False, payload_byte_cap=65536, ) ``` @@ -1057,5 +1057,5 @@ graph.attach_observer(otel_observer) graph.attach_observer(langfuse_observer) ``` -Each observer's `disable_llm_spans` / `disable_llm_payload` flag is +Each observer's `disable_llm_spans` / `disable_provider_payload` flag is independent; one MAY emit while the other suppresses. diff --git a/docs/examples/langfuse-observability.md b/docs/examples/langfuse-observability.md index 6e74d9c..027bf83 100644 --- a/docs/examples/langfuse-observability.md +++ b/docs/examples/langfuse-observability.md @@ -39,7 +39,7 @@ manual wiring at the call site. surfaces it on every Generation that renders from that prompt. Filesystem / in-memory backends without that reference work too, they just produce metadata-only linkage. -- `disable_llm_payload=False` opt-in for capturing input messages + +- `disable_provider_payload=False` opt-in for capturing input messages + output content on Generation observations. Default-off is the privacy posture; the demo deliberately flips it. - `correlation_id` cross-cutting metadata on the Trace and every @@ -140,7 +140,7 @@ langfuse_client = Langfuse( ) observer = LangfuseObserver( client=LangfuseSDKAdapter(langfuse_client), - disable_llm_payload=False, + disable_provider_payload=False, ) ``` @@ -174,7 +174,7 @@ graph.attach_observer(OTelObserver(span_processor=batch)) graph.attach_observer(LangfuseObserver(client=langfuse_client)) ``` -Their `disable_llm_spans` / `disable_llm_payload` flags are +Their `disable_llm_spans` / `disable_provider_payload` flags are independent. 
The `correlation_id` cross-cutting attribute is the join key: find a slow Generation in Langfuse, search for the `correlation_id` in OTel logs to see the surrounding infrastructure diff --git a/docs/examples/production-observability.md b/docs/examples/production-observability.md index 26cb97f..cdeb084 100644 --- a/docs/examples/production-observability.md +++ b/docs/examples/production-observability.md @@ -175,7 +175,7 @@ answer: The primary objective of Apollo 11 was ... model: gpt-4o-mini-2024-07-18 --- captured OTel spans --- - [openarmature.invocation] 1240.0ms openarmature.graph.entry_node='respond', openarmature.graph.spec_version='0.53.0', openarmature.implementation.name='openarmature-python', openarmature.implementation.version='0.13.0' + [openarmature.invocation] 1240.0ms openarmature.graph.entry_node='respond', openarmature.graph.spec_version='0.54.0', openarmature.implementation.name='openarmature-python', openarmature.implementation.version='0.13.0' [respond] 1235.0ms openarmature.node.name='respond', openarmature.user.tenantId='demo-acme', ... [openarmature.llm.complete] 1200.0ms openarmature.user.tenantId='demo-acme', gen_ai.system='openai', gen_ai.usage.input_tokens=42, ... [persist] 2.0ms openarmature.node.name='persist', openarmature.user.tenantId='demo-acme', ... @@ -289,7 +289,7 @@ langfuse_observer = LangfuseObserver( ), trace_input_from_state=_trace_input, trace_output_from_state=_trace_output, - disable_llm_payload=False, + disable_provider_payload=False, ) ``` diff --git a/examples/langfuse-observability/main.py b/examples/langfuse-observability/main.py index a2fdc75..493ec0b 100644 --- a/examples/langfuse-observability/main.py +++ b/examples/langfuse-observability/main.py @@ -262,13 +262,13 @@ async def main() -> None: # ``LangfuseClient`` Protocol; the observer code doesn't change. 
client = InMemoryLangfuseClient() - # disable_llm_payload=False opts in to capturing the input messages + # disable_provider_payload=False opts in to capturing the input messages # and output content on Generation observations. Default is True # for the same privacy reason the OTel observer's flag exists: # payloads may contain PII the operator hasn't audited. Flip it # deliberately here because the demo's whole point is showing what # the model saw and returned. - observer = LangfuseObserver(client=client, disable_llm_payload=False) + observer = LangfuseObserver(client=client, disable_provider_payload=False) graph = build_graph() graph.attach_observer(observer) diff --git a/examples/production-observability/main.py b/examples/production-observability/main.py index 8f82efc..6d77568 100644 --- a/examples/production-observability/main.py +++ b/examples/production-observability/main.py @@ -582,14 +582,14 @@ def build_graph() -> CompiledGraph[BriefingState]: # any OTLP-compatible backend. # # Caller hooks attach to LangfuseObserver via constructor kwargs. -# ``disable_llm_payload=False`` opts in to capturing the input +# ``disable_provider_payload=False`` opts in to capturing the input # messages + output content on Generation observations so the demo # output is meaningful; the default-True is the privacy-preserving # setting. def _build_otel_observer(exporter: InMemorySpanExporter) -> OTelObserver: - # ``disable_llm_payload=False`` opts in to capturing input messages + # ``disable_provider_payload=False`` opts in to capturing input messages # + output content on the LLM-call span (same flag the Langfuse # observer below flips for the same reason). 
The example's whole # point is showing both backends seeing the same logical events; @@ -600,14 +600,14 @@ def _build_otel_observer(exporter: InMemorySpanExporter) -> OTelObserver: return OTelObserver( span_processor=SimpleSpanProcessor(exporter), resource=Resource.create({"service.name": "openarmature-production-observability"}), - disable_llm_payload=False, + disable_provider_payload=False, ) def _build_langfuse_observer(client: InMemoryLangfuseClient) -> LangfuseObserver: return LangfuseObserver( client=client, - disable_llm_payload=False, + disable_provider_payload=False, trace_input_from_state=_trace_input, trace_output_from_state=_trace_output, ) diff --git a/openarmature-spec b/openarmature-spec index bd2d782..c7d6d4e 160000 --- a/openarmature-spec +++ b/openarmature-spec @@ -1 +1 @@ -Subproject commit bd2d7824e5db40280899cd664442d9c5ac8fe506 +Subproject commit c7d6d4ef4d02573ef2bf15d977e23ba86d4ce584 diff --git a/pyproject.toml b/pyproject.toml index aee46c1..30ec2e1 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec" openarmature = "openarmature.cli:main" [tool.openarmature] -spec_version = "0.53.0" +spec_version = "0.54.0" [dependency-groups] dev = [ diff --git a/src/openarmature/AGENTS.md b/src/openarmature/AGENTS.md index a5fd775..1319888 100644 --- a/src/openarmature/AGENTS.md +++ b/src/openarmature/AGENTS.md @@ -1,6 +1,6 @@ # OpenArmature — Agent documentation -*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.53.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.* +*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.54.0). 
For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.* ## TL;DR @@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents: ## Capability contracts -_Sourced from openarmature-spec v0.53.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._ +_Sourced from openarmature-spec v0.54.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._ ### Capability: `graph-engine` @@ -1377,9 +1377,9 @@ else: The discriminator is one branch; missing it gives you empty data on tool-call responses and silently wrong behavior on truncations. -### `disable_llm_payload` defaults to `True`: flip it for LLM-aware observability backends +### `disable_provider_payload` defaults to `True`: flip it for LLM-aware observability backends -The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_llm_payload: bool = True` per spec §5.5's "default-off by privacy" framing. 
Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras). +The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_provider_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras). That's the right default for general OpenArmature use: payloads may contain PII the user hasn't audited, and storage cost grows with prompt size. But it's the WRONG default if you're wiring up an LLM-aware observability backend (Langfuse, Phoenix, Honeycomb's LLM lens) that renders the message stream as part of its generation view. Backends will show "empty" generations and you'll wonder why. @@ -1390,7 +1390,7 @@ from openarmature.observability import OTelObserver observer = OTelObserver( span_processor=your_exporter, - disable_llm_payload=False, # opt in to message-payload attributes + disable_provider_payload=False, # opt in to message-payload attributes ) graph.attach_observer(observer) ``` diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py index 40863ed..2f7fd3f 100644 --- a/src/openarmature/__init__.py +++ b/src/openarmature/__init__.py @@ -25,7 +25,7 @@ """ __version__ = "0.13.0" -__spec_version__ = "0.53.0" +__spec_version__ = "0.54.0" # Proposal 0052 (spec observability §5.1 / §8.4.1): canonical # package-registry name for this implementation. 
Surfaces on every # OTel invocation span as ``openarmature.implementation.name`` and on diff --git a/src/openarmature/graph/events.py b/src/openarmature/graph/events.py index 807c2ac..e8eaf52 100644 --- a/src/openarmature/graph/events.py +++ b/src/openarmature/graph/events.py @@ -461,7 +461,7 @@ class InvocationCompletedEvent: # which already enforces the redaction. The three payload-bearing # fields (input_messages, output_content, request_extras) are # populated unconditionally on the typed event per §5.5.7; observer- -# side privacy gates (OTel disable_llm_payload, Langfuse equivalents) +# side privacy gates (OTel disable_provider_payload, Langfuse equivalents) # apply at rendering, symmetric with the §5.5.1 span attribute path. # Custom queryable observers (per observability §9) own their own # redaction posture — gating belongs at rendering with the consumer's @@ -597,7 +597,7 @@ class LlmCompletionEvent: # # Privacy posture identical to LlmCompletionEvent: input_messages / # request_params / request_extras are populated unconditionally per -# §5.5.7; observer-side privacy gates (OTel disable_llm_payload, +# §5.5.7; observer-side privacy gates (OTel disable_provider_payload, # Langfuse equivalents) apply at rendering. Inline image bytes are # redacted per observability §5.5.5 before population. Custom # queryable observers own their own redaction posture. diff --git a/src/openarmature/observability/langfuse/observer.py b/src/openarmature/observability/langfuse/observer.py index 445a6c7..384fc6c 100644 --- a/src/openarmature/observability/langfuse/observer.py +++ b/src/openarmature/observability/langfuse/observer.py @@ -11,7 +11,7 @@ # caller-supplied invocation-label path lands in proposal 0034 (PR 4 # of the v0.10.0 batch). 
# - Generation rendering follows §8.7: input/output/request_extras -# appear only when `disable_llm_payload=False`; the truncation +# appear only when `disable_provider_payload=False`; the truncation # marker is preserved verbatim as a raw string when the §5.5.5 # truncation makes the JSON unparseable. # - Prompt linkage follows §8.4.4: reads @@ -267,10 +267,11 @@ class LangfuseObserver: - ``client``: the Langfuse sink (Protocol-typed). - ``disable_llm_spans``: when ``True`` the observer skips Generation observations on LLM provider events. - - ``disable_llm_payload``: default ``True`` per §8.9's "symmetric + - ``disable_provider_payload``: default ``True`` per §8.9's "symmetric privacy posture" with the OTel observer. Gates ``generation.input`` / ``output`` / ``metadata.request_extras`` - emission. + emission. The name carries the broadened provider-payload scope; + LLM completion is OA's only provider-call payload today. - ``payload_byte_cap``: per-attribute byte cap on the source payload string before parse-back. Mirrors the OTel observer's ``payload_max_bytes`` semantic — emission preserves the raw @@ -292,7 +293,7 @@ class LangfuseObserver: ``trace_output_from_state`` overrides. When ``False`` the raw state object is serialized to the Trace fields, subject to ``payload_byte_cap`` truncation. Independent of - ``disable_llm_payload`` — the two payloads carry distinct + ``disable_provider_payload`` — the two payloads carry distinct threat models (LLM-call transcript vs. application state). - ``trace_input_from_state``: optional caller hook returning the value to use as ``trace.input``. Called once per invocation at @@ -310,7 +311,7 @@ class LangfuseObserver: ``trace.metadata.implementation_version`` on every Trace. Defaults to ``openarmature.__version__``. Always-emit invariant inherited from §5.1 — not gated by ``disable_state_payload``, - ``disable_llm_payload``, or any other privacy knob. + ``disable_provider_payload``, or any other privacy knob. 
The observer reads the spec version from the package at construction time. Safe to share across concurrent invocations @@ -320,7 +321,7 @@ class LangfuseObserver: client: LangfuseClient disable_llm_spans: bool = False - disable_llm_payload: bool = True + disable_provider_payload: bool = True payload_byte_cap: int = 65536 detached_subgraphs: frozenset[str] = field(default_factory=_empty_str_frozenset) detached_fan_outs: frozenset[str] = field(default_factory=_empty_str_frozenset) @@ -1422,7 +1423,7 @@ def _handle_typed_llm_completion(self, event: LlmCompletionEvent) -> None: model_parameters: dict[str, Any] = dict(event.request_params or {}) input_value: Any = None output_value: Any = None - if not self.disable_llm_payload: + if not self.disable_provider_payload: if event.input_messages: input_value = self._maybe_truncate_for_input(event.input_messages) if event.output_content is not None: @@ -1491,7 +1492,7 @@ def _handle_typed_llm_failed(self, event: LlmFailedEvent) -> None: metadata["error_message"] = event.error_message model_parameters: dict[str, Any] = dict(event.request_params or {}) input_value: Any = None - if not self.disable_llm_payload: + if not self.disable_provider_payload: if event.input_messages: input_value = self._maybe_truncate_for_input(event.input_messages) if event.request_extras: diff --git a/src/openarmature/observability/otel/observer.py b/src/openarmature/observability/otel/observer.py index eb66372..27cff6b 100644 --- a/src/openarmature/observability/otel/observer.py +++ b/src/openarmature/observability/otel/observer.py @@ -390,10 +390,12 @@ class OTelObserver: each get their own trace. One detached trace per instance. - ``disable_llm_spans``: when ``True`` the observer skips the LLM provider span; all other spans emit normally. - - ``disable_llm_payload``: default ``True``. Gates the LLM input/ + - ``disable_provider_payload``: default ``True``. 
Gates the LLM input/ output payload attributes (``openarmature.llm.input.messages``, ``openarmature.llm.output.content``, - ``openarmature.llm.request.extras``). + ``openarmature.llm.request.extras``). The name carries the broadened + provider-payload scope; LLM completion is the only provider-call + payload OA emits today. - ``disable_genai_semconv``: default ``False``. Gates the ``gen_ai.*`` attribute set on the LLM span. - ``payload_max_bytes``: per-attribute byte cap for the LLM payload @@ -435,12 +437,12 @@ class OTelObserver: detached_subgraphs: frozenset[str] = field(default_factory=_empty_str_frozenset) detached_fan_outs: frozenset[str] = field(default_factory=_empty_str_frozenset) disable_llm_spans: bool = False - # disable_llm_payload defaults to True per observability §5.5.4. + # disable_provider_payload defaults to True per observability §5.5.4. # Default-off because the payload may contain PII the user hasn't # audited — opting in is a deliberate second choice. Naming inverts # the natural reading ("default-off via True") to keep symmetry # with the existing disable_llm_spans parameter family. - disable_llm_payload: bool = True + disable_provider_payload: bool = True # disable_genai_semconv defaults to False (emit) per §5.5.4. The # value proposition of installing the OTel observer is that # LLM-aware backends (Langfuse, Phoenix, Honeycomb's LLM lens) @@ -467,7 +469,7 @@ class OTelObserver: # ``implementation_version`` is ``openarmature.__version__``. # Configurable for test parameterization but defaults to the # package-pinned values; the always-emit invariant means neither - # ``disable_state_payload``, ``disable_llm_payload``, nor any + # ``disable_state_payload``, ``disable_provider_payload``, nor any # other privacy knob gates them. 
implementation_name: str = field(default_factory=_read_implementation_name) implementation_version: str = field(default_factory=_read_implementation_version) @@ -1096,7 +1098,7 @@ def _emit_checkpoint_save_span(self, event: NodeEvent) -> None: # v0.17.0 attribute set (proposal 0024) preserved unchanged: # - Baseline openarmature.llm.* attributes # - §5.5.1 payload (input.messages, output.content, - # request.extras) gated by disable_llm_payload + # request.extras) gated by disable_provider_payload # - §5.5.2 gen_ai.request.* request params # - §5.5.3 gen_ai.* response semconv set # - §5.5.4 opt-out flags @@ -1176,7 +1178,7 @@ def _handle_typed_llm_completion(self, event: LlmCompletionEvent) -> None: attrs["gen_ai.request.presence_penalty"] = request_params["presence_penalty"] if "stop_sequences" in request_params: attrs["gen_ai.request.stop_sequences"] = request_params["stop_sequences"] - if not self.disable_llm_payload: + if not self.disable_provider_payload: if event.input_messages: serialized = _serialize_for_attribute(event.input_messages) attrs["openarmature.llm.input.messages"] = _truncate_for_attribute( @@ -1230,7 +1232,7 @@ def _handle_typed_llm_completion(self, event: LlmCompletionEvent) -> None: # (tool-call-only responses) MUST NOT emit this attribute per # spec — ``output_content`` is already None in that case (see # provider.py). 
- if not self.disable_llm_payload and event.output_content: + if not self.disable_provider_payload and event.output_content: attrs_out = _truncate_for_attribute(event.output_content, self.payload_max_bytes) span.set_attribute("openarmature.llm.output.content", attrs_out) span.set_status(Status(StatusCode.OK)) @@ -1303,7 +1305,7 @@ def _handle_typed_llm_failed(self, event: LlmFailedEvent) -> None: attrs["gen_ai.request.presence_penalty"] = request_params["presence_penalty"] if "stop_sequences" in request_params: attrs["gen_ai.request.stop_sequences"] = request_params["stop_sequences"] - if not self.disable_llm_payload: + if not self.disable_provider_payload: if event.input_messages: serialized = _serialize_for_attribute(event.input_messages) attrs["openarmature.llm.input.messages"] = _truncate_for_attribute( diff --git a/tests/conformance/harness/fixtures.py b/tests/conformance/harness/fixtures.py index a318114..4ac1daa 100644 --- a/tests/conformance/harness/fixtures.py +++ b/tests/conformance/harness/fixtures.py @@ -230,11 +230,11 @@ class GraphFixture(_ForbidExtras): disable_llm_spans: bool | None = None # Proposal 0024 (v0.17.0): observer-level opt-outs for the new # §5.5.1 payload and §5.5.2/§5.5.3 GenAI semconv attribute sets. - # ``disable_llm_payload`` defaults to True per §5.5.4 — fixtures + # ``disable_provider_payload`` defaults to True per §5.5.4 — fixtures # that EXERCISE payload emission set it false explicitly (013-018). # ``disable_genai_semconv`` defaults to False — fixture 021 sets # it true to verify the opt-out. 
- disable_llm_payload: bool | None = None + disable_provider_payload: bool | None = None disable_genai_semconv: bool | None = None # Proposal 0024 (v0.17.0, fixture 020): provider-level configuration # overrides — ``provider.genai_system`` overrides the default diff --git a/tests/conformance/test_fixture_parsing.py b/tests/conformance/test_fixture_parsing.py index 0915f52..63ed1a6 100644 --- a/tests/conformance/test_fixture_parsing.py +++ b/tests/conformance/test_fixture_parsing.py @@ -478,6 +478,33 @@ def _id(case: tuple[str, Path]) -> str: "graph-engine/033-drain-events-for-parallel-branches-coverage": ( "Proposal 0054 fixture-shape models pending; contract pinned by unit tests" ), + # Proposal 0059 (retrieval-provider / embedding, v0.54.0): the + # observability embedding-event fixtures (074-083) model the + # EmbeddingEvent / EmbeddingFailedEvent + embedding span / Langfuse + # observation surface, which python does not implement (0059 is + # not-yet in the manifest; only its cross-spec disable_provider_payload + # rename is adopted). The retrieval-provider fixtures are a new + # capability dir this runner does not parse. 
+    "observability/074-embedding-event-dispatch": "Proposal 0059 embedding events; not implemented",
+    "observability/075-embedding-failure-event-dispatch-on-provider-unavailable": (
+        "Proposal 0059 embedding events; not implemented"
+    ),
+    "observability/076-embedding-event-mutual-exclusion": "Proposal 0059 embedding events; not implemented",
+    "observability/077-embedding-event-call-id-distinct": "Proposal 0059 embedding events; not implemented",
+    "observability/078-embedding-event-input-strings-populated": (
+        "Proposal 0059 embedding events; not implemented"
+    ),
+    "observability/079-embedding-event-request-params-populated": (
+        "Proposal 0059 embedding events; not implemented"
+    ),
+    "observability/080-embedding-event-input-count-and-dimensions-populated": (
+        "Proposal 0059 embedding events; not implemented"
+    ),
+    "observability/081-embedding-event-active-prompt-populated": (
+        "Proposal 0059 embedding events; not implemented"
+    ),
+    "observability/082-otel-embedding-span-attributes": "Proposal 0059 embedding events; not implemented",
+    "observability/083-langfuse-embedding-observation": "Proposal 0059 embedding events; not implemented",
 }
diff --git a/tests/conformance/test_observability.py b/tests/conformance/test_observability.py
index 6dcef62..fa0f0fc 100644
--- a/tests/conformance/test_observability.py
+++ b/tests/conformance/test_observability.py
@@ -2469,8 +2469,8 @@ async def _body(_s: Any) -> dict[str, Any]:
     # ---- Observer
     exporter = InMemorySpanExporter()
     observer_kwargs: dict[str, Any] = {"span_processor": SimpleSpanProcessor(exporter)}
-    if "disable_llm_payload" in case:
-        observer_kwargs["disable_llm_payload"] = bool(case["disable_llm_payload"])
+    if "disable_provider_payload" in case:
+        observer_kwargs["disable_provider_payload"] = bool(case["disable_provider_payload"])
     if "disable_genai_semconv" in case:
         observer_kwargs["disable_genai_semconv"] = bool(case["disable_genai_semconv"])
     if "disable_llm_spans" in case:
diff --git a/tests/conformance/test_observability_langfuse.py b/tests/conformance/test_observability_langfuse.py
index c7925ee..c9f17e5 100644
--- a/tests/conformance/test_observability_langfuse.py
+++ b/tests/conformance/test_observability_langfuse.py
@@ -720,8 +720,8 @@ async def _run_case(case: Mapping[str, Any]) -> None:
     # ---- Observer
     observer_cfg = cast("dict[str, Any]", case.get("langfuse_observer") or {})
     observer_kwargs: dict[str, Any] = {}
-    if "disable_llm_payload" in observer_cfg:
-        observer_kwargs["disable_llm_payload"] = bool(observer_cfg["disable_llm_payload"])
+    if "disable_provider_payload" in observer_cfg:
+        observer_kwargs["disable_provider_payload"] = bool(observer_cfg["disable_provider_payload"])
     if "disable_llm_spans" in observer_cfg:
         observer_kwargs["disable_llm_spans"] = bool(observer_cfg["disable_llm_spans"])
     if "payload_byte_cap" in observer_cfg:
diff --git a/tests/test_smoke.py b/tests/test_smoke.py
index 26af3c2..2547202 100644
--- a/tests/test_smoke.py
+++ b/tests/test_smoke.py
@@ -9,7 +9,7 @@
 def test_package_versions() -> None:
     assert openarmature.__version__ == "0.13.0"
-    assert openarmature.__spec_version__ == "0.53.0"
+    assert openarmature.__spec_version__ == "0.54.0"
 def test_spec_version_matches_pyproject() -> None:
diff --git a/tests/unit/test_observability_langfuse.py b/tests/unit/test_observability_langfuse.py
index 8d66c65..44a320f 100644
--- a/tests/unit/test_observability_langfuse.py
+++ b/tests/unit/test_observability_langfuse.py
@@ -1224,9 +1224,9 @@ async def test_typed_llm_event_emits_generation_with_expected_fields() -> None:
     from tests._helpers.typed_event import make_typed_event
     client = InMemoryLangfuseClient()
-    # disable_llm_payload defaults to True per §8.9; flip it off here
+    # disable_provider_payload defaults to True per §8.9; flip it off here
     # so the test can also assert the payload (output) makes it through.
-    observer = LangfuseObserver(client=client, disable_llm_payload=False)
+    observer = LangfuseObserver(client=client, disable_provider_payload=False)
     token = _set_invocation_id("inv-typed-1")
     try:
         await observer(
diff --git a/tests/unit/test_observability_otel.py b/tests/unit/test_observability_otel.py
index 89f5e21..6fd0ba1 100644
--- a/tests/unit/test_observability_otel.py
+++ b/tests/unit/test_observability_otel.py
@@ -244,11 +244,11 @@ async def test_invocation_span_carries_implementation_attribution_attributes() -
 # pins the OTel side of the contract; the Langfuse-side equivalent
 # lives in test_observability_langfuse.py against
 # disable_state_payload=True.
-async def test_invocation_span_attribution_emits_under_disable_llm_payload() -> None:
+async def test_invocation_span_attribution_emits_under_disable_provider_payload() -> None:
     exporter = InMemorySpanExporter()
     observer = OTelObserver(
         span_processor=SimpleSpanProcessor(exporter),
-        disable_llm_payload=True,
+        disable_provider_payload=True,
         disable_genai_semconv=True,
         disable_llm_spans=True,
     )
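
Every call site touched in this diff makes the same one-key substitution: `disable_llm_payload` becomes `disable_provider_payload`, with the value passed through unchanged. For downstream config that still carries the old name, that substitution might be sketched as below. This is only an illustration, assuming observer kwargs are collected in a plain dict the way the conformance runners build `observer_kwargs`; `migrate_observer_kwargs` is a hypothetical helper and is not part of openarmature.

```python
from typing import Any

# Hypothetical migration helper for illustration only; openarmature does not
# ship this function. The key names are the ones renamed in this diff.
_OLD_KEY = "disable_llm_payload"
_NEW_KEY = "disable_provider_payload"


def migrate_observer_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``kwargs`` with the old flag name replaced by the new one.

    The value is carried over unchanged. If both names are present with
    different values, raise instead of silently picking one.
    """
    migrated = dict(kwargs)  # shallow copy; the caller's dict is not mutated
    if _OLD_KEY in migrated:
        old_value = migrated.pop(_OLD_KEY)
        if _NEW_KEY in migrated and migrated[_NEW_KEY] != old_value:
            raise ValueError(
                f"conflicting values for {_OLD_KEY!r} and {_NEW_KEY!r}"
            )
        migrated[_NEW_KEY] = old_value
    return migrated
```

The migrated dict could then be splatted into an observer constructor, for example `LangfuseObserver(client=client, **migrate_observer_kwargs(cfg))`. Raising on a conflict is a design choice of this sketch, not behavior taken from the library.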