LunarCommand · chris-colinsky · Jun 12, 2026 · Jun 12, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -15,6 +15,7 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/). The
 
 - **`RetryMiddleware` now takes a `RetryConfig` record** instead of individual constructor kwargs (proposal 0050 prep). The four retry settings (`max_attempts` / `classifier` / `backoff` / `on_retry`, each optional) move onto a frozen `RetryConfig`; construct as `RetryMiddleware(RetryConfig(max_attempts=...))`, while bare `RetryMiddleware()` still applies the defaults. This is a breaking change to the `RetryMiddleware` constructor. The record is the same shape the upcoming call-level `complete(retry=...)` parameter will accept, so one retry config serves both the per-node and per-call layers. `None` fields resolve to the canonical defaults (`default_classifier` / `exponential_jitter_backoff`) at use, preserving the prior behavior.
 - **Failure-isolation events report the originating cause's category at non-node placements** (proposal 0065, pipeline-utilities §6.3). When `FailureIsolationMiddleware` runs as instance middleware (§9.7), branch middleware (§11.7), or parent-node middleware on a fan-out / parallel-branches node, the graph engine has already wrapped the originating error as a `node_exception` carrier before the middleware catches it. `FailureIsolatedEvent.caught_exception.category` now resolves through that carrier (and any nested carriers) to the nearest categorized originating cause and reports its category instead of the masking `node_exception`, so the reported category agrees with what the §6.1 retry classifier acted on. For example, an instance whose retries exhaust on `provider_unavailable` now surfaces `provider_unavailable` rather than `node_exception`. The `message` tracks the resolved cause for category/message coherence. Node-level placement was already faithful and is unchanged, and catch/degrade behavior is unchanged at every site (only the event's reported cause changes). The wrapped-instance/branch lineage SHOULD (`fan_out_index` / `branch_name`) is deferred to a follow-up, since it needs the engine to surface per-instance identity to the wrapping-site middleware.
+- **Observer privacy flag `disable_llm_payload` renamed to `disable_provider_payload`** (proposal 0059, observability §5.5.4, spec v0.54.0). The observer-level flag on both bundled observers (`OTelObserver` and `LangfuseObserver`) is renamed, and its scope broadens from LLM-completion payload to any provider-call payload (LLM completion today; embedding and rerank when those land). This is a breaking change to both observer constructors: config passing `disable_llm_payload=True` (or `False`) updates to `disable_provider_payload=...` with no other change. The default stays `True` (payload suppressed), and the gating behavior for `LlmCompletionEvent` / `LlmFailedEvent` rendering is unchanged at every existing site. The rename is the only part of proposal 0059 adopted this cycle: the retrieval-provider capability itself (the `EmbeddingProvider` protocol, the `EmbeddingEvent` / `EmbeddingFailedEvent` typed variants, and the embedding span / observation mapping) is not yet implemented and rides as `not-yet` in `conformance.toml`. The §5.5.4 rename touches existing LLM-payload gating, so it lands with the pin. Pinned spec advances v0.53.0 → v0.54.0.
 
 ## [0.13.0] — 2026-06-09
 

diff --git a/conformance.toml b/conformance.toml
@@ -32,7 +32,7 @@
 
 [manifest]
 implementation = "openarmature-python"
-spec_pin = "v0.53.0"
+spec_pin = "v0.54.0"
 
 # Status values:
 #   implemented   — shipped behavior matches the proposal's contract
@@ -454,9 +454,9 @@ since = "0.12.0"
 #     not:
 #     ``tests/unit/test_observability_otel.py::
 #     test_invocation_span_carries_implementation_attribution_attributes``;
-#   - OTel always-emit invariant under ``disable_llm_payload``,
+#   - OTel always-emit invariant under ``disable_provider_payload``,
 #     ``disable_genai_semconv``, ``disable_llm_spans``:
-#     ``::test_invocation_span_attribution_emits_under_disable_llm_payload``;
+#     ``::test_invocation_span_attribution_emits_under_disable_provider_payload``;
 #   - OTel attributes emit on every invocation span across a
 #     reused observer (3 sequential invocations):
 #     ``::test_invocation_span_attribution_emits_on_every_invocation``;
@@ -593,3 +593,15 @@ since = "0.13.0"
 [proposals."0058"]
 status = "implemented"
 since = "0.13.0"
+
+# Spec v0.54.0 (proposal 0059).  Retrieval-provider capability —
+# the ``EmbeddingProvider`` protocol + ``EmbeddingEvent`` /
+# ``EmbeddingFailedEvent`` typed variants + OTel/Langfuse embedding
+# mapping.  Python has not yet shipped the embedding surface, so the
+# capability is not-yet.  The one piece adopted at this pin is the
+# proposal's cross-spec consequence: the observer-level privacy flag
+# ``disable_llm_payload`` is renamed ``disable_provider_payload`` (the
+# §5.5.4 rename touches existing LLM-payload gating, so it lands with
+# the pin even though the embedding capability does not).
+[proposals."0059"]
+status = "not-yet"
diff --git a/docs/agent/non-obvious-shapes.md b/docs/agent/non-obvious-shapes.md
@@ -48,9 +48,9 @@ else:
 
 The discriminator is one branch; missing it gives you empty data on tool-call responses and silently wrong behavior on truncations.
 
-### `disable_llm_payload` defaults to `True`: flip it for LLM-aware observability backends
+### `disable_provider_payload` defaults to `True`: flip it for LLM-aware observability backends
 
-The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_llm_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras).
+The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_provider_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras).
 
 That's the right default for general OpenArmature use: payloads may contain PII the user hasn't audited, and storage cost grows with prompt size. But it's the WRONG default if you're wiring up an LLM-aware observability backend (Langfuse, Phoenix, Honeycomb's LLM lens) that renders the message stream as part of its generation view. Backends will show "empty" generations and you'll wonder why.
 
@@ -61,7 +61,7 @@ from openarmature.observability import OTelObserver
 
 observer = OTelObserver(
     span_processor=your_exporter,
-    disable_llm_payload=False,   # opt in to message-payload attributes
+    disable_provider_payload=False,   # opt in to message-payload attributes
 )
 graph.attach_observer(observer)
 ```

diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md
@@ -702,12 +702,12 @@ source on your stack.
 ### LLM payload attributes
 
 By default, LLM spans do **not** carry the messages sent or the
-response content. Opt in with `disable_llm_payload=False`:
+response content. Opt in with `disable_provider_payload=False`:
 
 ```python
 observer = OTelObserver(
     span_processor=SimpleSpanProcessor(exporter),
-    disable_llm_payload=False,
+    disable_provider_payload=False,
 )
 ```
 
@@ -764,7 +764,7 @@ level (per llm-provider §3.1.2); only `source` is replaced. URL-form
 images pass through unchanged: the URL is a short string and is
 informative for trace readers.
 
-Redaction is **not** gated by `disable_llm_payload` and is **not**
+Redaction is **not** gated by `disable_provider_payload` and is **not**
 configurable. Inline image bytes never leave the provider in event
 form, so custom observers consuming
 [`LlmCompletionEvent` / `LlmFailedEvent`](#consuming-llm-events-in-custom-observers)
@@ -967,7 +967,7 @@ langfuse_client = Langfuse(
 )
 observer = LangfuseObserver(
     client=LangfuseSDKAdapter(langfuse_client),
-    disable_llm_payload=False,
+    disable_provider_payload=False,
 )
 ```
 
@@ -1014,15 +1014,15 @@ for a runnable demo.
 
 ### Payload + truncation
 
-`disable_llm_payload` mirrors the OTel observer's flag and defaults
+`disable_provider_payload` mirrors the OTel observer's flag and defaults
 to `True` for the same privacy reason. Flip to `False` to populate
 `generation.input` / `output` / `metadata.request_extras` from the
 LLM event payload.
 
 ```python
 observer = LangfuseObserver(
     client=client,
-    disable_llm_payload=False,
+    disable_provider_payload=False,
     payload_byte_cap=65536,
 )
 ```
@@ -1057,5 +1057,5 @@ graph.attach_observer(otel_observer)
 graph.attach_observer(langfuse_observer)
 ```
 
-Each observer's `disable_llm_spans` / `disable_llm_payload` flag is
+Each observer's `disable_llm_spans` / `disable_provider_payload` flag is
 independent; one MAY emit while the other suppresses.
diff --git a/docs/examples/langfuse-observability.md b/docs/examples/langfuse-observability.md
@@ -39,7 +39,7 @@ manual wiring at the call site.
   surfaces it on every Generation that renders from that prompt.
   Filesystem / in-memory backends without that reference work too,
   they just produce metadata-only linkage.
-- `disable_llm_payload=False` opt-in for capturing input messages +
+- `disable_provider_payload=False` opt-in for capturing input messages +
   output content on Generation observations. Default-off is the
   privacy posture; the demo deliberately flips it.
 - `correlation_id` cross-cutting metadata on the Trace and every
@@ -140,7 +140,7 @@ langfuse_client = Langfuse(
 )
 observer = LangfuseObserver(
     client=LangfuseSDKAdapter(langfuse_client),
-    disable_llm_payload=False,
+    disable_provider_payload=False,
 )
 ```
 
@@ -174,7 +174,7 @@ graph.attach_observer(OTelObserver(span_processor=batch))
 graph.attach_observer(LangfuseObserver(client=langfuse_client))
 ```
 
-Their `disable_llm_spans` / `disable_llm_payload` flags are
+Their `disable_llm_spans` / `disable_provider_payload` flags are
 independent. The `correlation_id` cross-cutting attribute is the join
 key: find a slow Generation in Langfuse, search for the
 `correlation_id` in OTel logs to see the surrounding infrastructure

diff --git a/docs/examples/production-observability.md b/docs/examples/production-observability.md
@@ -175,7 +175,7 @@ answer:      The primary objective of Apollo 11 was ...
 model:       gpt-4o-mini-2024-07-18
 
 --- captured OTel spans ---
-  [openarmature.invocation] 1240.0ms  openarmature.graph.entry_node='respond', openarmature.graph.spec_version='0.53.0', openarmature.implementation.name='openarmature-python', openarmature.implementation.version='0.13.0'
+  [openarmature.invocation] 1240.0ms  openarmature.graph.entry_node='respond', openarmature.graph.spec_version='0.54.0', openarmature.implementation.name='openarmature-python', openarmature.implementation.version='0.13.0'
   [respond] 1235.0ms  openarmature.node.name='respond', openarmature.user.tenantId='demo-acme', ...
   [openarmature.llm.complete] 1200.0ms  openarmature.user.tenantId='demo-acme', gen_ai.system='openai', gen_ai.usage.input_tokens=42, ...
   [persist] 2.0ms  openarmature.node.name='persist', openarmature.user.tenantId='demo-acme', ...
@@ -289,7 +289,7 @@ langfuse_observer = LangfuseObserver(
     ),
     trace_input_from_state=_trace_input,
     trace_output_from_state=_trace_output,
-    disable_llm_payload=False,
+    disable_provider_payload=False,
 )
 ```
 

diff --git a/examples/langfuse-observability/main.py b/examples/langfuse-observability/main.py
@@ -262,13 +262,13 @@ async def main() -> None:
     # ``LangfuseClient`` Protocol; the observer code doesn't change.
     client = InMemoryLangfuseClient()
 
-    # disable_llm_payload=False opts in to capturing the input messages
+    # disable_provider_payload=False opts in to capturing the input messages
     # and output content on Generation observations. Default is True
     # for the same privacy reason the OTel observer's flag exists:
     # payloads may contain PII the operator hasn't audited. Flip it
     # deliberately here because the demo's whole point is showing what
     # the model saw and returned.
-    observer = LangfuseObserver(client=client, disable_llm_payload=False)
+    observer = LangfuseObserver(client=client, disable_provider_payload=False)
 
     graph = build_graph()
     graph.attach_observer(observer)

diff --git a/examples/production-observability/main.py b/examples/production-observability/main.py
@@ -582,14 +582,14 @@ def build_graph() -> CompiledGraph[BriefingState]:
 # any OTLP-compatible backend.
 #
 # Caller hooks attach to LangfuseObserver via constructor kwargs.
-# ``disable_llm_payload=False`` opts in to capturing the input
+# ``disable_provider_payload=False`` opts in to capturing the input
 # messages + output content on Generation observations so the demo
 # output is meaningful; the default-True is the privacy-preserving
 # setting.
 
 
 def _build_otel_observer(exporter: InMemorySpanExporter) -> OTelObserver:
-    # ``disable_llm_payload=False`` opts in to capturing input messages
+    # ``disable_provider_payload=False`` opts in to capturing input messages
     # + output content on the LLM-call span (same flag the Langfuse
     # observer below flips for the same reason).  The example's whole
     # point is showing both backends seeing the same logical events;
@@ -600,14 +600,14 @@ def _build_otel_observer(exporter: InMemorySpanExporter) -> OTelObserver:
     return OTelObserver(
         span_processor=SimpleSpanProcessor(exporter),
         resource=Resource.create({"service.name": "openarmature-production-observability"}),
-        disable_llm_payload=False,
+        disable_provider_payload=False,
     )
 
 
 def _build_langfuse_observer(client: InMemoryLangfuseClient) -> LangfuseObserver:
     return LangfuseObserver(
         client=client,
-        disable_llm_payload=False,
+        disable_provider_payload=False,
         trace_input_from_state=_trace_input,
         trace_output_from_state=_trace_output,
     )
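
Reviewer note: the call sites updated above change only the keyword name, from `disable_llm_payload` to `disable_provider_payload`. For downstream projects that build observer kwargs from configuration, a small translation step could ease the move. The sketch below is an assumption about how such a step might look; `migrate_observer_kwargs` is a hypothetical helper and is not part of openarmature, which (per the changelog entry) makes the rename a plain breaking change.

```python
import warnings
from typing import Any

OLD_FLAG = "disable_llm_payload"
NEW_FLAG = "disable_provider_payload"


def migrate_observer_kwargs(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of kwargs with the old flag name replaced by the new one.

    Hypothetical helper for caller-side config; the value is carried over
    unchanged, and all other keys pass through untouched.
    """
    migrated = dict(kwargs)
    if OLD_FLAG not in migrated:
        return migrated
    if NEW_FLAG in migrated:
        # Refuse to guess which value the caller meant.
        raise TypeError(f"pass only {NEW_FLAG!r}, not both flag names")
    warnings.warn(
        f"{OLD_FLAG!r} was renamed to {NEW_FLAG!r}",
        DeprecationWarning,
        stacklevel=2,
    )
    migrated[NEW_FLAG] = migrated.pop(OLD_FLAG)
    return migrated
```

A caller could then pass the result through, for example `LangfuseObserver(client=client, **migrate_observer_kwargs(cfg))`, where `cfg` is the caller's own kwargs dict.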

diff --git a/openarmature-spec b/openarmature-spec
diff --git a/pyproject.toml b/pyproject.toml
@@ -63,7 +63,7 @@ Specification = "https://github.com/LunarCommand/openarmature-spec"
 openarmature = "openarmature.cli:main"
 
 [tool.openarmature]
-spec_version = "0.53.0"
+spec_version = "0.54.0"
 
 [dependency-groups]
 dev = [

diff --git a/src/openarmature/AGENTS.md b/src/openarmature/AGENTS.md
@@ -1,6 +1,6 @@
 # OpenArmature — Agent documentation
 
-*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.53.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
+*This is the agent guide bundled with the openarmature Python package, version 0.13.0 (spec v0.54.0). For the full docs site see [openarmature.ai](https://openarmature.ai). For the canonical spec text see [openarmature.org/capabilities](https://openarmature.org/capabilities/). For project-specific conventions for the code you're editing, see the host project's `AGENTS.md` or `CLAUDE.md`.*
 
 ## TL;DR
 
@@ -10,7 +10,7 @@ OpenArmature is a workflow framework for LLM pipelines and tool-calling agents:
 
 ## Capability contracts
 
-_Sourced from openarmature-spec v0.53.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
+_Sourced from openarmature-spec v0.54.0. Each entry below reproduces §1 (Purpose) and §2 (Concepts) of the capability's `spec.md` verbatim — including additions from accepted proposals that this Python implementation may not yet ship. For per-proposal implementation status (implemented / partial / textual-only / not-yet), see the `conformance.toml` manifest at the repo root. For the full spec text (execution model, error semantics, determinism, observer hooks, etc.) see the linked docs site._
 
 ### Capability: `graph-engine`
 
@@ -1377,9 +1377,9 @@ else:
 
 The discriminator is one branch; missing it gives you empty data on tool-call responses and silently wrong behavior on truncations.
 
-### `disable_llm_payload` defaults to `True`: flip it for LLM-aware observability backends
+### `disable_provider_payload` defaults to `True`: flip it for LLM-aware observability backends
 
-The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_llm_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras).
+The `OTelObserver` (and any spec-conformant observer reading LLM events) defaults `disable_provider_payload: bool = True` per spec §5.5's "default-off by privacy" framing. Without flipping the flag, LLM spans carry GenAI semconv attributes (token counts, model name, finish reason) but NOT the message payload (input messages, response content, request extras).
 
 That's the right default for general OpenArmature use: payloads may contain PII the user hasn't audited, and storage cost grows with prompt size. But it's the WRONG default if you're wiring up an LLM-aware observability backend (Langfuse, Phoenix, Honeycomb's LLM lens) that renders the message stream as part of its generation view. Backends will show "empty" generations and you'll wonder why.
 
@@ -1390,7 +1390,7 @@ from openarmature.observability import OTelObserver
 
 observer = OTelObserver(
     span_processor=your_exporter,
-    disable_llm_payload=False,   # opt in to message-payload attributes
+    disable_provider_payload=False,   # opt in to message-payload attributes
 )
 graph.attach_observer(observer)
 ```

diff --git a/src/openarmature/__init__.py b/src/openarmature/__init__.py
@@ -25,7 +25,7 @@
 """
 
 __version__ = "0.13.0"
-__spec_version__ = "0.53.0"
+__spec_version__ = "0.54.0"
 # Proposal 0052 (spec observability §5.1 / §8.4.1): canonical
 # package-registry name for this implementation. Surfaces on every
 # OTel invocation span as ``openarmature.implementation.name`` and on

diff --git a/src/openarmature/graph/events.py b/src/openarmature/graph/events.py
@@ -461,7 +461,7 @@ class InvocationCompletedEvent:
 # which already enforces the redaction. The three payload-bearing
 # fields (input_messages, output_content, request_extras) are
 # populated unconditionally on the typed event per §5.5.7; observer-
-# side privacy gates (OTel disable_llm_payload, Langfuse equivalents)
+# side privacy gates (OTel disable_provider_payload, Langfuse equivalents)
 # apply at rendering, symmetric with the §5.5.1 span attribute path.
 # Custom queryable observers (per observability §9) own their own
 # redaction posture — gating belongs at rendering with the consumer's
@@ -597,7 +597,7 @@ class LlmCompletionEvent:
 #
 # Privacy posture identical to LlmCompletionEvent: input_messages /
 # request_params / request_extras are populated unconditionally per
-# §5.5.7; observer-side privacy gates (OTel disable_llm_payload,
+# §5.5.7; observer-side privacy gates (OTel disable_provider_payload,
 # Langfuse equivalents) apply at rendering. Inline image bytes are
 # redacted per observability §5.5.5 before population. Custom
 # queryable observers own their own redaction posture.
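
Reviewer note: the comments above describe a split where the typed event always carries the payload fields and the observer applies the privacy gate when it renders. The sketch below illustrates that split under stated assumptions: `SketchLlmEvent`, `render_attributes`, and the attribute keys are hypothetical stand-ins, not the real `LlmCompletionEvent` or either bundled observer's rendering code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class SketchLlmEvent:
    """Hypothetical stand-in for a typed LLM event; payload is always populated."""

    model: str
    input_tokens: int
    input_messages: tuple[str, ...]
    output_content: str


def render_attributes(
    event: SketchLlmEvent, *, disable_provider_payload: bool = True
) -> dict[str, Any]:
    """Build span-style attributes; payload keys appear only when the gate is open."""
    attrs: dict[str, Any] = {
        "model": event.model,
        "input_tokens": event.input_tokens,
    }
    if not disable_provider_payload:
        # Opt-in path: copy the payload from the event into the rendered output.
        attrs["input_messages"] = list(event.input_messages)
        attrs["output_content"] = event.output_content
    return attrs


event = SketchLlmEvent(
    model="example-model",
    input_tokens=42,
    input_messages=("What was the goal of Apollo 11?",),
    output_content="To land a crew on the Moon and return them safely.",
)
suppressed = render_attributes(event)  # default: payload withheld
opted_in = render_attributes(event, disable_provider_payload=False)
```

The event itself is identical in both calls; only the rendered output differs, which is why a custom observer consuming the event has to decide its own redaction posture.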
+22 −0		CHANGELOG.md
+3 −2		README.md
+1 −0		docs/capabilities/retrieval-provider.md
+9 −0		docs/compatibility.md
+68 −0		docs/open-questions.md
+2 −1		docs/proposals.md
+1 −0		docs/proposals/0059-retrieval-provider-embedding.md
+1 −0		mkdocs.yml
+627 −0		proposals/0059-retrieval-provider-embedding.md
+96 −4		spec/graph-engine/spec.md
+4 −4		spec/observability/conformance/012-otel-llm-payload-default-off.md
+2 −2		spec/observability/conformance/012-otel-llm-payload-default-off.yaml
+3 −3		spec/observability/conformance/013-otel-llm-payload-enabled.md
+3 −3		spec/observability/conformance/013-otel-llm-payload-enabled.yaml
+1 −1		spec/observability/conformance/014-otel-llm-payload-truncation.md
+2 −2		spec/observability/conformance/014-otel-llm-payload-truncation.yaml
+1 −1		spec/observability/conformance/015-otel-llm-payload-image-redaction.md
+2 −2		spec/observability/conformance/015-otel-llm-payload-image-redaction.yaml
+4 −4		spec/observability/conformance/018-otel-llm-request-extras.md
+3 −3		spec/observability/conformance/018-otel-llm-request-extras.yaml
+1 −1		spec/observability/conformance/022-langfuse-basic-trace.yaml
+2 −2		spec/observability/conformance/023-langfuse-generation-rendering.md
+4 −4		spec/observability/conformance/023-langfuse-generation-rendering.yaml
+1 −1		spec/observability/conformance/037-langfuse-trace-input-output.md
+32 −0		spec/observability/conformance/074-embedding-event-dispatch.md
+76 −0		spec/observability/conformance/074-embedding-event-dispatch.yaml
+33 −0		spec/observability/conformance/075-embedding-failure-event-dispatch-on-provider-unavailable.md
+63 −0		spec/observability/conformance/075-embedding-failure-event-dispatch-on-provider-unavailable.yaml
+25 −0		spec/observability/conformance/076-embedding-event-mutual-exclusion.md
+85 −0		spec/observability/conformance/076-embedding-event-mutual-exclusion.yaml
+24 −0		spec/observability/conformance/077-embedding-event-call-id-distinct.md
+59 −0		spec/observability/conformance/077-embedding-event-call-id-distinct.yaml
+26 −0		spec/observability/conformance/078-embedding-event-input-strings-populated.md
+46 −0		spec/observability/conformance/078-embedding-event-input-strings-populated.yaml
+28 −0		spec/observability/conformance/079-embedding-event-request-params-populated.md
+87 −0		spec/observability/conformance/079-embedding-event-request-params-populated.yaml
+26 −0		spec/observability/conformance/080-embedding-event-input-count-and-dimensions-populated.md
+49 −0		spec/observability/conformance/080-embedding-event-input-count-and-dimensions-populated.yaml
+27 −0		spec/observability/conformance/081-embedding-event-active-prompt-populated.md
+76 −0		spec/observability/conformance/081-embedding-event-active-prompt-populated.yaml
+35 −0		spec/observability/conformance/082-otel-embedding-span-attributes.md
+65 −0		spec/observability/conformance/082-otel-embedding-span-attributes.yaml
+39 −0		spec/observability/conformance/083-langfuse-embedding-observation.md
+123 −0		spec/observability/conformance/083-langfuse-embedding-observation.yaml
+146 −16		spec/observability/spec.md
+34 −0		spec/retrieval-provider/conformance/001-embed-positive-control.md
+80 −0		spec/retrieval-provider/conformance/001-embed-positive-control.yaml
+32 −0		spec/retrieval-provider/conformance/002-embed-model-binding-error.md
+42 −0		spec/retrieval-provider/conformance/002-embed-model-binding-error.yaml
+27 −0		spec/retrieval-provider/conformance/003-embed-malformed-response-mismatched-vector-count.md
+43 −0		spec/retrieval-provider/conformance/003-embed-malformed-response-mismatched-vector-count.yaml
+26 −0		spec/retrieval-provider/conformance/004-embed-malformed-response-inconsistent-dimensions.md
+45 −0		spec/retrieval-provider/conformance/004-embed-malformed-response-inconsistent-dimensions.yaml
+25 −0		spec/retrieval-provider/conformance/005-embed-input-order-preserved.md
+52 −0		spec/retrieval-provider/conformance/005-embed-input-order-preserved.yaml
+220 −0		spec/retrieval-provider/spec.md