Enrich error/success status spans with failure context #253
_ERROR_TERMINAL_STATUSES = frozenset(
    s.value
    for s in (
        bts.ContainerExecutionStatus.FAILED,
        bts.ContainerExecutionStatus.SYSTEM_ERROR,
    )
)
FYI
_ERROR_TERMINAL_STATUSES = frozenset({
    bts.ContainerExecutionStatus.FAILED,
    bts.ContainerExecutionStatus.SYSTEM_ERROR,
})
🤖 Adopted — switched to frozenset({...}) with the enum members directly. Simpler since ContainerExecutionStatus is a str enum.
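For readers wondering why the set of enum members still matches the raw status string read from the history, here is a minimal sketch. The enum below is a stand-in for bts.ContainerExecutionStatus, and its member values are assumptions; only the str-mixin behaviour is the point.

```python
from enum import Enum


class ContainerExecutionStatus(str, Enum):
    """Stand-in for bts.ContainerExecutionStatus; member values are assumed."""

    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    SYSTEM_ERROR = "SYSTEM_ERROR"


_ERROR_TERMINAL_STATUSES = frozenset({
    ContainerExecutionStatus.FAILED,
    ContainerExecutionStatus.SYSTEM_ERROR,
})


def is_error_terminal(status: str) -> bool:
    # A str-mixin enum member compares and hashes like its string value,
    # so a plain status string matches the enum member stored in the set.
    return status in _ERROR_TERMINAL_STATUSES
```

With a plain Enum (no str mixin) the same membership test would fail for raw strings, which is presumably why the original version collected s.value.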
).end(end_time=_ns(dt=t_end))

if history[-1]["status"] in _ERROR_TERMINAL_STATUSES:
    root.set_status(status=StatusCode.ERROR)
Curious: if you don't set the status, is it assumed to be "successful" by default?
🤖 Correct — OTel default status is UNSET which means "no error". We only explicitly set ERROR for failed/system_error executions.
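The rule described in this reply can be sketched as a pure function. The StatusCode enum here is a local stand-in mirroring the three OTel span status codes rather than the real opentelemetry import, and the status strings and the root_status_code name are assumptions for illustration.

```python
from enum import Enum


class StatusCode(Enum):
    """Local stand-in mirroring the three OTel span status codes."""

    UNSET = 0
    OK = 1
    ERROR = 2


# Assumed string values of the FAILED and SYSTEM_ERROR statuses.
ERROR_TERMINAL_STATUSES = frozenset({"FAILED", "SYSTEM_ERROR"})


def root_status_code(history: list[dict[str, str]]) -> StatusCode:
    """Return the status code the root span would end up with."""
    if history and history[-1]["status"] in ERROR_TERMINAL_STATUSES:
        return StatusCode.ERROR
    # No explicit set_status call is made, so the span keeps its default.
    return StatusCode.UNSET
```

Note that a successful execution ends up UNSET rather than OK under this rule; OK would have to be set explicitly.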
assert err_span.attributes["error.message"] == "RuntimeError"
assert err_span.attributes["exception.stacktrace"] == "Traceback..."
Wondering: would
- exception.message
- exception.stacktrace
make more sense, since that aligns with the SQL keys? Or is "error" not from an "exception"?
🤖 Good call — renamed to exception.message to align with OTel semantic conventions.
if execution.container_execution_id is not None:
    ce = execution.container_execution
    if ce is not None and ce.exit_code is not None:
        attrs["execution.exit_code"] = ce.exit_code
I saw you using the walrus operator in the upstream PR, and it got me thinking:
def _error_attrs(*, execution: bts.ExecutionNode, status: str) -> dict[str, object]:
    """Extra attributes for terminal error/success status spans."""
    extra = execution.extra_data or {}
    attrs: dict[str, object] = {}

    def _set_exit_code() -> None:
        if (ce := execution.container_execution) and ce.exit_code is not None:
            attrs["execution.exit_code"] = ce.exit_code

    if status == bts.ContainerExecutionStatus.FAILED:
        msg = extra.get(bts.EXECUTION_NODE_EXTRA_DATA_ORCHESTRATION_ERROR_MESSAGE_KEY)
        if msg is not None:
            attrs["error.message"] = msg
        _set_exit_code()
    elif status == bts.ContainerExecutionStatus.SYSTEM_ERROR:
        msg = extra.get(
            bts.EXECUTION_NODE_EXTRA_DATA_SYSTEM_ERROR_EXCEPTION_MESSAGE_KEY
        )
        if msg is not None:
            attrs["error.message"] = msg
        tb = extra.get(bts.EXECUTION_NODE_EXTRA_DATA_SYSTEM_ERROR_EXCEPTION_FULL_KEY)
        if tb is not None:
            attrs["exception.stacktrace"] = tb
    elif status == bts.ContainerExecutionStatus.SUCCEEDED:
        _set_exit_code()
    return attrs
🤖 Adopted this — cleaner with the walrus operator and _set_exit_code helper.
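To try the suggested helper outside the repo, here is a minimal self-contained sketch. The dataclasses stand in for the bts models, the status strings are assumed values, and the two system-error key strings are hypothetical (the thread only names orchestration_error_message); the attribute keys follow the suggestion as written above.

```python
from dataclasses import dataclass
from typing import Optional

# Key strings: the first matches the name used in this PR's description;
# the other two are hypothetical placeholders for the bts constants.
ORCHESTRATION_ERROR_MESSAGE_KEY = "orchestration_error_message"
SYSTEM_ERROR_MESSAGE_KEY = "system_error_exception_message"
SYSTEM_ERROR_FULL_KEY = "system_error_exception_full"


@dataclass
class ContainerExecution:
    """Stand-in for the bts container execution model."""

    exit_code: Optional[int] = None


@dataclass
class ExecutionNode:
    """Stand-in for bts.ExecutionNode."""

    extra_data: Optional[dict] = None
    container_execution: Optional[ContainerExecution] = None


def error_attrs(*, execution: ExecutionNode, status: str) -> dict[str, object]:
    """Extra attributes for terminal error/success status spans."""
    extra = execution.extra_data or {}
    attrs: dict[str, object] = {}

    def set_exit_code() -> None:
        # "is not None" keeps a legitimate exit code of 0.
        if (ce := execution.container_execution) and ce.exit_code is not None:
            attrs["execution.exit_code"] = ce.exit_code

    if status == "FAILED":
        if (msg := extra.get(ORCHESTRATION_ERROR_MESSAGE_KEY)) is not None:
            attrs["error.message"] = msg
        set_exit_code()
    elif status == "SYSTEM_ERROR":
        if (msg := extra.get(SYSTEM_ERROR_MESSAGE_KEY)) is not None:
            attrs["error.message"] = msg
        if (tb := extra.get(SYSTEM_ERROR_FULL_KEY)) is not None:
            attrs["exception.stacktrace"] = tb
    elif status == "SUCCEEDED":
        set_exit_code()
    return attrs
```

For example, a FAILED node with exit code 137 and an orchestration message yields both error.message and execution.exit_code, while a SUCCEEDED node with exit code 0 still reports execution.exit_code because the check is against None rather than truthiness.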
PS: This entire stack is written by AI with supervision. To make your suggestions stick permanently, please add them to CLAUDE.md.
- FAILED span: error.message from orchestration_error_message, exit_code
- SYSTEM_ERROR span: error.message + exception.stacktrace from extra_data
- SUCCEEDED span: exit_code from ContainerExecution
- Root span: set_status(ERROR) for FAILED and SYSTEM_ERROR terminals

Enrich execution trace spans with error details and root span status
- FAILED span: Attaches error.message from orchestration_error_message in extra_data and execution.exit_code from the associated ContainerExecution when available.
- SYSTEM_ERROR span: Attaches error.message from the system error exception message and exception.stacktrace from the full exception traceback stored in extra_data.
- SUCCEEDED span: Attaches execution.exit_code from the associated ContainerExecution when available.
- Root span: Calls set_status(ERROR) when the terminal status is FAILED or SYSTEM_ERROR, leaving the root span unmarked for successful executions.