What happened?
Problem
The @trace_class(kind=SpanKind.SERVER) decorator is applied to EventQueue and related high-frequency classes without an exclude_list:
- a2a/server/events/event_queue.py: EventQueue (v0.3.x) / EventQueueLegacy (v1.0.x)
- a2a/server/events/event_consumer.py: EventConsumer
- a2a/server/events/in_memory_queue_manager.py: InMemoryQueueManager
This causes a span to be created for every call to high-frequency methods like:
- EventQueue.enqueue_event: called once per streamed LLM token
- EventQueue.dequeue_event: called once per streamed LLM token
- EventQueue.task_done: called once per streamed LLM token
For a typical LLM streaming response of ~500 tokens, these three calls alone generate 1500+ internal spans per session (3 spans per token). Most of them provide no actionable observability value, since they represent fine-grained internal queue operations rather than meaningful request-level events.
Impact
- Breaks span-quota-limited systems: AWS Bedrock AgentCore Online Evaluation has a hard limit of 1000 spans per evaluated session. Sessions exceeding this limit are silently skipped, leaving evaluations unusable for any A2A-based agent doing non-trivial LLM streaming.
- Increased observability costs: CloudWatch Logs storage, network bandwidth, and memory overhead for spans that are mostly noise.
- Approaches span size quotas: A single session with many internal spans can approach the 15 MB/session span data limit.
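To make the quota arithmetic concrete, here is a rough back-of-the-envelope sketch. It assumes exactly three queue spans per streamed token (the three methods listed above) and uses the ~53 spans per session measured under Verification as the baseline of non-queue spans. The constant and function names are illustrative only and are not part of the SDK.

```python
# Rough estimate only; all names here are hypothetical, not SDK identifiers.
SPANS_PER_TOKEN = 3        # enqueue_event + dequeue_event + task_done
SESSION_SPAN_LIMIT = 1000  # AgentCore Online Evaluation limit quoted above
BASELINE_SPANS = 53        # assumption: non-queue spans, from the Verification figure


def estimated_spans(tokens: int, baseline: int = BASELINE_SPANS) -> int:
    """Estimate total spans for a session that streams `tokens` tokens."""
    return baseline + tokens * SPANS_PER_TOKEN


def max_tokens_within_limit(
    limit: int = SESSION_SPAN_LIMIT, baseline: int = BASELINE_SPANS
) -> int:
    """Largest token count whose estimated span total stays within `limit`."""
    return max(0, (limit - baseline) // SPANS_PER_TOKEN)


print(estimated_spans(500))        # estimated spans for a ~500-token response
print(max_tokens_within_limit())   # tokens that still fit under the limit
```

Under these assumptions, a session would cross the 1000-span limit after only a few hundred streamed tokens, which is why even a modest response is affected.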
Environment variable OTEL_INSTRUMENTATION_A2A_SDK_ENABLED=false is too coarse
The existing environment variable disables all A2A tracing, including the useful RequestHandler-level spans. There is no way to selectively disable only the high-frequency internal spans.
Current workaround
Users can apply a runtime monkey-patch at application startup to unwrap the @trace_class decorator on the high-frequency classes, restoring the original methods via __wrapped__ (which is preserved by functools.wraps in trace_function). This is fragile and requires knowledge of internal SDK structure.
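For illustration, the unwrapping idea can be sketched without the SDK. In the sketch below, `fake_trace`, `DemoQueue`, and `unwrap_methods` are hypothetical stand-ins written for this example. The only behavior relied on is that `functools.wraps` sets `__wrapped__` on the wrapper. In a real application, the class and method names passed to the helper would be the SDK's own, which is exactly the internal knowledge that makes this workaround fragile.

```python
import functools

SPAN_COUNT = {"n": 0}  # stands in for spans emitted by a real tracer


def fake_trace(func):
    """Hypothetical stand-in for a tracing wrapper: counts calls instead of creating spans."""
    @functools.wraps(func)  # functools.wraps stores the original as wrapper.__wrapped__
    def wrapper(*args, **kwargs):
        SPAN_COUNT["n"] += 1
        return func(*args, **kwargs)
    return wrapper


class DemoQueue:
    """Hypothetical stand-in for a traced high-frequency class."""

    def __init__(self):
        self.items = []

    @fake_trace
    def enqueue_event(self, event):
        self.items.append(event)


def unwrap_methods(cls, method_names):
    """Restore the original functions on `cls` for each wrapped method name.

    Names that are missing or not wrapped are skipped. Returns the restored names.
    """
    restored = []
    for name in method_names:
        attr = cls.__dict__.get(name)
        original = getattr(attr, "__wrapped__", None)
        if original is not None:
            setattr(cls, name, original)
            restored.append(name)
    return restored


queue = DemoQueue()
queue.enqueue_event("token-1")  # goes through the wrapper and is counted
restored = unwrap_methods(DemoQueue, ["enqueue_event", "dequeue_event"])
queue.enqueue_event("token-2")  # calls the original method, not counted
print(restored, SPAN_COUNT["n"])
```

Patching the class rather than individual instances means that queues created later are also unwrapped, but the patch has to run before the first request is served.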
Proposed fix
Add an exclude_list (or equivalent) to the @trace_class application on the high-frequency classes. For example:
# a2a/server/events/event_queue.py
@trace_class(
    kind=SpanKind.SERVER,
    exclude_list=['enqueue_event', 'dequeue_event', 'task_done', 'clear_events'],
)
class EventQueue:
    ...
Similar changes would apply to EventConsumer and InMemoryQueueManager. The high-frequency internal methods would no longer generate spans, while the class-level tracing decorator stays in place for the remaining methods and for any that are added in the future.
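The effect of an exclude list can be shown with a toy class decorator. This is not the SDK's trace_class implementation: `toy_trace_class`, `ToyQueue`, and the `SPANS` list are hypothetical names for this sketch, and the rule of skipping underscore-prefixed names is an assumption made here for simplicity.

```python
import functools

SPANS = []  # names of "spans" recorded by the toy tracer


def toy_trace_function(func, span_name):
    """Wrap `func` so that each call records `span_name` in SPANS."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        SPANS.append(span_name)
        return func(*args, **kwargs)
    return wrapper


def toy_trace_class(exclude_list=None):
    """Toy class decorator: wrap public methods except those in `exclude_list`."""
    excluded = set(exclude_list or [])

    def decorator(cls):
        for name, attr in list(vars(cls).items()):
            # Assumption for this sketch: skip private/dunder names and non-callables.
            if name.startswith("_") or name in excluded or not callable(attr):
                continue
            setattr(cls, name, toy_trace_function(attr, f"{cls.__name__}.{name}"))
        return cls

    return decorator


@toy_trace_class(exclude_list=["enqueue_event", "dequeue_event", "task_done"])
class ToyQueue:
    """Hypothetical queue used only to exercise the decorator."""

    def __init__(self):
        self.events = []

    def enqueue_event(self, event):
        self.events.append(event)

    def dequeue_event(self):
        return self.events.pop(0)

    def task_done(self):
        pass

    def close(self):
        self.events.clear()


q = ToyQueue()
for token in range(500):  # simulate a ~500-token streamed response
    q.enqueue_event(token)
    q.dequeue_event()
    q.task_done()
q.close()
print(len(SPANS), SPANS)
```

In this toy setup the 1500 per-token calls record nothing, and only the single non-excluded call to `close` is traced.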
Verification
I have verified locally that:
- The trace_class mechanism already supports exclude_list
- Applying the fix reduces spans from 1500+ to ~53 per session (97% reduction)
- Useful RequestHandler traces (DefaultRequestHandler, JSONRPCHandler, RESTHandler) and client transport traces are preserved
Happy to submit a PR with the proposed changes if this direction is acceptable.