[ISSUE #10537] Cache RPC Attributes and use processTimerElapsedMs in NettyRemotingAbstract#10539
Conversation
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
Caches per-request-code OpenTelemetry Attributes objects and replaces Stopwatch.elapsed(TimeUnit.MILLISECONDS) with processTimerElapsedMs() (nanoTime-based) across the remoting layer. Also adds cached Constructor lookup for custom header classes and reduces default HashMap capacity. Multiple coordinated micro-optimizations targeting the broker hot path.
Findings
-
[Warning]
NettyRemotingServer.java:432-433— Changedlog.info("channel writable false. host:{}", ...)tolog.debug(...). This is a behavioral change: production users who relied on INFO-level channel writability warnings for backpressure monitoring will lose this visibility. If this was intentional (to reduce log noise), it is fine, but worth confirming. -
[Info]
RemotingMetricsManager.java— The attributes cache uses aConcurrentHashMapwith a volatile fast-path (lastAttrsCacheKey/lastCachedAttrs). The non-atomic read-then-write pattern means two threads with the same cache key could both rebuildAttributes, butcomputeIfAbsentensures correctness. The volatile fast-path is a reasonable optimization for the common case where the same request type dominates. -
[Info]
RemotingMetricsManager.java:214-216— Cache key encoding:(requestCode << 19) | (responseCode << 3) | (isLongPolling ? 4 : 0) | resultIndex. This assumes requestCode fits in 16 bits (max 65535) and responseCode fits in 16 bits. Both are safe given currentResponseCodeandRequestCodevalues. Consider adding a comment documenting these bit-width assumptions for future maintainers. -
[Info]
RemotingCommand.java— ReplacingStopwatch processTimerwithlong processTimeStartNanoseliminates one object allocation per request. The newprocessTimerElapsedMs()method correctly computes elapsed milliseconds fromSystem.nanoTime(). The deprecatedgetProcessTimer()now returns a freshStopwatchwhich would give incorrect elapsed time if called — the@Deprecatedannotation is appropriate. -
[Info]
RemotingCommand.java:231-256— CachedConstructorlookup viaCONSTRUCTOR_CACHE(ConcurrentHashMap withcomputeIfAbsent). Thread-safe. ThesetAccessible(true)call is necessary for non-public constructors. Clean. -
[Info]
RemotingCommand.java:341— HashMap initial capacity reduced from 256 to 16. With default load factor 0.75, this supports 12 entries before resizing — sufficient for typical header field counts. Good optimization. -
[Info]
TopicQueueMappingContext.java— AddedEMPTYstatic singleton to avoid allocating empty context objects. Clean.
Suggestions
-
The log level change in
NettyRemotingServer.java(channel writability) is the most notable behavioral change. If intentional, no action needed. -
Consider adding a brief comment on the cache key encoding in
RemotingMetricsManagerdocumenting the bit-width assumptions (requestCode ≤ 16 bits, responseCode ≤ 16 bits).
Overall: Well-structured set of micro-optimizations with consistent patterns. The Stopwatch → nanoTime replacement is a clean improvement.
Automated review by github-manager-bot
5c2f5cc to
bb222a5
Compare
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot (Re-review after new commits)
Summary
PR squashed into 2 commits. Core approach unchanged: cache OpenTelemetry Attributes objects and replace Stopwatch with long processTimerNanos to eliminate per-RPC allocations. New log.isDebugEnabled() guard added for debug logging.
CI Status
CI is failing — SpotBugs reports 4 ES_COMPARING_PARAMETER_STRING_WITH_EQ violations in RemotingMetricsManager.getOrBuildAttributes() (lines 102-105). The result == RESULT_SUCCESS style comparisons on a method parameter trigger SpotBugs even though the values are interned constants.
Findings
-
[Critical]
RemotingMetricsManager.java:102-105— String==comparison on method parameterresult. Fix: use.equals()or add@SuppressFBWarnings("ES_COMPARING_PARAMETER_STRING_WITH_EQ")with justification that all callers pass interned constant strings. The.equals()approach is safer and preferred. -
[Info]
RemotingMetricsManager.java— Cache key encoding(requestCode << 19) | (responseCode << 3) | (isLongPolling ? 4 : 0) | resultIdxis correct for current value ranges. ThelastAttrsCacheKey/lastCachedAttrsvolatile fast-path is a reasonable optimization for the dominant-case pattern. -
[Info]
RemotingCommand.java—processTimerElapsedMs()correctly computes elapsed ms fromSystem.nanoTime(). The deprecatedgetProcessTimer()returning a freshStopwatchis a safe backward-compat shim. -
[Info]
NettyRemotingAbstract.java—log.isDebugEnabled()guard avoids string concatenation overhead when debug is disabled. Good.
Suggestions
- Fix the SpotBugs violations to unblock CI — this is the only blocker.
- Consider adding a brief comment on the cache key bit-width assumptions for future maintainers.
Automated re-review by github-manager-bot
f5a1d4b to
be98520
Compare
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot (Re-review after new commits)
Summary
New squashed commit. Core approach unchanged: cache OpenTelemetry Attributes via ConcurrentHashMap + volatile fast-path, and replace Stopwatch with long processTimerNanos.
CI Status
CI is still failing — SpotBugs reports 4 ES_COMPARING_PARAMETER_STRING_WITH_EQ violations in RemotingMetricsManager.getOrBuildAttributes():
RemotingMetricsManager.java:102 — result == RESULT_SUCCESS
RemotingMetricsManager.java:103 — result == RESULT_ONEWAY
RemotingMetricsManager.java:104 — result == RESULT_WRITE_CHANNEL_FAILED
RemotingMetricsManager.java:105 — result == RESULT_CANCELED
This was flagged in the previous review but not yet fixed. SpotBugs treats == comparison on a method parameter as a bug even when the constants are interned, because the parameter type is String and callers could pass non-interned strings.
Fix Required
Replace == with .equals() in getOrBuildAttributes():
public Attributes getOrBuildAttributes(int requestCode, int responseCode,
boolean isLongPolling, String result) {
int resultIdx;
if (RESULT_SUCCESS.equals(result)) resultIdx = 0;
else if (RESULT_ONEWAY.equals(result)) resultIdx = 1;
else if (RESULT_WRITE_CHANNEL_FAILED.equals(result)) resultIdx = 2;
else if (RESULT_CANCELED.equals(result)) resultIdx = 3;
else resultIdx = -1;
// ... rest unchangedFindings
-
[Critical]
RemotingMetricsManager.java:102-105— Not fixed. Changeresult == CONSTANTtoCONSTANT.equals(result)to resolve SpotBugsES_COMPARING_PARAMETER_STRING_WITH_EQ. This is blocking CI. -
[Info] The
ConcurrentHashMap<Long, Attributes>+ volatile fast-path caching strategy is sound. ThelastAttrsCacheKey/lastCachedAttrsvolatile pair avoids HashMap lookup on the hottest path. Good optimization. -
[Info] The bit-packing key
((long) requestCode << 19) | ((responseCode & 0xFFFF) << 3) | ...is correct given the value ranges (requestCode fits in 13 bits, responseCode in 16 bits).
Automated review by github-manager-bot
c90c171 to
611d8ea
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #10539 +/- ##
=============================================
- Coverage 48.28% 48.21% -0.08%
+ Complexity 13445 13432 -13
=============================================
Files 1378 1378
Lines 100845 100860 +15
Branches 13044 13049 +5
=============================================
- Hits 48695 48628 -67
- Misses 46201 46263 +62
- Partials 5949 5969 +20 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot (Re-review after new commit 611d8ea)
Summary
New commit confirmed — OpenTelemetry Attributes are now cached per request-code/response-code combination using a ConcurrentHashMap. Debug log string concatenation is also guarded with log.isDebugEnabled().
Assessment
- [Correctness] The bit-packing key
(requestCode << 19) | (responseCode << 3) | flagscorrectly encodes the attribute combination. TheresultIdxmapping is consistent with the string constants. Cache invalidation is not needed since the set of request/response codes is finite and small. - [Performance] Eliminates per-request
Attributesobject allocation and string formatting on the metrics hot path. TheisDebugEnabled()guard avoids unnecessary string concatenation. Good optimization. - [Thread Safety]
ConcurrentHashMap.computeIfAbsentis used correctly for thread-safe lazy initialization. - [Info] One minor note: the cache is unbounded. In practice the number of distinct (requestCode, responseCode, isLongPolling, result) combinations is small and bounded, so this is not a concern.
LGTM — all CI checks pass. Clean performance improvement. ✅
Automated by github-manager-bot
611d8ea to
84e42d4
Compare
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot (Re-review after new commit)
Summary
Re-review triggered by new commit 84e42d4a. This PR caches Attributes objects in RemotingMetricsManager to avoid per-RPC allocation on the hot path, and guards debug log string concatenation with isDebugEnabled().
Findings
-
[Info] remoting/.../RemotingMetricsManager.java — The
attributesCache(ConcurrentHashMap<Long, Attributes>) has no eviction policy. In practice, RocketMQ has a bounded set of request/response codes so the cache should stabilize at a manageable size. However, consider documenting the expected upper bound or adding a size guard (e.g.,if (cache.size() > THRESHOLD) return buildAttributes(...)) as a safety net against unexpected growth. -
[Info] remoting/.../RemotingMetricsManager.java:112-115 — Bit-packing key construction:
requestCodeis shifted by 19 bits. This assumesrequestCodefits in 19 bits (0–524287). RocketMQ request codes are well within this range, but a defensive comment documenting the assumption would help future maintainers. -
[Info] remoting/.../NettyRemotingAbstract.java — Good improvement: extracting
processTimerElapsedMsand using it consistently acrossprocessResponseCommand,processRequestCommand, andprocessResponseCommandAsync. TheisDebugEnabled()guard before string concatenation is a solid optimization for the hot path. -
[Info] remoting/.../RemotingMetricsConstant.java — Renaming labels to include
_KEYsuffix improves clarity. Clean migration.
Positive Aspects
- Well-structured
getOrBuildAttributeswith clear result-index mapping - Comprehensive test coverage including cache hit verification and unknown-result bypass
- Thread-safe via
ConcurrentHashMap.computeIfAbsent - Good deduplication of metrics recording across writeResponse methods
LGTM 👍
Automated re-review by github-manager-bot
- RemotingMetricsManager: add getOrBuildAttributes() with long-key cache - NettyRemotingAbstract: use cached Attributes instead of per-RPC AttributesBuilder - NettyRemotingAbstract: add log.isDebugEnabled() guard for debug log - RemotingMetricsConstant: add LABEL_*_KEY static AttributeKey instances
84e42d4 to
f7bef92
Compare
RockteMQ-AI
left a comment
There was a problem hiding this comment.
Review by github-manager-bot
Summary
Caches metrics Attributes objects in a ConcurrentHashMap keyed by a packed long value combining requestCode, responseCode, isLongPolling, and result index. Also introduces processTimerElapsedMs() on RemotingCommand to avoid TimeUnit imports and Stopwatch overhead.
Findings
- [Warning] Key packing in
getOrBuildAttributes()— The key packs requestCode (16 bits) + responseCode (15 bits) + isLongPolling (1 bit) + resultIdx (4 bits) into a long. This assumes response codes fit in 15 bits (max 32767). If RocketMQ ever defines response codes above 32767, there will be silent key collisions. - [Info]
ConcurrentHashMapfor attributes cache — Appropriate for concurrent access. Consider adding a size bound or eviction policy to prevent unbounded growth if the cardinality of (requestCode, responseCode, isLongPolling, result) combinations is high. - [Info]
processTimerElapsedMs()— Clean abstraction. Avoids repeatedTimeUnit.MILLISECONDSimport and call pattern across multiple processors.
Suggestions
- Add a comment documenting the key bit layout and the assumption that response codes fit in 15 bits.
- Consider using
CaffeineorGuavacache with a max size if the cardinality concern is real.
Automated review by github-manager-bot
Which Issue(s) This PR Fixes
Fixes #10537
Brief Description
Cache RPC
Attributesobjects inRemotingMetricsManagerto avoid per-RPCAttributes.builder().build()allocation.getOrBuildAttributes(reqCode, respCode, isLongPolling, result)with a long-key cache (35-bit packed key, no String allocation). Falls back to direct build for unknownresultvalues.getOrBuildAttributes()instead of inlineattributesBuilder.build()at each metrics recording point.log.isDebugEnabled()guard for debug log.How Did You Test This Change
Server benchmark: 8C30G broker, 1KB msg, 128 queues, sync mode, 256 threads. CI checkstyle + compilation passed.