Skip to content

[ISSUE #10052] Add capacity check in PopBufferMergeService.addCkJustOffset to prevent OOM#10567

Open
wang-jiahua wants to merge 1 commit into
apache:developfrom
wang-jiahua:fix/pop-buffer-merge-service-oom
Open

[ISSUE #10052] Add capacity check in PopBufferMergeService.addCkJustOffset to prevent OOM#10567
wang-jiahua wants to merge 1 commit into
apache:developfrom
wang-jiahua:fix/pop-buffer-merge-service-oom

Conversation

@wang-jiahua

Copy link
Copy Markdown
Contributor

Which Issue(s) This PR Fixes

Fixes #10052

Brief Description

addCk() already guards against buffer overflow with popCkMaxBufferSize, but addCkJustOffset() did not have this check. Under high-concurrency POP consumption with large messages, the buffer ConcurrentHashMap could grow without bound, eventually causing OOM in PopBufferMergeService.

Fix: Add the same counter > popCkMaxBufferSize guard at the beginning of addCkJustOffset(), consistent with addCk().

How Did You Test This Change

Existing PopBufferMergeServiceTest (4 tests) all pass.

Copilot AI review requested due to automatic review settings July 1, 2026 03:59

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a POP buffer growth/OOM risk by adding a popCkMaxBufferSize capacity guard to PopBufferMergeService.addCkJustOffset(), aligning it with the existing protection already present in addCk().

Changes:

  • Add a max-buffer-size check at the start of addCkJustOffset() to prevent unbounded buffer growth under high concurrency.
  • Emit a warning and fail fast when the buffer is over the configured threshold.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@codecov-commenter

codecov-commenter commented Jul 1, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.18%. Comparing base (88709c5) to head (2279a41).
⚠️ Report is 2 commits behind head on develop.

Additional details and impacted files
@@              Coverage Diff              @@
##             develop   #10567      +/-   ##
=============================================
- Coverage      48.28%   48.18%   -0.11%     
+ Complexity     13445    13412      -33     
=============================================
  Files           1378     1378              
  Lines         100845   100820      -25     
  Branches       13044    13040       -4     
=============================================
- Hits           48695    48576     -119     
- Misses         46201    46273      +72     
- Partials        5949     5971      +22     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot

Summary

Adds the same counter > popCkMaxBufferSize guard to addCkJustOffset() that already exists in addCk(), preventing unbounded growth of the buffer ConcurrentHashMap under high-concurrency POP consumption with large messages.

Findings

  • [Info] PopBufferMergeService.java:448 — The guard is consistent with the existing check in addCk(). Returns false on overflow, matching the method contract.
  • [Info] The warning log includes the checkpoint and current counter value, which is helpful for capacity tuning.

Suggestions

  • Consider adding a unit test that verifies addCkJustOffset() returns false when the buffer is at capacity, mirroring any existing test for addCk().
  • The two methods (addCk and addCkJustOffset) now share the same guard pattern. If more methods are added in the future, extracting a shared checkCapacity() helper would reduce duplication.

Overall: clean, minimal fix for a real OOM risk. LGTM.


Automated review by github-manager-bot

…kJustOffset to prevent OOM

addCk() already guards against buffer overflow with popCkMaxBufferSize,
but addCkJustOffset() did not have this check. Under high-concurrency
POP consumption with large messages, the buffer ConcurrentHashMap could
grow without bound, eventually causing OOM in PopBufferMergeService.

Fix: Add the same counter > popCkMaxBufferSize guard at the beginning
of addCkJustOffset(), consistent with addCk().
@wang-jiahua wang-jiahua force-pushed the fix/pop-buffer-merge-service-oom branch from ac5cd86 to 2279a41 Compare July 1, 2026 05:52
@qianye1001

Copy link
Copy Markdown
Contributor

走 popkv 了,旧实现的 pop 已经被事实上废弃了,不需要修改

@wang-jiahua

Copy link
Copy Markdown
Contributor Author

Thanks for the information @qianye1001! If the old POP implementation is being deprecated in favor of PopKV, then this fix is not needed.

Should I close this PR?

@RockteMQ-AI

Copy link
Copy Markdown
Contributor

Review by github-manager-bot (Re-review after new commit)

Summary

Re-reviewed after commit 2279a41 (2026-07-01T05:52Z). The change adds a counter > popCkMaxBufferSize guard to addCkJustOffset(), mirroring the existing check in addCk().

Assessment: ✅ Looks Good

  • Correctness — The guard is consistent with addCk(). Returns false on overflow, matching callers that already handle this case.
  • Thread safetycounter.get() is an AtomicLong read; popCkMaxBufferSize is a config read. No new race conditions introduced. The check is best-effort (not strictly bounding), which is acceptable for a buffer overflow guard.
  • Logging — Warning log includes the checkpoint and current counter value, useful for debugging.
  • TesttestAddCkJustOffsetMaxSize correctly verifies boundary: first add succeeds (0 > 0 = false), second is rejected (1 > 0 = true).

No new issues found. The fix is minimal, targeted, and well-tested.

@RockteMQ-AI RockteMQ-AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by github-manager-bot (Re-review after new commit)

Changes since last review

  • PopBufferMergeServiceTest.java — Added testAddCkJustOffsetMaxSize() test case. This addresses the previous suggestion. ✅

Test Analysis

The new test properly validates the capacity guard:

  • Sets popCkMaxBufferSize to 0 via mock
  • First addCkJustOffset() succeeds (counter=0, 0 > 0 is false), counter increments to 1
  • Second addCkJustOffset() is rejected (counter=1, 1 > 0 is true)
  • Covers both the success and rejection paths

Verdict

The suggested test has been added and properly validates the capacity check. Code and tests are clean. LGTM 👍


Automated review by github-manager-bot

@RongtongJin RongtongJin left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@qianye1001

Copy link
Copy Markdown
Contributor

建议 close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] Default runbroker.sh causes OOM in PopBufferMergeService (26G heap, no backlog)

6 participants