adapter/sqs: AdminPeekQueue backend (Phase 3)#794
Conversation
Phase 3 of the Admin Queue Peek and Purge design
(docs/design/2026_05_16_proposed_admin_purge_queue.md). Implements
the read-only counterpart to AdminPurgeQueue:
AdminPeekQueue(ctx, principal, name, opts)
([]AdminPeekedMessage, nextCursor string, error)
Peek walks the leader's visibility index at the next read timestamp,
returns up to opts.Limit currently-visible messages (visible_at
less than or equal to now), and bumps NO receive counts. No receipt
handle is minted; no visibility timer is started. Pure read.
New surface:
- AdminPeekedAttribute mirrors the typed SQS MessageAttribute
shape so binary payloads and the DataType discriminator survive
the SPA round-trip (Codex r11 on the design doc).
- AdminPeekedMessage carries body (truncated per BodyMaxBytes),
BodyTruncated, BodyOriginalSize, SentTimestamp, ReceiveCount,
GroupID, DeduplicationID, and the typed attribute map.
- AdminPeekMessageOptions{Limit, Cursor, BodyMaxBytes} with
documented defaults (20, "", 4096) and bounds
([1, 100], 512-byte cursor cap, [256, 262144] body cap).
- peekCursor (versioned JSON, base64url-wrapped) so the SPA can
page across partition boundaries deterministically. Cursor
pins Generation so a purge between pages forces the SPA to
refresh from the front (ErrAdminSQSValidation rather than
silently surfacing rows from a purged generation).
- AdminRole.canRead() - peek requires non-zero role; both
read-only and full principals satisfy the gate. Purge stays
on canWrite() so role-only principals can triage but cannot
drop the queue.
Partitioned FIFO support: peek uses rotated sequential scanning
over partitions, picking the start partition via the same
receiveFanoutCounters the receive path uses for fairness. The
cursor encodes both StartPartition (for the wrap detection) and
the current Partition + LastKey so subsequent pages continue
deterministically.
Caller audit (semantic-change rule): no existing function
signatures changed. canRead is a new method; AdminPeekQueue is a
new entrypoint. The sqsAttributeBaseType{String,Number} constant
extraction in sqs_messages.go fixes a pre-existing goconst
regression that the larger test surface tripped; the validator
switch now reads canonical constants instead of inline literals.
Tests (adapter/sqs_admin_peek_test.go) cover:
- happy path with body / MessageID / SentTimestamp assertions
- peek does NOT bump ApproximateReceiveCount across two calls
- body truncation reports BodyTruncated + BodyOriginalSize
- Limit clamping (0 to default 20, 500 to max 100)
- cursor round-trip across two pages, no overlap, complete
coverage
- stale-generation cursor after purge to ErrAdminSQSValidation
- oversize / bad-base64 / bad-JSON / wrong-version cursors all
map to ErrAdminSQSValidation
- read-only principal allowed; role-less denied
- missing queue to ErrAdminSQSNotFound
- empty / whitespace name to ErrAdminSQSValidation
- DelaySeconds > 0 hides the message until visible_at elapses
- FIFO MessageGroupId / MessageDeduplicationId both projected
- typed MessageAttribute (String + Binary) round-trips
- partitioned FIFO (PartitionCount=4): every sent message
surfaces exactly once across the union of paginated peek
pages, regardless of the rotated start partition
- cursor codec unit tests: round-trip, empty-cursor sentinel,
length cap, generation mismatch
- clampPeekLimit / clampPeekBodyBytes truth tables
- projectPeekedAttributes nil-safety and DataType preservation
- AdminRole.canRead() truth table
Out of scope (deferred to a follow-up):
- Throttle integration (bucketActionAdminPeek + dedicated
per-queue admin-peek bucket per design doc 3.1). Leader-only
+ Limit cap already bound the cost; the dedicated bucket adds
a steady-rate cap that should land alongside the SPA wiring
so the rate-limit metric has a real consumer.
- HTTP handler + bridge - Phase 4.
- principalForReadSensitive live RoleStore re-check - depends
on the wider RoleStore plumbing landing first (design doc
Goal 8, noted as a future direction).
|
@claude review |
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR implements a read-only admin API for peeking into SQS queues without modifying message state, complete with cursor-based pagination, authorization gating, and comprehensive test coverage spanning standard, FIFO, and partitioned queue types. ChangesAdmin SQS Peek API
Sequence DiagramsequenceDiagram
participant Client
participant AdminPeekQueue as AdminPeekQueue
participant Authorization as canRead()
participant QueueMeta as QueueMetadata
participant CursorCodec as Cursor
participant VisIndex as VisibilityIndex
participant DataRecords as DataRecords
participant Projection as Projection
Client->>AdminPeekQueue: AdminPeekQueue(ctx, principal, queue, opts)
AdminPeekQueue->>Authorization: canRead()
Authorization-->>AdminPeekQueue: permitted?
AdminPeekQueue->>QueueMeta: LoadQueueMetadata(MVCC snapshot)
QueueMeta-->>AdminPeekQueue: metadata, generation
AdminPeekQueue->>CursorCodec: DecodePeekCursor(opts.Cursor)
CursorCodec-->>AdminPeekQueue: cursor or error
AdminPeekQueue->>VisIndex: walkPeek(partitions, limit)
VisIndex->>VisIndex: ScanAt per partition
VisIndex->>DataRecords: LoadRecord for each key
DataRecords-->>VisIndex: message or skip
VisIndex->>Projection: Accumulate rows
Projection-->>VisIndex: []AdminPeekedMessage
VisIndex-->>AdminPeekQueue: rows, nextCursor, done
AdminPeekQueue->>CursorCodec: EncodePeekCursor(nextCursor)
CursorCodec-->>AdminPeekQueue: encoded string
AdminPeekQueue-->>Client: rows, continuationToken, error
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Claude finished @bootjp's task in 6m 20s —— View job Code Review
|
There was a problem hiding this comment.
Code Review
This pull request introduces the AdminPeekQueue functionality for SQS, allowing users to inspect messages without affecting their visibility or receive counts. The implementation includes a new canRead authorization gate, support for paginated results via base64url-encoded JSON cursors, and configurable message body truncation. Comprehensive tests have been added to verify the behavior across standard and partitioned FIFO queues, ensuring that the peek operation remains non-destructive and respects security roles. I have no feedback to provide as there were no review comments to assess.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f5838bddb7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if cursor.Generation != meta.Generation { | ||
| return nil, errors.Wrap(ErrAdminSQSValidation, | ||
| "admin peek: cursor is from a prior generation; restart from the front") | ||
| } | ||
| return cursor, nil |
There was a problem hiding this comment.
Reject out-of-range cursor partitions
preparePeekCursor validates only Generation, so a client can submit a cursor whose StartPartition is outside [0, meta.PartitionCount). In walkPeek, termination depends on nextPart == cursor.StartPartition; with an out-of-range start partition this condition is never true, so an empty/fully exhausted partitioned queue causes an unbounded loop of ScanAt calls and the request never returns. This is a request-amplification DoS against the admin endpoint and should be rejected as ErrAdminSQSValidation before scanning.
Useful? React with 👍 / 👎.
Codex r1 P1 on PR #794: walkPeek's partition-rotation termination condition is `(cursor.Partition + 1) % meta.PartitionCount == cursor.StartPartition`. With a client-supplied cursor whose StartPartition lies outside [0, PartitionCount, the modulo cycle never matches and the call loops ScanAt over guaranteed-empty partitions forever — a request-amplification DoS against the admin endpoint. preparePeekCursor now bounds-checks both StartPartition and Partition against max(meta.PartitionCount, 1) and rejects out-of-range values with ErrAdminSQSValidation. Non-partitioned queues (PartitionCount <= 1) require both fields to be 0. Caller audit: peek cursors are constructed in exactly two places — preparePeekCursor itself (now validated) and walkPeek's partition-advance (, which produces values in [0, PartitionCount) by construction). No other caller. Generation validation already handles the purge-between-pages case separately. Tests: regression test feeds a wire-level cursor with StartPartition=999 / Partition=999 against a PartitionCount=4 queue and asserts ErrAdminSQSValidation under a 5-second deadline so any future loss of the bounds check terminates rather than hanging CI. Four preparePeekCursor unit tests pin the truth table (partitioned out-of-range / non-partitioned non-zero / in-range accepted). EOF )
Codex r1 P1 on PR #794: walkPeek's partition-rotation termination condition is "(Partition + 1) mod PartitionCount == StartPartition". With a client-supplied cursor whose StartPartition lies outside [0, PartitionCount), the modulo cycle never matches and the call loops ScanAt over guaranteed-empty partitions forever - a request-amplification DoS against the admin endpoint. preparePeekCursor now bounds-checks both StartPartition and Partition against max(meta.PartitionCount, 1) and rejects out-of-range values with ErrAdminSQSValidation. Non-partitioned queues (PartitionCount <= 1) require both fields to be 0. Caller audit: peek cursors are constructed in exactly two places - preparePeekCursor itself (now validated) and walkPeek's partition-advance ((Partition+1) mod PartitionCount), which produces values in [0, PartitionCount) by construction. No other caller. Generation validation already handles the purge-between- pages case separately. Tests: regression test feeds a wire-level cursor with StartPartition=999 / Partition=999 against a PartitionCount=4 queue and asserts ErrAdminSQSValidation under a 5-second deadline so any future loss of the bounds check terminates rather than hanging CI. Four preparePeekCursor unit tests pin the truth table (partitioned out-of-range / non-partitioned non-zero / in-range accepted).
ed24325 to
9bfaacf
Compare
|
@claude review Addressed Codex r1 P1 in 9bfaacf:
|
|
Claude finished @bootjp's task in 1m 39s —— View job Code Review
|
| Sub-case | Cursor input | Meta | Expected |
|---|---|---|---|
StartPartition >= PartitionCount |
SP=99, P=0 | PC=4 | ErrAdminSQSValidation |
Partition >= PartitionCount |
SP=0, P=99 | PC=4 | ErrAdminSQSValidation |
| Non-partitioned: non-zero StartPartition | SP=1, P=0 | PC=0 | ErrAdminSQSValidation |
| Non-partitioned: non-zero Partition | SP=0, P=5 | PC=1 | ErrAdminSQSValidation |
| In-range cursor accepted | SP=3, P=2 | PC=4 | nil (accepted) |
All four rejection paths are structurally distinct — the acceptance case verifies valid cursors aren't blocked.
TestAdminPeekQueue_HostileCursorBoundedRequest (lines 834-862): End-to-end regression with a 5-second context.WithTimeout. The hostile cursor (StartPartition=999, Partition=999) is wire-encoded and passed through the full AdminPeekQueue stack. The deadline is the key safety net: if the bounds check is ever removed, the test times out (context deadline exceeded) rather than returning ErrAdminSQSValidation, making the regression immediately visible in CI rather than hanging the suite.
One minor observation: the hostile cursor uses Generation: 1, but the queue's actual generation may differ (depending on what installPartitionedMetaForTest initializes). If they differ, the generation check fires before the bounds check. The test still passes because ErrAdminSQSValidation is returned either way — it just may not exercise the partition bounds path end-to-end when the generations don't match. This is acceptable because TestPreparePeekCursor_PartitionOutOfRange pins the bounds check in isolation, and the integration test pins the "hostile cursor → 400" property regardless of which guard fires.
preparePeekCursor docstring
The rationale comment (lines 258-266) is exactly the right level of detail — it names the attack vector, explains why the termination condition fails, and cites the PR and issue. Future readers won't be tempted to "simplify" the check.
Verdict
The fix closes the DoS vector correctly. The unit-level truth table + end-to-end deadline regression is the right pairing. No additional changes needed for this fix.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9bfaacfa4f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if !exists { | ||
| return nil, "", ErrAdminSQSNotFound | ||
| } | ||
| cursor, err = preparePeekCursor(cursor, meta, s.nextReceiveFanoutStart(name, meta.PartitionCount)) |
There was a problem hiding this comment.
Avoid advancing fanout rotation on peek cursor resumes
AdminPeekQueue always calls nextReceiveFanoutStart before preparePeekCursor, even when the request already carries a continuation cursor. Because this counter is shared with ReceiveMessage, every extra peek page advances the receive start partition and can reintroduce fixed-stride aliasing (e.g., 4 partitions and 3 peek pages between receives makes every receive start at the same partition). Under sustained backlog with small MaxNumberOfMessages, this can repeatedly prioritize the same partitions and starve others, so continuation requests should not mutate the shared receive-rotation counter.
Useful? React with 👍 / 👎.
Codex r2 P1 on PR #794: AdminPeekQueue called nextReceiveFanoutStart unconditionally - on every continuation page as well as the first. The counter is shared with ReceiveMessage's partition-rotation; bumping it on every peek page perturbs the stride ReceiveMessage reads for fairness. Under sustained backlog with small MaxNumberOfMessages, a fixed peek-to-receive interleave (e.g. PartitionCount=4 + 3 peek pages between receives) re-aligns every receive to the same partition, starving the others. Extract a peekStartPartition helper that returns 0 (no consultation needed) when either (a) the call carries a continuation cursor or (b) the queue is non-partitioned. The fanout counter advances only on the first page of a partitioned-queue peek walk. Caller audit: nextReceiveFanoutStart has two callers - the receive fanout (unchanged, still bumps per receive call) and now peekStartPartition (bumps only on first peek page). preparePeekCursor only USES startPartition when stamping a fresh cursor, so passing 0 on continuations is equivalent to passing the unused fanout value except it does not perturb the shared counter. Tests: TestAdminPeekQueue_ContinuationDoesNotBumpFanoutCounter exercises a partitioned FIFO queue, reads the counter value directly from receiveFanoutCounters after the first page, drives five continuation peek calls, and asserts the counter is unchanged. Regression of this property reintroduces the receive starvation Codex flagged.
|
@claude review Addressed Codex r2 P1 in 3e6841e:
|
|
Claude finished @bootjp's task in 1m 7s —— View job Code ReviewReviewing the r2 P1 fix (fanout-stride aliasing on continuation pages) from commit 3e6841e.
r2 P1 Fix: Fanout counter isolation on continuation pagesThe fix is correct and complete. What changed
func (s *SQSServer) peekStartPartition(queueName string, cursor *peekCursor, meta *sqsQueueMeta) uint32 {
if cursor != nil || meta.PartitionCount <= 1 {
return 0
}
return s.nextReceiveFanoutStart(queueName, meta.PartitionCount)
}Call site at line 236 — the decoded cursor (nil on first page, non-nil on continuation) is passed directly to Correctness analysisThree cases to verify:
The returned Test coverage
The choice of 5 continuation calls (more than PartitionCount=4) is intentional — any aliasing from a per-call bump would manifest as a multiple-of-4 drift, making a regression unambiguous. One minor observation: the 5 continuation calls may exhaust the queue (6 messages / Limit=2 = 3 pages), so loops SummaryThe fix is minimal, targeted, and well-reasoned. The No further changes needed for this fix. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@adapter/sqs_admin_peek_test.go`:
- Around line 49-51: The current loop uses assertPeekRowMatchesIndexedBody(t, i,
row, "body-") which couples test to ordering; instead, extract the actual row
bodies from rows (e.g., map or slice of strings using the same accessor used in
assertPeekRowMatchesIndexedBody), build a set of actual bodies, and assert the
set equals the expected set {"body-0","body-1","body-2"} (or that it contains
each expected value) so order does not matter; apply the same change to the
other assertion block that currently iterates with index-based checks.
- Around line 870-872: The code does a type assertion counter, _ :=
v.(*atomic.Uint32) then immediately calls counter.Load(), which will panic if
the assertion failed or returned nil; update the test to check the assertion
success and non-nil before dereferencing: perform counter, ok :=
v.(*atomic.Uint32) and if !ok || counter == nil call the test failure helper
(e.g., t.Fatalf or require.Failf) with a clear message referencing v and the
expected *atomic.Uint32, otherwise call counter.Load(); update the usage sites
in the test that reference counter.Load() accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fbf13a1e-f945-42fc-ba13-fb7921795dea
📒 Files selected for processing (4)
adapter/dynamodb_admin.goadapter/sqs_admin_peek.goadapter/sqs_admin_peek_test.goadapter/sqs_messages.go
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3e6841e0c5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if cursor != nil || meta.PartitionCount <= 1 { | ||
| return 0 | ||
| } | ||
| return s.nextReceiveFanoutStart(queueName, meta.PartitionCount) |
There was a problem hiding this comment.
Use effective partition count for admin peek fanout
peekStartPartition keys fanout on meta.PartitionCount directly, but per-queue FIFO mode (FifoThroughputLimit=perQueue) intentionally collapses all traffic to partition 0 (see partitionFor / effectivePartitionCount). In that configuration, AdminPeekQueue still rotates/scans as if all partitions are live, so each peek can perform up to N-1 guaranteed-empty ScanAt calls (e.g., 31 extra scans at PartitionCount=32) and needlessly loads the leader. Aligning peek with effectivePartitionCount(meta) would preserve behavior while avoiding this avoidable read amplification.
Useful? React with 👍 / 👎.
Three r3 findings on PR #794: 1. Codex P2 (sqs_admin_peek.go:268). peekStartPartition and walkPeek's partition advance / termination keyed off meta.PartitionCount directly. In perQueue throughput mode (FifoThroughputLimit=perQueue) the data plane collapses every MessageGroupId to partition 0 (see effectivePartitionCount in sqs_keys_dispatch.go); peek still rotated over all N partitions, wasting up to N-1 guaranteed-empty ScanAt calls per peek (31 extra scans at PartitionCount=32). Switch to effectivePartitionCount(meta) in peekStartPartition, walkPeek's partition advance, and preparePeekCursor's bounds check. Caller audit: effectivePartitionCount is the same helper receiveMessage uses for its fanout loop (sqs_messages.go:885), so peek and receive now agree on partition usage. 2. CodeRabbit minor (test order-coupling). HappyPath asserted row ordering matched send order, but vis-index entries with the same visible_at millisecond tie-break on message_id (random hex) — a timing-sensitive flake. Replace ordered loop with set-based assertion via assertPeekRowsAsSet. 3. CodeRabbit minor (counter type assertion panic). The fanout counter test used `counter, _ := v.(*atomic.Uint32)` then counter.Load(), which panics if the assertion fails. Extract loadFanoutCounter helper that fails the test explicitly on bad type / nil rather than panicking. Tests: TestAdminPeekQueue_PerQueueFIFOCollapsesToOnePartition pins the perQueue functional contract end-to-end. TestPreparePeekCursor_PerQueueCollapse pins the cursor codec's effective-bounds validation (a Partition=2 cursor against a perQueue PartitionCount=4 queue is rejected — even though the raw PartitionCount would otherwise allow it).
|
@claude review Addressed r3 in fc35ee5:
|
|
Claude finished @bootjp's task in 2m 17s —— View job Code ReviewReviewing r3 changes from commit
r3 Fix: Align peek with
|
| Sub-case | effective | cursor | Result |
|---|---|---|---|
| non-partitioned, SP=1 | 1 | SP=1 >= 1 | rejected ✓ |
| non-partitioned, P=5 | 1 | P=5 >= 1 | rejected ✓ |
| partitioned, SP=99 | 4 | SP=99 >= 4 | rejected ✓ |
| partitioned, in-range | 4 | SP=3, P=2 | accepted ✓ |
What's clean
- The three-site change (peekStartPartition / preparePeekCursor / walkPeek) is consistent — all three now agree on
effectivePartitionCountas the authoritative bound. A partial fix (e.g., only updating walkPeek's modulo but not preparePeekCursor's validation) would have left an exploitable gap. - The
TestPreparePeekCursor_PerQueueCollapse"fresh cursor" sub-case demonstrates that the caller-suppliedstartPartitionis safely ignored for perQueue queues — the partition is pinned to 0 by theeffective > 1guard, not by trusting the caller to always pass 0. - Commit message explicitly lists all three findings and their sources (Codex P2, two CodeRabbit minors), making the PR history self-documenting.
No further changes needed. This PR is in good shape — all three rounds of review findings have been addressed.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
adapter/sqs_admin_peek.go (1)
35-45:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd JSON tags to
AdminPeekedMessage.This is the wire model, but it currently marshals as
MessageID,BodyTruncated, etc., andAttributeswill serialize asnullinstead of being omitted. That breaks the intended JSON shape and makes theprojectPeekedAttributesomitemptycomment ineffective.Suggested fix
type AdminPeekedMessage struct { - MessageID string - Body string // truncated per opts.BodyMaxBytes - BodyTruncated bool // true when Body was cut - BodyOriginalSize int64 // bytes in the original body, for display - SentTimestamp time.Time // SQS SentTimestamp - ReceiveCount int32 // ApproximateReceiveCount - GroupID string // FIFO MessageGroupId, empty for standard - DeduplicationID string // FIFO MessageDeduplicationId, empty for standard - Attributes map[string]AdminPeekedAttribute // typed SQS message attributes + MessageID string `json:"message_id"` + Body string `json:"body"` // truncated per opts.BodyMaxBytes + BodyTruncated bool `json:"body_truncated"` + BodyOriginalSize int64 `json:"body_original_size"` + SentTimestamp time.Time `json:"sent_timestamp"` + ReceiveCount int32 `json:"receive_count"` + GroupID string `json:"group_id,omitempty"` + DeduplicationID string `json:"deduplication_id,omitempty"` + Attributes map[string]AdminPeekedAttribute `json:"attributes,omitempty"` }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@adapter/sqs_admin_peek.go` around lines 35 - 45, The AdminPeekedMessage struct is missing JSON tags, causing fields like MessageID and Attributes to serialize incorrectly; update the struct (AdminPeekedMessage) to add explicit `json:"..."` tags for each field (e.g., MessageID, Body, BodyTruncated, BodyOriginalSize, SentTimestamp, ReceiveCount, GroupID, DeduplicationID) and mark optional fields with `omitempty` (notably Attributes should be `json:"attributes,omitempty"` so it is omitted when nil); ensure naming uses lowerCamelCase or the desired wire keys to match the intended JSON shape and keep types and comments unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@adapter/sqs_admin_peek.go`:
- Around line 35-45: The AdminPeekedMessage struct is missing JSON tags, causing
fields like MessageID and Attributes to serialize incorrectly; update the struct
(AdminPeekedMessage) to add explicit `json:"..."` tags for each field (e.g.,
MessageID, Body, BodyTruncated, BodyOriginalSize, SentTimestamp, ReceiveCount,
GroupID, DeduplicationID) and mark optional fields with `omitempty` (notably
Attributes should be `json:"attributes,omitempty"` so it is omitted when nil);
ensure naming uses lowerCamelCase or the desired wire keys to match the intended
JSON shape and keep types and comments unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 450a6342-93cc-4022-9072-b744f0a1df17
📒 Files selected for processing (2)
adapter/sqs_admin_peek.goadapter/sqs_admin_peek_test.go
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fc35ee5174
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if len(lastKey) > 0 { | ||
| start = nextScanCursorAfter(lastKey) | ||
| if end != nil && bytes.Compare(start, end) > 0 { |
There was a problem hiding this comment.
Validate cursor LastKey stays in the queue scan range
A continuation cursor with a valid gen/partition but a forged k can force scans to start outside this queue’s visibility-index prefix, because lastKey is trusted directly as the next start key and only checked against the upper bound. In that case ScanAt can walk unrelated key ranges and trigger up to Limit extra GetAt misses per call, so repeated requests become a read-amplification path against the leader. The cursor should be rejected as ErrAdminSQSValidation (or start clamped) when lastKey is below the partition’s visibility start prefix.
Useful? React with 👍 / 👎.
Two r4 findings on PR #794: 1. Codex P2 (sqs_admin_peek.go:401). preparePeekCursor only validated Generation and partition indices; cursor.LastKey was trusted directly as the ScanAt start key, checked only against the upper bound. A forged LastKey below the queue's visibility-index prefix would let ScanAt walk unrelated key ranges, triggering up to Limit extra GetAt misses per call - read amplification against the leader. preparePeekCursor now rejects with ErrAdminSQSValidation when cursor.LastKey is not prefixed by sqsMsgVisPrefixForQueueDispatch(meta, queueName, Partition, Generation). The check catches three classes of forgery: completely arbitrary bytes (e.g. "aaaa"), a valid LastKey from a different queue (prefix encodes queue name), and a valid LastKey from a different partition of the same queue. preparePeekCursor's signature gains queueName; the only existing caller (AdminPeekQueue) and the existing unit tests were updated. 2. CodeRabbit Major (sqs_admin_peek.go:35-45, outside diff). AdminPeekedMessage lacked JSON tags, so json.Marshal emitted Go-style PascalCase ("MessageID", "BodyTruncated", ...) instead of the snake_case wire shape the design doc §3.5 specifies (the SPA's client adapter expects the spec'd form). Also empty Attributes / GroupID / DeduplicationID would serialize as "null" / "" instead of being omitted. JSON tags added with appropriate omitempty. Caller audit (semantic-change rule): preparePeekCursor signature changed from (cursor, meta, startPartition) to (cursor, meta, queueName, startPartition). One non-test caller (AdminPeekQueue at sqs_admin_peek.go:240) and 10 test call sites updated. The new queueName parameter is purely additive validation; it does not alter the cursor's stamping or generation-mismatch behavior. Tests: TestAdminPeekedMessage_JSONWireFormat / TestAdminPeekedMessage_JSONOmitsEmptyAttributes pin the JSON wire shape. TestPreparePeekCursor_PartitionOutOfRange gains three new sub-cases (forged LastKey, foreign-queue LastKey, mismatched- partition LastKey). The test function was rewritten as a table-driven loop so it stays under the cyclop budget.
|
@claude review Addressed r4 in ddb4e6e:
|
|
Claude finished @bootjp's task in 2m 37s —— View job Code ReviewReviewing r4 changes from commit
r4: LastKey prefix validation + JSON tagsBoth fixes are correct and complete. Detailed assessment below. 1. Codex P2 —
|
| Sub-case | Attack vector |
|---|---|
[]byte("aaaa") |
Arbitrary bytes — not a valid vis-index prefix at all |
foreignKey (queue-X prefix) |
Valid vis-index key but for a different queue |
mismatchedKey (queue-A, partition 2, but cursor.Partition=0) |
Correct queue, wrong partition |
These cover the full attack surface: arbitrary, wrong-queue, and wrong-partition forgeries. The foreignKey and mismatchedKey are constructed with append(sqsMsgVisPrefixForQueueDispatch(...), 'X'), which correctly creates a key that is inside the foreign prefix but fails the expected-queue prefix check.
Minor observation — no unit test for a valid LastKey on the acceptance path. The existing TestPreparePeekCursor_PartitionOutOfRange acceptance sub-case passes cursor.LastKey = nil, so it doesn't exercise bytes.HasPrefix returning true. The integration tests (TestAdminPeekQueue_CursorRoundTrip, TestAdminPeekQueue_PartitionedFIFO_Pagination, TestAdminPeekQueue_ContinuationDoesNotBumpFanoutCounter) all use real continuation cursors from walkPeekPartition, so the acceptance path is covered end-to-end. Not blocking — the integration coverage is sufficient.
2. CodeRabbit Major — AdminPeekedMessage JSON tags
adapter/sqs_admin_peek.go:39-49
type AdminPeekedMessage struct {
MessageID string `json:"message_id"`
Body string `json:"body"`
BodyTruncated bool `json:"body_truncated"`
BodyOriginalSize int64 `json:"body_original_size"`
SentTimestamp time.Time `json:"sent_timestamp"`
ReceiveCount int32 `json:"receive_count"`
GroupID string `json:"group_id,omitempty"`
DeduplicationID string `json:"deduplication_id,omitempty"`
Attributes map[string]AdminPeekedAttribute `json:"attributes,omitempty"`
}omitempty placement is correct:
BodyTruncated bool— noomitempty: always emits"body_truncated": falsewhen not truncated. Correct — the SPA needs the explicit boolean rather than inferring absence as false.BodyOriginalSize int64— noomitempty: always emits, including0. Correct — the SPA should always display the original size, even for an empty-body message (edge case but well-defined).ReceiveCount int32— noomitempty: a count of0(never received) is meaningful, not an absence.GroupID,DeduplicationID,Attributes— withomitempty: empty strings and nil maps are correctly suppressed for standard queues.Attributesuses a map type, soomitemptydrops it when nil (whichprojectPeekedAttributesreturns for an empty input) rather than emitting"attributes": {}or"attributes": null.
TestAdminPeekedMessage_JSONWireFormat (line 735) is the right kind of wire-contract test: it checks for the presence of snake_case keys AND explicitly asserts the absence of PascalCase keys. The nested AdminPeekedAttribute snake_case check (line 774-779) pins the attribute shape in the same test, avoiding a separate fixture.
TestAdminPeekedMessage_JSONOmitsEmptyAttributes (line 786) correctly tests the three omitempty fields together. The assertion checks for absence by key, not for nil/zero, which is the right level — it matches what the SPA actually receives.
What's clean across both fixes
preparePeekCursordocstring (lines 279-318): the addedqueueNameparagraph at lines 311-318 explains both why the field was added and what attack it closes, with a PR reference. This is exactly the right level of detail for a security-motivated parameter addition.- Comment-in-code audit trail: both the
AdminPeekedMessagestruct comment and thepreparePeekCursordocstring cite the reviewer/round that caught the issue (CodeRabbit r4, Codex r4 P2). Saves future reviewers from having to dig through git log. - No behaviour change to the receive path:
preparePeekCursoris only called fromAdminPeekQueue; the newqueueNameparameter doesn't touch the data plane.
Verdict
All four rounds of review findings have been addressed. The PR is in good shape for merge. No further changes needed.

Summary
Phase 3 of
docs/design/2026_05_16_proposed_admin_purge_queue.md: backendAdminPeekQueuemethod on*adapter.SQSServer.opts.Limitcurrently-visible messages (visible_at <= now); pure read, no receive-count bump, no receipt-handle minting, no visibility-timer change.receiveFanoutCountersfor fairness.Generation,StartPartition,Partition, andLastKey; stale-generation cursors are rejected so a purge between pages forces the SPA to refresh from the front.AdminRole.canRead()— peek requires non-zero role; read-only principals can triage, purge stays oncanWrite().goconstregression insqs_messages.go(the validator switch's inline"String"/"Number"literals) fixed by extractingsqsAttributeBaseTypeString/sqsAttributeBaseTypeNumberalongside the existingsqsAttributeBaseTypeBinary.Caller audit (semantic-change rule): no existing function signatures changed.
canReadis a new method;AdminPeekQueueis a new entrypoint. The validator switch's constant extraction is a literal-for-named rename (byte-identical wire behaviour).Behavior change
AdminPeekQueueadmin method. Not yet wired to an HTTP handler (Phase 4) — this PR only adds the adapter surface.AdminPurgeQueue, orAdminDescribeQueue.Risk
Low. Pure-read entry; the only writes I touch are the constant rename in
sqs_messages.go(byte-identical to the prior inline literals). The cursor codec is hard-capped at 512 bytes after encoding; oversize cursors surface asErrAdminSQSValidationfrom the decoder.Self-review (5 passes)
sqs_messages.gois a literal-for-named rename with no behaviour change.isVerifiedSQSLeader. SinglenextTxnReadTSthreaded intoloadQueueMetaAt,ScanAt, and per-rowGetAt, so the page is a consistent MVCC snapshot. The cursor's stored Generation is checked against the live meta on every continuation page — a purge between pages producesErrAdminSQSValidationrather than rows from a purged generation.O(opts.Limit)ScanAtround-trips + per-rowGetAt. Hard caps:Limit <= 100, body <= 256 KiB. Cursor is bounded at 512 bytes. No hot-path allocations.Generation; mismatch surfaces asErrAdminSQSValidation. Partition rotation uses the samereceiveFanoutCountersthe receive path uses, so peek's partition fairness matches the data plane's. Data records are read viasqsMsgDataKeyDispatch, so partitioned FIFO data keys are routed to the same partition the vis-index entry was found under.go test -raceclean.Test plan
go test -race -count=1 ./adapter/...(targeted:TestAdminPeekQueue*,TestPeekCursorCodec*,TestClampPeek*,TestPreparePeekCursor*,TestEncodePeekCursor*,TestProjectPeekedAttributes*,TestAdminRole_CanRead,TestAdminPurge*,TestAdminQueueSummary*,TestSQSServer_PartitionedFIFO*,TestSQSServer_Send*)make lintclean on./adapter/...(0 issues)Out of scope (follow-ups)
bucketActionAdminPeek+resolveActionConfigexplicit case +*adminPeekThrottledError+*PeekThrottledError). Leader-only +Limit <= 100already bound the steady-state cost; the dedicated bucket adds a per-queue cap that should land alongside the SPA wiring so the metric has a real consumer.GET /admin/api/v1/sqs/queues/{name}/messages, bridge inmain_admin.go, integration tests.principalForReadSensitiveliveRoleStorere-check (design doc Goal 8): blocked on the widerRoleStoreplumbing — neitherAdminPurgeQueuenorAdminDeleteQueuedo a live re-check today; peek will inherit that pattern when the broader work lands.Summary by CodeRabbit