Skip to content

v3.2.2#299

Open
jkool702 wants to merge 15 commits into
mainfrom
NEW
Open

v3.2.2#299
jkool702 wants to merge 15 commits into
mainfrom
NEW

Conversation

@jkool702
Copy link
Copy Markdown
Owner

No description provided.

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @jkool702, your pull request is larger than the review limit of 150000 diff characters

@jkool702
Copy link
Copy Markdown
Owner Author

/gemini summary

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the lock-free batch claiming mechanism in forkrun_ring.c by splitting ring_claim_main into a side-effect-free helper do_lockfree_claim and a wrapper handling Bash variable bindings. It also introduces speculative claim limits, improves escrow draining, updates tokenization to pass offsets explicitly, and adds several C plugin unit tests. A critical issue was identified in the review where retry and poison metadata variables (RING_NUM_KILLS and RING_POISONED) are not reset for fresh batches, leading to state leakage from previous poisoned batches.

Comment thread forkrun_ring.c
Comment on lines +4422 to 4440
if (batch.num_kills > 0) {
snprintf(buf, sizeof(buf), "%u", batch.num_kills);
bind_variable("RING_NUM_KILLS", buf, 0);

if (local_state->numa_enabled) {
worker_last_major = local_state->major_ring[my_read_idx & RING_MASK];
worker_last_minor =
local_state->minor_ring[my_read_idx & RING_MASK] & ~FLAG_MAJOR_EOF;
u64toa(batch.idx, buf);
bind_variable("RING_BATCH_IDX", buf, 0);

if (local_state->minor_ring[(my_read_idx + claim_count - 1) & RING_MASK] &
FLAG_MAJOR_EOF) {
worker_last_minor |= FLAG_MAJOR_EOF;
int limit = 3; // Default to 3 retries
const char *s_lim = get_string_value("FORKRUN_RETRY_LIMIT");
if (s_lim) limit = atoi(s_lim);

// limit < 0 → infinite retries (never poison)
// limit >= 0 → poison when kill count reaches the limit
if (limit >= 0 && batch.num_kills >= (uint32_t)limit) {
bind_variable("RING_POISONED", "1", 0);
} else {
bind_variable("RING_POISONED", "0", 0);
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

For fresh batches (batch.num_kills == 0), the retry and poison metadata variables (RING_NUM_KILLS and RING_POISONED) are not updated and will retain their values from the previous batch. If a previous batch was poisoned (RING_POISONED="1"), a subsequent fresh batch that fails just once will be incorrectly treated as poisoned by the Bash layer, leading to premature aborts. Resetting these variables to "0" for fresh batches prevents this critical state leakage.

  if (batch.num_kills > 0) {
    snprintf(buf, sizeof(buf), "%u", batch.num_kills);
    bind_variable("RING_NUM_KILLS", buf, 0);

    u64toa(batch.idx, buf);
    bind_variable("RING_BATCH_IDX", buf, 0);

    int limit = 3; // Default to 3 retries
    const char *s_lim = get_string_value("FORKRUN_RETRY_LIMIT");
    if (s_lim) limit = atoi(s_lim);

    // limit < 0  → infinite retries (never poison)
    // limit >= 0 → poison when kill count reaches the limit
    if (limit >= 0 && batch.num_kills >= (uint32_t)limit) {
      bind_variable("RING_POISONED", "1", 0);
    } else {
      bind_variable("RING_POISONED", "0", 0);
    }
  } else {
    bind_variable("RING_NUM_KILLS", "0", 0);
    bind_variable("RING_POISONED", "0", 0);
  }

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 652e4b9062

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread forkrun_ring.c
// --- Export retry/poison metadata only for recycled (escrow) batches ---
// For fresh batches (num_kills == 0) these variables retain their previous
// values; the Bash layer only inspects them after a non-zero retry count.
if (batch.num_kills > 0) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Reset poison metadata for non-retry escrow claims

RING_NUM_KILLS/RING_POISONED are now only updated when batch.num_kills > 0, so recycled batches with num_kills == 0 keep whatever values were set by a previous claim. In a sequence where one recycled batch is poisoned (RING_POISONED=1) and the next recycled batch is healthy (num_kills=0), callers that inspect RING_POISONED on each claim can incorrectly skip valid work because the stale poison flag is never cleared here.

Useful? React with 👍 / 👎.

…25371208

🤖 Auto-Build: Update Ring Loadables
@sonarqubecloud
Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
1 Security Hotspot
76.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

Gemini encountered an error creating the summary. You can try again by commenting /gemini summary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant