Skip to content

Fail fast on structurally malformed LLM kernel responses#138

Open
jiannanWang wants to merge 1 commit into
mainfrom
jiannanWang/fail-fast-malformed-kernel
Open

Fail fast on structurally malformed LLM kernel responses#138
jiannanWang wants to merge 1 commit into
mainfrom
jiannanWang/fail-fast-malformed-kernel

Conversation

@jiannanWang
Copy link
Copy Markdown
Contributor

Problem

When the LLM returns a response that fails Python syntax or omits the required kernel_function definition, the worker still writes the file, runs the test subprocess (~5s), gets a syntax error, then re-prompts the LLM up to 3 more times — burning ~20s and 3 extra LLM API calls per dead candidate.

This produces a long iteration tail: each kernel-authoring round can stall for the entire 3-round retry budget waiting on a doomed candidate. At low fanout it's annoying; at multi-LLM × multi-bottleneck × samples > 1 the dead-candidate rate is non-trivial and the wasted budget scales linearly with worker count.

A typical trigger is an LLM response truncated mid–code-block.

Fix

Add an ast.parse precheck (_validate_kernel_candidate) that short-circuits with a clear error message when:

  • the extracted text is empty / whitespace-only,
  • Python parse fails (line number + parser message included), or
  • no top-level kernel_function is defined.

Hooked in at two points:

  • Inside _refine_kernel, so refinement aborts immediately when the LLM produces unparseable text.
  • At the entry of verify / _refine_until_pass, so a malformed initial kernel doesn't consume the 3-round retry budget.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 26, 2026
When the LLM returns code that fails Python syntax or omits the
required ``kernel_function`` definition, the worker would still write
the file, run the test subprocess (~5s), get a syntax error, then
re-prompt the LLM up to 3 more times for a total of ~20s wasted per
dead candidate.

Add an ``ast.parse`` precheck (``_validate_kernel_candidate``) that
short-circuits with a clear error message when:

  * the extracted text is empty / whitespace-only,
  * Python parse fails (with line / message), or
  * no top-level ``kernel_function`` is defined.

Hook the validator into both ``_refine_kernel`` (so refinement aborts
immediately when the LLM produces unparseable text) and the entry
points of ``verify`` / ``_refine_until_pass`` (so a malformed initial
kernel doesn't get a 3-round retry budget).

Behavior preserved for valid kernels: every existing happy path is
unchanged. Saves ~20s per dead candidate, which scales linearly with
fanout — meaningful at multi-LLM × multi-bottleneck × samples > 1
where dead-candidate rate is non-trivial.

Test plan:
- Unit test asserts a ``def kernel_function(...)``-less candidate
  returns a clear malformed-reason instead of looping.
- Existing ``pytest tests/`` suite passes.
@jiannanWang jiannanWang force-pushed the jiannanWang/fail-fast-malformed-kernel branch from ee9a530 to d7591ed Compare May 26, 2026 16:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant