fix(synthbench): refuse display-label model recs from --best-model-for (sy-kh3) by openclaw-dv · Pull Request #521 · DataViking-Tech/SynthPanel

openclaw-dv · 2026-05-23T20:10:59Z

Summary

SynthBench product/ensemble leaderboard rows can carry a human-readable
display label (e.g. "SynthPanel (Gemini Flash Lite)") in their model
field rather than a runnable provider model id. --best-model-for stamped
that label straight onto --model, deferring the failure to call time as
an opaque provider "not a valid model ID" error (#519).

Fix

- is_runnable_model_id(): new structural check rejecting whitespace / paren display labels while accepting bare openai-compat / local ids.
- recommend() now sets a runnable flag on Recommendation, and only adopts a config_id-derived base model when it's recognized (known provider prefix or registered alias), never a hash fragment from a hyphenated config_id split.
- _apply_best_model_for() refuses a non-runnable recommendation with an actionable message and falls back to the existing --model/default instead of stamping the label; --dry-run surfaces this upfront.
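A minimal sketch of the kind of structural check described above. Only the function name comes from this PR; the signature, the regular expression, and the sample ids are assumptions made for illustration, and the real check may be stricter.

```python
import re

# Assumption: a display label is recognised purely by shape, i.e. it contains
# whitespace or parentheses. The rule actually used in the PR may differ.
_LABEL_CHARS = re.compile(r"[\s()]")


def is_runnable_model_id(model: str) -> bool:
    """Return True when `model` looks like an id a provider could accept."""
    if not model:
        return False
    return _LABEL_CHARS.search(model) is None


if __name__ == "__main__":
    # The first candidate is the display label quoted in the summary; the
    # other two are made-up bare ids.
    for candidate in (
        "SynthPanel (Gemini Flash Lite)",
        "my-local-model",
        "vendor/model-v1",
    ):
        print(candidate, "->", is_runnable_model_id(candidate))
```

A shape-only check like this deliberately does not validate the id against any provider catalogue, which is what lets bare openai-compat and local ids through.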

Test plan

- New unit tests pin the runnable/non-runnable discriminator.
- The --best-model-for + --dry-run flow now exits cleanly on a display-label rec instead of producing a downstream provider error.
- GitHub CI runs the full suite on this PR.
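The refuse-and-fall-back behaviour that the second test item exercises can be sketched roughly as follows. The Recommendation fields, the function signature, the message wording, and the placeholder default id are all hypothetical; only the idea of a runnable flag and of falling back to the existing --model/default is taken from this PR.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Recommendation:
    # Hypothetical shape: the PR only states that a runnable flag exists.
    model: str
    runnable: bool


def apply_best_model_for(
    rec: Recommendation,
    current_model: Optional[str],
    default_model: str = "default-model",  # placeholder, not a real id
) -> Tuple[str, Optional[str]]:
    """Return (model_to_use, warning). Warning is None when rec is adopted."""
    fallback = current_model or default_model
    if not rec.runnable:
        warning = (
            f"recommended model {rec.model!r} is a display label, not a "
            f"runnable model id; keeping {fallback!r}"
        )
        return fallback, warning
    return rec.model, None


if __name__ == "__main__":
    label = Recommendation("SynthPanel (Gemini Flash Lite)", runnable=False)
    print(apply_best_model_for(label, current_model="my-local-model"))
```

Returning the warning alongside the chosen model, rather than raising, is one way a dry run could report the problem upfront while still exiting cleanly.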

References

Bead: sy-kh3
Closes --best-model-for should not pass SynthBench display names as provider model IDs #519
MR: sy-wisp-3kp
Polecat: garnet
Branch: polecat/garnet-mpiryq6k

…r (sy-kh3)

SynthBench product/ensemble leaderboard rows can carry a human-readable display label (e.g. "SynthPanel (Gemini Flash Lite)") in their model field rather than a runnable provider model id. --best-model-for stamped that label straight onto --model, deferring the failure to call time as an opaque provider "not a valid model ID" error (gh-519).

- Add is_runnable_model_id(): structural check rejecting whitespace/paren display labels while accepting bare openai-compat/local ids.
- recommend() now sets a runnable flag on Recommendation, and only adopts a config_id-derived base model when it's recognized (known provider prefix or registered alias), never a hash fragment from a hyphenated config_id split.
- _apply_best_model_for() refuses a non-runnable recommendation with an actionable message and falls back to the existing --model/default instead of stamping the label; --dry-run surfaces this upfront.

Closes #519.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Collapse multi-line calls that fit the project's line length so `ruff format --check` passes in CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cloudflare-workers-and-pages · 2026-05-23T20:25:01Z

Deploying synthpanel with Cloudflare Pages

Latest commit:	`0839f8e`
Status:	✅ Deploy successful!
Preview URL:	https://4e67da3e.synthpanel.pages.dev
Branch Preview URL:	https://polecat-garnet-mpiryq6k.synthpanel.pages.dev

openclaw-dv requested a review from the-data-viking as a code owner May 23, 2026 20:11

openclaw-dv and others added 2 commits May 23, 2026 15:24

style: ruff format test_best_model_for_cli.py (sy-kh3)

0839f8e

openclaw-dv force-pushed the polecat/garnet-mpiryq6k branch from df11df8 to 0839f8e Compare May 23, 2026 20:25

openclaw-dv added the semver:patch Bump patch version on merge label May 23, 2026

openclaw-dv merged commit 33b38d0 into main May 23, 2026
19 checks passed

openclaw-dv deleted the polecat/garnet-mpiryq6k branch May 23, 2026 20:59

This was referenced May 23, 2026

--best-model-for should not pass SynthBench display names as provider model IDs #519

Closed

fix(synthbench): substitute runnable model_id from leaderboard rows (sy-i7a) #529

Merged