cuopt-agent: multi-objective supply-vs-cost what-if + cost-cap eval by cafzal · Pull Request #157 · NVIDIA/cuopt-examples

cafzal · 2026-06-17T21:58:59Z

What

A new what-if scenario (scenario_4.md) and eval case (max_supply_4) for the cuopt-agent's max-supply model — bringing multi-objective tradeoff exploration to the agent.

Why

The agent ships a multi-period MILP whose cost data (item_costs.csv, resource_costs.csv) goes unused, and it only ever solves a single objective. The cuopt-multi-objective-exploration skill (NVIDIA/cuopt#1355) is available, but no scenario exercises it. scenario_4 activates it as a supply-vs-cost tradeoff with no agreed weighting, framed to test judgment rather than prescribe the method: it surfaces two traps (a MILP has no duals, and the 10000:1 weights) without naming ε-constraint. max_supply_4 adds a numeric check graded by the existing cuopt_objective evaluator: cap total cost at 9,149.8 and maximize supply. The ground truth is 2,660,000, computed on cuOpt (Tesla T4).
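The ε-constraint idea behind the cost-cap check can be sketched on a toy problem: keep supply as the only objective, turn cost into a constraint, and re-solve for several caps. The sketch below is not the PR's model. The unit costs, unit limits, and the brute-force enumeration are all hypothetical stand-ins chosen so the example runs without a solver; only the 10000:1 weighting echoes the description.

```python
from itertools import product

# Hypothetical toy data (NOT taken from item_costs.csv or resource_costs.csv).
UNIT_COST = {"FG1": 30.0, "FG2": 5.0}   # assumed cost per unit
WEIGHT = {"FG1": 10_000, "FG2": 1}      # 10000:1 weighting, as in the description
MAX_UNITS = {"FG1": 10, "FG2": 20}      # assumed capacity limits


def max_supply_under_cap(cost_cap):
    """Maximize weighted supply subject to total cost <= cost_cap.

    Returns (objective, fg1, fg2, cost). Enumeration stands in for a MILP solve.
    """
    best = None
    for fg1, fg2 in product(range(MAX_UNITS["FG1"] + 1), range(MAX_UNITS["FG2"] + 1)):
        cost = fg1 * UNIT_COST["FG1"] + fg2 * UNIT_COST["FG2"]
        if cost > cost_cap:
            continue  # violates the epsilon (cost) constraint
        obj = fg1 * WEIGHT["FG1"] + fg2 * WEIGHT["FG2"]
        if best is None or obj > best[0]:
            best = (obj, fg1, fg2, cost)
    return best


# Sweep the cap to trace a frontier: one single-objective solve per cap.
frontier = [(cap, *max_supply_under_cap(cap)) for cap in (100, 200, 300, 400)]
for cap, obj, fg1, fg2, cost in frontier:
    print(f"cap={cap:>4}  obj={obj:>7,}  FG1={fg1:>2}  FG2={fg2:>2}  cost={cost:.0f}")
```

No weighting between supply and cost is ever chosen here; the cap is the only knob, which is what leaves the final pick to whoever owns the budget.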

User testing

Run on cuOpt (Tesla T4):

- The reference solve traces a well-posed supply-vs-cost frontier: each extra dollar buys ~288 weighted units of supply up to the FG1-saturation knee, then only ~7 past it (full sweep below).
- Before/after, same LLM: without the skill, the agent collapses to a self-weighted blend (maximize supply − λ·cost). With scenario_4 plus the skill, it traces the frontier by ε-constraint, differences adjacent points to get the rate (correctly noting that a MILP has no duals), reports interpretable units, flags the knee, and leaves the pick to finance. Its solve also hit the eval ground truth.
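One reading of "interpretable units" is to split the weighted objective back into unit counts. The sketch below assumes the objective is 10000·FG1 + 1·FG2, which is an inference from the 10000:1 weighting in the description and from the FG1/FG2 columns of the reference frontier, not something the PR states outright. The function name is hypothetical.

```python
FG1_WEIGHT = 10_000  # assumed weight on FG1; FG2 is assumed to carry weight 1


def decode_weighted_supply(objective):
    """Split a weighted objective into (FG1 units, FG2 units).

    Only valid while the FG2 count stays below FG1_WEIGHT; otherwise the
    two terms overlap and the split is ambiguous.
    """
    fg1_units, fg2_units = divmod(round(objective), FG1_WEIGHT)
    return fg1_units, fg2_units


# Objective values taken from the reference frontier in this PR.
for obj in (1_140_000, 3_300_001, 3_450_063):
    fg1, fg2 = decode_weighted_supply(obj)
    print(f"{obj:>9,} -> FG1={fg1}, FG2={fg2}")
```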

Reference frontier — max weighted supply vs. cost cap (cuOpt, Tesla T4); unconstrained max 3,450,061 at cost 15,249.6

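| total cost ≤ C | max weighted obj | FG1 | FG2 | Δobj/Δ$ |
|---|---|---|---|---|
| 3,812 | 1,140,000 | 114 | 0 | - |
| 5,168 | 1,530,000 | 153 | 0 | 288 |
| 6,523 | 1,920,000 | 192 | 0 | 288 |
| 7,879 | 2,300,000 | 230 | 0 | 280 |
| 9,235 | 2,690,000 | 269 | 0 | 288 |
| 10,590 | 3,020,000 | 302 | 0 | 243 |
| 11,946 | 3,300,001 | 330 | 1 | 207 |
| 13,301 | 3,440,006 | 344 | 6 | 103 |
| 14,657 | 3,450,063 | 345 | 63 | 7 |
| 16,012 | 3,460,062 | 346 | 62 | 7 |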
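Because a MILP has no duals, the Δobj/Δ$ column is a finite difference between adjacent frontier points. A minimal sketch of that differencing, using the cap and objective values from the table above, follows. The caps shown in the table are rounded, so recomputed rates can differ from the printed column by about one unit; the function name is hypothetical.

```python
# (cost cap, max weighted objective) pairs copied from the reference frontier.
FRONTIER = [
    (3_812, 1_140_000),
    (5_168, 1_530_000),
    (6_523, 1_920_000),
    (7_879, 2_300_000),
    (9_235, 2_690_000),
    (10_590, 3_020_000),
    (11_946, 3_300_001),
    (13_301, 3_440_006),
    (14_657, 3_450_063),
    (16_012, 3_460_062),
]


def marginal_rates(points):
    """Finite-difference slope between each pair of adjacent frontier points."""
    return [
        (obj1 - obj0) / (cap1 - cap0)
        for (cap0, obj0), (cap1, obj1) in zip(points, points[1:])
    ]


rates = marginal_rates(FRONTIER)
for (cap, _), rate in zip(FRONTIER[1:], rates):
    print(f"up to cap {cap:>6,}: {rate:6.1f} weighted units per $")
```

The last two segments come out near 7 weighted units per dollar, against roughly 280 to 288 on the first four, which is the collapse past the knee that the description refers to.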

Toy sample data — exercises the multi-objective method on the agent, not a planning study.


cafzal · 2026-06-18T17:41:51Z

@ramakrishnap-nv cuopt-agent what-if + eval – activates the multi-objective skill in the agent (supply-vs-cost, no agreed weighting). GPU-validated; before/after in the description. Ready when you have a cycle.

Adds a fifth eval to the `cuopt-multi-objective-exploration` skill, `multiobj-explore-eval-005-latent-objective`, covering the boundary the existing four don't: a problem stated with a **single** objective while a **second objective sits latent in the data**, unstated. The current evals all hand the agent both objectives (001 interpret, 002 explore, 004 dual-as-slope) or are explicitly single-objective (003 decoy). None test recognizing a *latent* objective.

This one grades whether the skill makes the agent surface the latent cost objective and trace the supply-vs-cost frontier, rather than optimizing the stated objective alone or silently folding cost into a self-chosen weighted blend (`maximize supply − λ·cost`). It brackets the skill's activation boundary opposite the 003 decoy.

Behavioral eval (`expected_script: null`, LLM-graded on the behavior list), same house style as 001/002/004; `validate_skills.sh` picks up the new array entry and the signature / `BENCHMARK.md` / skill-card regenerate via NVSkills-Eval. The latent-objective shape is the max-supply supply-vs-cost case validated on cuOpt (Tesla T4) in NVIDIA/cuopt-examples#157.

Authors:
- Cameron Afzal (https://github.com/cafzal)

Approvers:
- Ramakrishna Prabhu (https://github.com/ramakrishnap-nv)

URL: #1442

…olve Independent re-solve (CBC, MIP gap 0) shows the cost-capped optimum is 2,670,000 (267 FG1 units at total cost exactly 9,149.80; LP bound 2,676,052), not 2,660,000: the recorded value sits inside model.py's default 1% MIP gap. Update the expected answer and have the question demand a zero-gap solve so the eval is deterministic. Also note that the budget sweep samples a step function rather than a full curve.
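The arithmetic behind "sits inside the default 1% MIP gap" can be checked directly from the numbers in the commit note. The sketch below uses one common definition of relative gap for a maximization problem, (bound − incumbent) / |incumbent|. Solvers differ on the denominator, and the bound a solver holds at termination may be tighter than the LP bound quoted here, so treat this as an illustration rather than the exact stopping test in cuOpt or CBC.

```python
def relative_gap(best_bound, incumbent):
    """Relative MIP gap for a maximization problem (one common convention)."""
    return (best_bound - incumbent) / abs(incumbent)


LP_BOUND = 2_676_052   # LP bound quoted in the commit note
RECORDED = 2_660_000   # originally recorded ground truth
RESOLVED = 2_670_000   # zero-gap re-solve (267 FG1 units)
DEFAULT_GAP = 0.01     # model.py's default 1% MIP gap, per the commit note

gap_recorded = relative_gap(LP_BOUND, RECORDED)
gap_resolved = relative_gap(LP_BOUND, RESOLVED)

print(f"recorded: {gap_recorded:.4%} (within 1%: {gap_recorded <= DEFAULT_GAP})")
print(f"resolved: {gap_resolved:.4%} (within 1%: {gap_resolved <= DEFAULT_GAP})")
```

Both values pass a 1% tolerance against that bound, so a solver stopping at the default gap may return either one, which is why a zero-gap solve is needed for a deterministic expected answer.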

cuopt-agent: add multi-objective supply-vs-cost what-if + cost-cap ev…

9e81204

…al case Signed-off-by: cafzal <cameron.afzal@gmail.com>

This was referenced Jun 18, 2026

cuopt-agent: add duals-interpretation guidance to the debugging skill #159

Open

Add latent-objective recognition eval to multi-objective skill NVIDIA/cuopt#1442

Merged

cafzal marked this pull request as ready for review June 18, 2026 17:26

cafzal force-pushed the agent-cost-tradeoff branch from a4484fa to 9be7302 Compare July 2, 2026 03:42

cafzal wants to merge 2 commits into NVIDIA:main from cafzal:agent-cost-tradeoff
