Merged
12 changes: 0 additions & 12 deletions .claude-plugin/marketplace.json
@@ -56,24 +56,12 @@
"skills": "./",
"description": "Trace and interpret the Pareto frontier across competing objectives using repeated single-objective cuOpt solves (weighted-sum and ε-constraint)."
},
{
"name": "cuopt-routing-formulation",
"source": "./skills/cuopt-routing-formulation",
"skills": "./",
"description": "Vehicle routing (VRP, TSP, PDP) — problem types and data requirements. Domain concepts; no API or interface."
},
{
"name": "cuopt-routing-api-python",
"source": "./skills/cuopt-routing-api-python",
"skills": "./",
"description": "Vehicle routing (VRP, TSP, PDP) with cuOpt — Python API only. Use when the user is building or solving routing in Python."
},
{
"name": "cuopt-server-common",
"source": "./skills/cuopt-server-common",
"skills": "./",
"description": "cuOpt REST server — what it does and how requests flow. Domain concepts; no deploy or client code."
},
{
"name": "cuopt-server-api-python",
"source": "./skills/cuopt-server-api-python",
2 changes: 0 additions & 2 deletions AGENTS.md
@@ -14,8 +14,6 @@ AI agent skills for NVIDIA cuOpt optimization engine. Skills live in **`skills/`
### Common (concepts only; no API code)
- `skills/cuopt-numerical-optimization-formulation/` — LP / MILP / QP: concepts + problem parsing + common formulation patterns
- `skills/cuopt-multi-objective-exploration/` — Multi-objective: trace + interpret the Pareto frontier across competing objectives (ε-constraint / weighted-sum over repeated cuOpt solves)
- `skills/cuopt-routing-formulation/` — Routing: VRP, TSP, PDP (problem types, data)
- `skills/cuopt-server-common/` — Server: capabilities, workflow

### Installation
- `skills/cuopt-install/` — User install for Python, C, and server (pip, conda, Docker, verification). For building cuOpt from source, see `skills/cuopt-developer/`.
10 changes: 5 additions & 5 deletions skills/cuopt-multi-objective-exploration/BENCHMARK.md
@@ -7,7 +7,7 @@ This benchmark summarizes 3-Tier Evaluation from NVSkills-Eval results for the s
## Evaluation Summary

- Skill: `cuopt-multi-objective-exploration`
- Evaluation date: 2026-06-26
- Evaluation date: 2026-06-29
- NVSkills-Eval profile: `external`
- Environment: `astra-sandbox`
- Dataset: 5 evaluation tasks
@@ -55,10 +55,10 @@ Task composition is derived from the evaluation dataset when possible. Entries w
| Dimension | Num | `claude-code` | `codex` |
|---|---:|---:|---:|
| Security | 5 | 100% (+0%) | 100% (+0%) |
| Correctness | 5 | 90% (+37%) | 76% (+24%) |
| Discoverability | 5 | 80% (+60%) | 71% (+41%) |
| Effectiveness | 5 | 91% (+20%) | 68% (+10%) |
| Efficiency | 5 | 80% (+40%) | 71% (+30%) |
| Correctness | 5 | 90% (+0%) | 87% (-8%) |
| Discoverability | 5 | 78% (-2%) | 73% (-22%) |
| Effectiveness | 5 | 85% (-6%) | 85% (+2%) |
| Efficiency | 5 | 75% (-3%) | 71% (-21%) |

Score values show skill-assisted performance. Values in parentheses show uplift versus the no-skill baseline when baseline data is available.
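
As a reading aid for the table above, the no-skill baseline can be recovered from each cell by subtraction. This is a minimal sketch that assumes the parenthesized uplift is an absolute percentage-point difference (skill-assisted score minus baseline); the benchmark text does not say whether uplift is absolute or relative. The helper names `parse_cell` and `implied_baseline` are hypothetical and not part of NVSkills-Eval.

```python
import re


def parse_cell(cell: str) -> tuple[int, int]:
    """Split a table cell such as '87% (-8%)' into (score, uplift) in points."""
    match = re.fullmatch(r"\s*(\d+)%\s*\(([+-]\d+)%\)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return int(match.group(1)), int(match.group(2))


def implied_baseline(cell: str) -> int:
    """Baseline = score - uplift, ASSUMING uplift is an absolute point difference."""
    score, uplift = parse_cell(cell)
    return score - uplift


if __name__ == "__main__":
    # Cells copied from the updated `codex` column of the table above.
    for cell in ["87% (-8%)", "73% (-22%)", "85% (+2%)", "71% (-21%)"]:
        print(cell, "-> implied baseline:", implied_baseline(cell), "%")
```

Under that assumption, a cell such as `87% (-8%)` would correspond to a no-skill baseline of 95%, which is one way to sanity-check negative uplift values.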

2 changes: 1 addition & 1 deletion skills/cuopt-multi-objective-exploration/SKILL.md
@@ -135,4 +135,4 @@ Practical notes:
This skill is solver- and interface-agnostic. The per-solve mechanics — building the objective, adding the ε constraints, passing a warm start, reading status — live in the API skills:

- `cuopt-numerical-optimization-api-python` / `-api-c` / `-api-cli` — LP, MILP, QP solves.
- `cuopt-routing-formulation` + `cuopt-routing-api-python` — the same frontier workflow applies to routing tradeoffs (distance vs. vehicles vs. time).
- `cuopt-routing-api-python` — the same frontier workflow applies to routing tradeoffs (distance vs. vehicles vs. time).
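
The frontier workflow referenced in these bullets (repeated single-objective solves with an ε cap on the second objective) can be sketched without any solver. The example below is illustrative only: `solve_with_cap` is a hypothetical stand-in that enumerates a small hand-made candidate list instead of calling cuOpt, and the (cost, time) data are invented for the sketch.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (cost, time); both objectives are minimized


def solve_with_cap(candidates: List[Point], time_cap: float) -> Optional[Point]:
    """Stand-in for one single-objective solve: minimize cost s.t. time <= cap.

    Ties on cost are broken by lower time so the result is not weakly dominated.
    """
    feasible = [p for p in candidates if p[1] <= time_cap]
    if not feasible:
        return None  # this epsilon value makes the problem infeasible
    return min(feasible, key=lambda p: (p[0], p[1]))


def trace_frontier(candidates: List[Point], caps: List[float]) -> List[Point]:
    """Sweep the epsilon cap from loose to tight and collect distinct optima."""
    frontier: List[Point] = []
    for cap in sorted(caps, reverse=True):
        best = solve_with_cap(candidates, cap)
        if best is not None and best not in frontier:
            frontier.append(best)
    return frontier


if __name__ == "__main__":
    # Made-up (cost, time) outcomes; (14, 7) is dominated by (12, 6).
    candidates = [(10, 9), (12, 6), (15, 4), (14, 7), (20, 4)]
    print(trace_frontier(candidates, caps=[9, 7, 6, 4, 3]))
```

In a real run, each call to the stand-in would be replaced by one solve through the relevant API skill, with the cap added as an ordinary constraint.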
26 changes: 13 additions & 13 deletions skills/cuopt-multi-objective-exploration/skill-card.md
@@ -7,16 +7,16 @@ This skill is ready for commercial/non-commercial use. <br>
NVIDIA <br>

### License/Terms of Use: <br>
Apache 2.0 <br>
Apache-2.0 <br>
## Use Case: <br>
Developers and engineers exploring tradeoffs between competing objectives in optimization problems, tracing Pareto frontiers to make informed multi-objective decisions rather than accepting a single implicit weighting. <br>
Developers and engineers exploring tradeoffs between competing objectives in optimization problems, using cuOpt to trace the Pareto frontier and interpret exchange rates rather than collapsing to a single weighted answer. <br>

### Deployment Geography for Use: <br>
Global <br>

## Requirements / Dependencies: <br>
**Requires API Key or External Credential:** [No] <br>
**Credential Type(s):** [None] <br>
**Requires API Key or External Credential:** [Not Specified] <br>
**Credential Type(s):** [None identified] <br>
Comment on lines +18 to +19

📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Make the credential metadata consistent.

`[Not Specified]` conflicts with `Credential Type(s): [None identified]`, so consumers cannot tell whether this skill is known to require no secrets or whether the requirement is still unknown. Pick one state and represent both fields consistently.

One possible fix if the skill does not require credentials
-**Requires API Key or External Credential:** [Not Specified] <br>
+**Requires API Key or External Credential:** [No] <br>
 **Credential Type(s):** [None identified] <br>  
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skills/cuopt-multi-objective-exploration/skill-card.md` around lines 18 - 19,
The credential metadata in the skill card is inconsistent because one field says
the requirement is unknown while the other says none are needed. Update the
skill-card metadata so the “Requires API Key or External Credential” and
“Credential Type(s)” values both reflect the same state, using the relevant
skill card fields to indicate either no credentials are required or that the
requirement is unspecified.
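
One way to catch this kind of inconsistency automatically is a small check over the skill-card text. The sketch below is hypothetical: the function name, the regular expression, and the set of accepted values are assumptions made for illustration and do not come from the repository or from NVSkills-Eval.

```python
import re

# Matches card lines of the form: **Field name:** [Value] <br>
FIELD = re.compile(r"\*\*(?P<name>[^*]+):\*\*\s*\[(?P<value>[^\]]*)\]")


def credential_fields_consistent(card_text: str) -> bool:
    """Return True when the two credential fields describe the same state."""
    fields = {
        m.group("name").strip(): m.group("value").strip().lower()
        for m in FIELD.finditer(card_text)
    }
    requires = fields.get("Requires API Key or External Credential")
    types = fields.get("Credential Type(s)")
    if requires is None or types is None:
        return False  # a missing field is treated as inconsistent
    none_declared = types in {"none", "none identified"}
    if requires == "no":
        return none_declared
    if requires == "yes":
        return not none_declared
    # Any other state (for example "not specified") must be mirrored exactly.
    return types == requires


if __name__ == "__main__":
    flagged = (
        "**Requires API Key or External Credential:** [Not Specified] <br>\n"
        "**Credential Type(s):** [None identified] <br>\n"
    )
    print(credential_fields_consistent(flagged))
```

With the suggested change applied (`[No]` paired with `[None identified]`), the same check would pass.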


Do not include secrets in prompts/logs/output; use least-privilege credentials; rotate keys as appropriate. <br>

@@ -26,23 +26,23 @@ Mitigation: Review and scan skill before deployment. <br>

## Reference(s): <br>
- [cuOpt User Guide](https://docs.nvidia.com/cuopt/user-guide/latest/introduction.html) <br>
- [cuOpt Examples](https://github.com/NVIDIA/cuopt-examples) <br>
- [cuopt-examples](https://github.com/NVIDIA/cuopt-examples) <br>


## Skill Output: <br>
**Output Type(s):** [Analysis, Configuration instructions] <br>
**Output Format:** [Markdown with inline code blocks] <br>
**Output Format:** [Markdown with inline formulations and solver-call patterns] <br>
**Output Parameters:** [1D] <br>
**Other Properties Related to Output:** [None] <br>

## Evaluation Agents Used: <br>
- Claude Code (`claude-code`) <br>
- Codex (`codex`) <br>
- claude-code <br>
- codex <br>



## Evaluation Tasks: <br>
Evaluated against 5 evaluation tasks (4 positive activation, 1 negative activation) using NVSkills-Eval external profile in astra-sandbox environment. <br>
Evaluated against 5 internal evaluation tasks (4 positive skill-activation, 1 negative). <br>

## Evaluation Metrics Used: <br>
Reported benchmark dimensions: <br>
@@ -67,10 +67,10 @@ Underlying evaluation signals used in this run: <br>
| Dimension | Num | `claude-code` | `codex` |
|---|---:|---:|---:|
| Security | 5 | 100% (+0%) | 100% (+0%) |
| Correctness | 5 | 90% (+37%) | 76% (+24%) |
| Discoverability | 5 | 80% (+60%) | 71% (+41%) |
| Effectiveness | 5 | 91% (+20%) | 68% (+10%) |
| Efficiency | 5 | 80% (+40%) | 71% (+30%) |
| Correctness | 5 | 90% (+0%) | 87% (-8%) |
| Discoverability | 5 | 78% (-2%) | 73% (-22%) |
| Effectiveness | 5 | 85% (-6%) | 85% (+2%) |
| Efficiency | 5 | 75% (-3%) | 71% (-21%) |

## Skill Version(s): <br>
26.08.00 (source: frontmatter) <br>
2 changes: 1 addition & 1 deletion skills/cuopt-multi-objective-exploration/skill.oms.sig
@@ -1 +1 @@
{"mediaType":"application/vnd.dev.sigstore.bundle.v0.3+json","verificationMaterial":{"x509CertificateChain":{"certificates":[{"rawBytes":"MIICgzCCAgmgAwIBAgIUKIyS7SxNteQIiWzK1dWj85E6520wCgYIKoZIzj0EAwMwVTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjEpMCcGA1UEAwwgTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBJQ0EgMDEwHhcNMjYwNDAxMDAwMDAwWhcNMjgwNDIyMTUzMzA5WjBUMQswCQYDVQQGEwJVUzEbMBkGA1UECgwSTlZJRElBIENvcnBvcmF0aW9uMSgwJgYDVQQDDB9OVklESUEgQWdlbnQgU2tpbGxzIFNpZ25pbmcgMDAxMHYwEAYHKoZIzj0CAQYFK4EEACIDYgAEYoRM9bQl/dGlwSRNi6bTpIJUXH8Nv9GciP6LSflJYYMLCc296kpyuTSsk5ddbAWiDcFX3C/ydX3jwc+qCLYP6uHy9XphyLjOQ27Yb2J6rBLVtRBS1mgGco/Gr7fL6ODco4GaMIGXMB0GA1UdDgQWBBRQ/5ZW3nJ6lmo9SVk7I15o7UGmpTAfBgNVHSMEGDAWgBRPGpILxMBBleJSsBGjrMKsby1CgjAMBgNVHRMBAf8EAjAAMA4GA1UdDwEB/wQEAwIHgDA3BggrBgEFBQcBAQQrMCkwJwYIKwYBBQUHMAGGG2h0dHA6Ly9vY3NwLm5kaXMubnZpZGlhLmNvbTAKBggqhkjOPQQDAwNoADBlAjAUygu/GiOCIXrgGr4SmLgeEVDcEitfFUv7ALbvLVGVyMysB3mxmO/uInZfXzWcJZsCMQDxuoxj4ZmO30jhkPIcCxGFCOvnUsnfU3TfGcouYm4M6iRpbKvtVnHPiy4bi6pcKf0="},{"rawBytes":"MIICiDCCAg6gAwIBAgIUZsIuSv9NkpJCNqtYEfCouVv5BzowCgYIKoZIzj0EAwMwUTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTAgFw0yNjA0MDEwMDAwMDBaGA85OTk5MTIzMTIzNTk1OVowVTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjEpMCcGA1UEAwwgTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBJQ0EgMDEwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAASI72cR3ctKGg4VWnB3bNja6g1Z2PnOmFEopkPof+QeIcPk9rT+g9MjJnq51EQXL93a7C2GJ9J985G4o2V85VD7wJ1RaXhluHW2rf3y8bQGeAYaKMr5s/hUgn+M3/9WlWejgaAwgZ0wHQYDVR0OBBYEFE8akgvEwEGV4lKwEaOswqxvLUKCMB8GA1UdIwQYMBaAFItnoAjjfuCEUvzyvWyI2vOGvwPjMBIGA1UdEwEB/wQIMAYBAf8CAQAwDgYDVR0PAQH/BAQDAgEGMDcGCCsGAQUFBwEBBCswKTAnBggrBgEFBQcwAYYbaHR0cDovL29jc3AubmRpcy5udmlkaWEuY29tMAoGCCqGSM49BAMDA2gAMGUCMQCeIMMfAbyzPDacw2MxG+Yt1cikrJX/DVxiGfXuHmkkXn6VgSzE79+lkqDErpVO2gYCMCNEColOyvUvkzZGUEI1hQ3PfMgi3FIo9tHoBKMw4/wGBLFpu/0ubtmbBXM6/UMOEw=="},{"rawBytes":"MIICRTCCAcygAwIBAgIUeJdY3rV86EdvFmG7L8LJBsyQFYkwCgYIKoZIzj0EAwMwUTELMAkGA1UEBhMCVVMxGzAZBgNVBA
oMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTAgFw0yNjA0MDEwMDAwMDBaGA85OTk5MTIzMTIzNTk1OVowUTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTB2MBAGByqGSM49AgEGBSuBBAAiA2IABAYpiXCDjJ9NT2eSDhyHJVSw1Tbze18cGG2F/578oWvHxg23eQAhNRYdq88i1iOshZSO6C29doKui5Xpmo/7Ctw9Sx4PP2RzOmIuOLCuTdNtKcTRwi4GEsd5BAFvWj42M6NjMGEwHQYDVR0OBBYEFItnoAjjfuCEUvzyvWyI2vOGvwPjMB8GA1UdIwQYMBaAFItnoAjjfuCEUvzyvWyI2vOGvwPjMA8GA1UdEwEB/wQFMAMBAf8wDgYDVR0PAQH/BAQDAgEGMAoGCCqGSM49BAMDA2cAMGQCMCwtAjWLaNwgGWNCgdyNoTyvNhqWRECRJV2r3+7w8g0PL6NHLOsbkgE09BH95h8XlgIwTaQmbbUh2ChAJ5TA1wRiVDnCcvbzHlZl2jM2FcwQQZlk19LOAbyGMRixbu2Ww/rj"}]},"tlogEntries":[]},"dsseEnvelope":{"payload":"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiY3VvcHQtbXVsdGktb2JqZWN0aXZlLWV4cGxvcmF0aW9uIiwKICAgICAgImRpZ2VzdCI6IHsKICAgICAgICAic2hhMjU2IjogIjgwNmFlYWFjOWM3YmMxZTk1ZGJlNDdiOWVhY2FjMDk5NjBiZDczMTA4NTY0ZWE3NDk1OTdlYWI3ZDA2N2VlN2MiCiAgICAgIH0KICAgIH0KICBdLAogICJwcmVkaWNhdGVUeXBlIjogImh0dHBzOi8vbW9kZWxfc2lnbmluZy9zaWduYXR1cmUvdjEuMCIsCiAgInByZWRpY2F0ZSI6IHsKICAgICJyZXNvdXJjZXMiOiBbCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJCRU5DSE1BUksubWQiLAogICAgICAgICJkaWdlc3QiOiAiOTcyM2E4ZjZjOWQ1ZTlkZDAxYTU0N2VlOGYyYzE2MzA5ZWU0YzIzMGY0NThiYjBiYWU1ODZjMTllMTJkNWE1MyIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJTS0lMTC5tZCIsCiAgICAgICAgImRpZ2VzdCI6ICI1MjQ0ODgwOTFlMTU0NWZmMGQxMDA0M2VmZjhiMDY2OTNkOTJiNTM5MzdkZjYxMjQyZWQzM2Q3NzZhZTM4NGZiIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfSwKICAgICAgewogICAgICAgICJuYW1lIjogImV2YWxzL2V2YWxzLmpzb24iLAogICAgICAgICJkaWdlc3QiOiAiZDk1NjE1ZWUzMjc1MTc3MWYzMDc3ZjE3MzY4M2YwNGZkMTdhZDU3Y2U4MmFlOWIxMjRjYmQ4M2ZhYWJkNjBiYiIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJza2lsbC1jYXJkLm1kIiwKICAgICAgICAiZGlnZXN0IjogImM2OTJlM2Q5ZjAzMDQ1YWM2YTA5NzkwYWM4NjM4NGM4ODYxOTNhZmY2ODQxZDdjMDdhZThjMWZm
YjAyYzdhMjkiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9CiAgICBdLAogICAgInNlcmlhbGl6YXRpb24iOiB7CiAgICAgICJhbGxvd19zeW1saW5rcyI6IGZhbHNlLAogICAgICAibWV0aG9kIjogImZpbGVzIiwKICAgICAgImlnbm9yZV9wYXRocyI6IFsKICAgICAgICAiLmdpdCIsCiAgICAgICAgIi5naXRpZ25vcmUiLAogICAgICAgICIuZ2l0YXR0cmlidXRlcyIsCiAgICAgICAgIi5naXRodWIiCiAgICAgIF0sCiAgICAgICJoYXNoX3R5cGUiOiAic2hhMjU2IgogICAgfQogIH0KfQ==","payloadType":"application/vnd.in-toto+json","signatures":[{"sig":"MGUCMEbPeltBPGv9lD/CCSbTi12cHXvWbA2qoDfCcVLU420GcQ4kGa4r3aSmODk8E/ylNwIxALYaAi9ppzhEiqwoa7zbN1R0vq9QbUvNCcqFPcx7lLlHblGFS7nHB632atREfijbig==","keyid":""}]}}
{"mediaType":"application/vnd.dev.sigstore.bundle.v0.3+json","verificationMaterial":{"x509CertificateChain":{"certificates":[{"rawBytes":"MIICgzCCAgmgAwIBAgIUKIyS7SxNteQIiWzK1dWj85E6520wCgYIKoZIzj0EAwMwVTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjEpMCcGA1UEAwwgTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBJQ0EgMDEwHhcNMjYwNDAxMDAwMDAwWhcNMjgwNDIyMTUzMzA5WjBUMQswCQYDVQQGEwJVUzEbMBkGA1UECgwSTlZJRElBIENvcnBvcmF0aW9uMSgwJgYDVQQDDB9OVklESUEgQWdlbnQgU2tpbGxzIFNpZ25pbmcgMDAxMHYwEAYHKoZIzj0CAQYFK4EEACIDYgAEYoRM9bQl/dGlwSRNi6bTpIJUXH8Nv9GciP6LSflJYYMLCc296kpyuTSsk5ddbAWiDcFX3C/ydX3jwc+qCLYP6uHy9XphyLjOQ27Yb2J6rBLVtRBS1mgGco/Gr7fL6ODco4GaMIGXMB0GA1UdDgQWBBRQ/5ZW3nJ6lmo9SVk7I15o7UGmpTAfBgNVHSMEGDAWgBRPGpILxMBBleJSsBGjrMKsby1CgjAMBgNVHRMBAf8EAjAAMA4GA1UdDwEB/wQEAwIHgDA3BggrBgEFBQcBAQQrMCkwJwYIKwYBBQUHMAGGG2h0dHA6Ly9vY3NwLm5kaXMubnZpZGlhLmNvbTAKBggqhkjOPQQDAwNoADBlAjAUygu/GiOCIXrgGr4SmLgeEVDcEitfFUv7ALbvLVGVyMysB3mxmO/uInZfXzWcJZsCMQDxuoxj4ZmO30jhkPIcCxGFCOvnUsnfU3TfGcouYm4M6iRpbKvtVnHPiy4bi6pcKf0="},{"rawBytes":"MIICiDCCAg6gAwIBAgIUZsIuSv9NkpJCNqtYEfCouVv5BzowCgYIKoZIzj0EAwMwUTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTAgFw0yNjA0MDEwMDAwMDBaGA85OTk5MTIzMTIzNTk1OVowVTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjEpMCcGA1UEAwwgTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBJQ0EgMDEwdjAQBgcqhkjOPQIBBgUrgQQAIgNiAASI72cR3ctKGg4VWnB3bNja6g1Z2PnOmFEopkPof+QeIcPk9rT+g9MjJnq51EQXL93a7C2GJ9J985G4o2V85VD7wJ1RaXhluHW2rf3y8bQGeAYaKMr5s/hUgn+M3/9WlWejgaAwgZ0wHQYDVR0OBBYEFE8akgvEwEGV4lKwEaOswqxvLUKCMB8GA1UdIwQYMBaAFItnoAjjfuCEUvzyvWyI2vOGvwPjMBIGA1UdEwEB/wQIMAYBAf8CAQAwDgYDVR0PAQH/BAQDAgEGMDcGCCsGAQUFBwEBBCswKTAnBggrBgEFBQcwAYYbaHR0cDovL29jc3AubmRpcy5udmlkaWEuY29tMAoGCCqGSM49BAMDA2gAMGUCMQCeIMMfAbyzPDacw2MxG+Yt1cikrJX/DVxiGfXuHmkkXn6VgSzE79+lkqDErpVO2gYCMCNEColOyvUvkzZGUEI1hQ3PfMgi3FIo9tHoBKMw4/wGBLFpu/0ubtmbBXM6/UMOEw=="},{"rawBytes":"MIICRTCCAcygAwIBAgIUeJdY3rV86EdvFmG7L8LJBsyQFYkwCgYIKoZIzj0EAwMwUTELMAkGA1UEBhMCVVMxGzAZBgNVBA
oMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTAgFw0yNjA0MDEwMDAwMDBaGA85OTk5MTIzMTIzNTk1OVowUTELMAkGA1UEBhMCVVMxGzAZBgNVBAoMEk5WSURJQSBDb3Jwb3JhdGlvbjElMCMGA1UEAwwcTlZJRElBIEFnZW50IENhcGFiaWxpdGllcyBDQTB2MBAGByqGSM49AgEGBSuBBAAiA2IABAYpiXCDjJ9NT2eSDhyHJVSw1Tbze18cGG2F/578oWvHxg23eQAhNRYdq88i1iOshZSO6C29doKui5Xpmo/7Ctw9Sx4PP2RzOmIuOLCuTdNtKcTRwi4GEsd5BAFvWj42M6NjMGEwHQYDVR0OBBYEFItnoAjjfuCEUvzyvWyI2vOGvwPjMB8GA1UdIwQYMBaAFItnoAjjfuCEUvzyvWyI2vOGvwPjMA8GA1UdEwEB/wQFMAMBAf8wDgYDVR0PAQH/BAQDAgEGMAoGCCqGSM49BAMDA2cAMGQCMCwtAjWLaNwgGWNCgdyNoTyvNhqWRECRJV2r3+7w8g0PL6NHLOsbkgE09BH95h8XlgIwTaQmbbUh2ChAJ5TA1wRiVDnCcvbzHlZl2jM2FcwQQZlk19LOAbyGMRixbu2Ww/rj"}]},"tlogEntries":[]},"dsseEnvelope":{"payload":"ewogICJfdHlwZSI6ICJodHRwczovL2luLXRvdG8uaW8vU3RhdGVtZW50L3YxIiwKICAic3ViamVjdCI6IFsKICAgIHsKICAgICAgIm5hbWUiOiAiY3VvcHQtbXVsdGktb2JqZWN0aXZlLWV4cGxvcmF0aW9uIiwKICAgICAgImRpZ2VzdCI6IHsKICAgICAgICAic2hhMjU2IjogIjcxMmY0NDMwMjNkYTc5ODNmNzliOGIwNzlmN2YzODljM2IzNDJmNGY1NGZiNWViZGRhZmM1ODZiMjZlM2ZkYzIiCiAgICAgIH0KICAgIH0KICBdLAogICJwcmVkaWNhdGVUeXBlIjogImh0dHBzOi8vbW9kZWxfc2lnbmluZy9zaWduYXR1cmUvdjEuMCIsCiAgInByZWRpY2F0ZSI6IHsKICAgICJzZXJpYWxpemF0aW9uIjogewogICAgICAiYWxsb3dfc3ltbGlua3MiOiBmYWxzZSwKICAgICAgImhhc2hfdHlwZSI6ICJzaGEyNTYiLAogICAgICAibWV0aG9kIjogImZpbGVzIiwKICAgICAgImlnbm9yZV9wYXRocyI6IFsKICAgICAgICAiLmdpdGF0dHJpYnV0ZXMiLAogICAgICAgICIuZ2l0IiwKICAgICAgICAiLmdpdGlnbm9yZSIsCiAgICAgICAgIi5naXRodWIiCiAgICAgIF0KICAgIH0sCiAgICAicmVzb3VyY2VzIjogWwogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICIzNzhjNmMzMDY0YzVkNDE0ZjQ5M2M2NTZjODQ2YWZlYWMyMmMyMWU4NjhmY2MzMzRlMThiNGNjMmFmYjIwMjM4IiwKICAgICAgICAibmFtZSI6ICJCRU5DSE1BUksubWQiLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICI5NWFhOTRiNGViNzQwOTE5YTI4MTFmNWY4OWRkZmVhZmVjZTkwNjcxNjJjMzE4NzRlYTBkYTM0ZGM4NWJhZDEyIiwKICAgICAgICAibmFtZSI6ICJTS0lMTC5tZCIsCiAgICAgICAgImFsZ29yaXRobSI6ICJzaGEyNTYiCiAgICAgIH0sCiAgICAgIHsKICAgICAgICAiZGlnZXN0IjogImQ5NTYxNWVlMzI3NTE3NzFmMzA3N2YxNzM2ODNm
MDRmZDE3YWQ1N2NlODJhZTliMTI0Y2JkODNmYWFiZDYwYmIiLAogICAgICAgICJuYW1lIjogImV2YWxzL2V2YWxzLmpzb24iLAogICAgICAgICJhbGdvcml0aG0iOiAic2hhMjU2IgogICAgICB9LAogICAgICB7CiAgICAgICAgImRpZ2VzdCI6ICJiM2Y4NDZhMzExZDc5Zjg4ZTNhYzlmNGJmZmM3YjM4MDI4OGM5Nzk1ZTA3YzNhN2ViZWYxYjY2YWE0MGU3YTEzIiwKICAgICAgICAibmFtZSI6ICJza2lsbC1jYXJkLm1kIiwKICAgICAgICAiYWxnb3JpdGhtIjogInNoYTI1NiIKICAgICAgfQogICAgXQogIH0KfQ==","payloadType":"application/vnd.in-toto+json","signatures":[{"sig":"MGQCMC4Gw/I2q9Fy5WQ5dDlqOtYhZcp3y08aFKflnEzfYDWsZHEjlAIbutAhexTr80iANQIwIrWIJX73+kbJi/HXWsYHWBKF+3oqu0Yrkqgg2OFrlmdaDGzwZTKSFKqenudCk401","keyid":""}]}}
45 changes: 13 additions & 32 deletions skills/cuopt-routing-api-python/BENCHMARK.md
@@ -7,13 +7,13 @@ This benchmark summarizes 3-Tier Evaluation from NVSkills-Eval results for the s
## Evaluation Summary

- Skill: `cuopt-routing-api-python`
- Evaluation date: 2026-05-29
- Evaluation date: 2026-06-29
- NVSkills-Eval profile: `external`
- Environment: `local`
- Environment: `astra-sandbox`
- Dataset: 1 evaluation tasks
- Attempts per task: 2
- Attempts per task: 1
- Pass threshold: 50%
- Overall verdict: FAIL
- Overall verdict: PASS

## Agents Used

@@ -54,46 +54,27 @@ Task composition is derived from the evaluation dataset when possible. Entries w

| Dimension | Num | `claude-code` | `codex` |
|---|---:|---:|---:|
| Security | 2 | 100% (+0%) | 100% (+0%) |
| Correctness | 2 | 100% (+0%) | 95% (+3%) |
| Discoverability | 2 | 100% (+0%) | 70% (-5%) |
| Effectiveness | 2 | 83% (+14%) | 83% (+12%) |
| Efficiency | 2 | 93% (-0%) | 56% (-5%) |
| Security | 1 | 100% (+0%) | 100% (+0%) |
| Correctness | 1 | 100% (+70%) | 97% (+42%) |
| Discoverability | 1 | 100% (+100%) | 82% (+57%) |
| Effectiveness | 1 | 82% (+45%) | 74% (+29%) |
| Efficiency | 1 | 95% (+67%) | 72% (+45%) |

Score values show skill-assisted performance. Values in parentheses show uplift versus the no-skill baseline when baseline data is available.

## Tier 1: Static Validation Summary

Tier 1 validation passed with observations. NVSkills-Eval ran 9 checks and found 8 total findings.
Tier 1 validation passed with observations. NVSkills-Eval ran 1 checks and found 2 total findings.

Top findings:

- MEDIUM SCHEMA/body_recommended_section: Missing recommended section: '## Instructions' (`skills/cuopt-routing-api-python/SKILL.md`)
- MEDIUM SECURITY/Unknown (SQP-2): Binding the cuOpt server to 0.0.0.0 exposes it on all network interfaces, making it accessible to any host that can reac (`references/server_examples.md:7`)
- LOW QUALITY/quality_discoverability: No '## Purpose' section (`skills/cuopt-routing-api-python/SKILL.md`)
- LOW QUALITY/quality_reliability: No prerequisites/requirements documented (`skills/cuopt-routing-api-python/SKILL.md`)
- LOW QUALITY/quality_reliability: No limitations documented (`skills/cuopt-routing-api-python/SKILL.md`)
- LOW SCHEMA/author_format: Author must be of the form 'Name <email@host>' (`skills/cuopt-routing-api-python/SKILL.md`)

## Tier 2: Deduplication Summary

Tier 2 validation reported findings. NVSkills-Eval ran 2 checks and found 4 total findings.

Top findings:

- HIGH DUPLICATE/duplicate: Duplicate content found within references/server_examples.md:
"# Poll for solution" in references/server_examples.md (lines 45-51)
vs "# Poll for solution" in references/server_examples.md (lines 156-162) (`references/server_examples.md:45`)
- HIGH DUPLICATE/duplicate: Duplicate content found across SKILL.md and references/examples.md:
"# Capacities" in SKILL.md (lines 30-35)
vs "# Add capacity dimension (name, demand_per_order, capacity_per_vehicle)" in references/examples.md (lines 73-75)
vs "# Add capacity dimension" in references/examples.md (lines 156-158) (`SKILL.md:30`)
- HIGH DUPLICATE/duplicate: Duplicate content found across references/examples.md and references/server_examples.md:
"## Additional References (tested in CI)" in references/examples.md (lines 237-249)
vs "## Additional References (tested in CI)" in references/server_examples.md (lines 193-204) (`references/examples.md:237`)
- HIGH DUPLICATE/duplicate: Duplicate content found across assets/pdp_basic/README.md and assets/pdp_basic/model.py:
"# Pickup-Delivery (PDP)" in assets/pdp_basic/README.md (lines 1-7)
vs "(module docstring)" in assets/pdp_basic/model.py (lines 1-2) (`assets/pdp_basic/README.md:1`)
This tier was not run or did not produce findings in this report.

## Publication Recommendation

The skill should be reviewed before NVSkills-Eval publication. Skill owners should address the findings above and rerun NVSkills-Eval to refresh this benchmark.
The skill is suitable to proceed toward NVSkills-Eval publication based on this benchmark. Skill owners should keep this file with the skill and refresh it when the evaluation dataset, skill behavior, or target agents materially change.