BAD PR 3 (DO NOT MERGE): regress dv-connect SKILL - hardcoded client_secret by saurabhrb · Pull Request #75 · microsoft/Dataverse-skills

saurabhrb · 2026-06-02T20:55:51Z

Intentional regression, branched off PR #72 (harness). Injects a Quick start snippet at the top of the dv-connect SKILL that recommends a hardcoded client_secret literal. The pipeline's default testFile is switched to dv_connect.biceval.json so the relevant tests run. Expected to FAIL: the connect_003_no_hardcoded_secrets PRIORITY_1 assertions; connect_001 may also degrade. To be closed unmerged once the evidence is captured.

…/query/solution; convert dv_data to new format

Adds 3-4 tests per skill using the new generic deterministic-evaluator format introduced in LocalEvalRunner: each test file now enables CortexConfigurations:Common/DeterministicAssertionEvaluator with settings.supported_verbs=CONTAINS,NOT_CONTAINS,SKILL_LOADED alongside the correctness.prompty semantic judge. Ports the dev/evalsV0 baseline tests (connect_001, metadata_001, overview_001, query_001, solution_001) and adds 2-3 natural follow-up tests per skill covering env-file contract, no-hardcoded-secrets, schema create + lookup relationship, filtered/aggregate reads, and solution unpack/import/routing-trap. dv_data.biceval.json is updated in place to add the deterministic evaluator.

…t graded

Per LER PR-393 author guidance: DeterministicAssertionEvaluator only grades verb-prefixed assertions (CONTAINS/NOT_CONTAINS/SKILL_LOADED), and correctness.prompty scores against expected_response without seeing individual assertions. Without LMChecklist, natural-language assertions (those without a verb prefix) are unscored. Adds LMChecklist (Common/SEVAL/LMChecklist.prompty) to all six test files using the exact name + passing_score=3 + priority=1 shape the LER author published. Loader registration is already proven against the new format in Dataverse-skills PR #71's draft validation runs (LocalEvalRunner builds 20289122 and 20290419).
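The split between verb-prefixed and natural-language assertions described above can be illustrated with a minimal Python sketch. It is not the DeterministicAssertionEvaluator implementation: the `VERB: argument` syntax, the function names, and the `loaded_skills` parameter are all assumptions made for illustration. Only the three verb names come from the commit message.

```python
from typing import Optional, Set, Tuple

# Verb names taken from the commit message; everything else is hypothetical.
SUPPORTED_VERBS = ("CONTAINS", "NOT_CONTAINS", "SKILL_LOADED")


def split_assertion(assertion: str) -> Tuple[Optional[str], str]:
    """Return (verb, argument) for a verb-prefixed assertion, else (None, text)."""
    text = assertion.strip()
    for verb in SUPPORTED_VERBS:
        if text.startswith(verb):
            rest = text[len(verb):]
            # Assumed syntax: the verb is followed by a colon or a space.
            if rest[:1] in (":", " "):
                return verb, rest[1:].strip()
    return None, text


def grade(assertion: str, response: str, loaded_skills: Set[str]) -> Optional[bool]:
    """Grade one assertion; None means a deterministic grader leaves it unscored."""
    verb, arg = split_assertion(assertion)
    if verb == "CONTAINS":
        return arg in response
    if verb == "NOT_CONTAINS":
        return arg not in response
    if verb == "SKILL_LOADED":
        return arg in loaded_skills
    # Natural-language assertion: needs an LLM-based judge such as LMChecklist.
    return None
```

Under these assumptions, `grade("The response avoids hardcoded secrets", ...)` returns `None`, which is the gap the commit closes by adding LMChecklist.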

…ionService for local-eval mode

Bad-PR validation harness. Routes around the pre-existing LER->BICEP ASCII-headers bug by forcing local evaluation: LMChecklist.prompty grades each assertion (verb-prefixed and natural-language alike) via CAPI/AOAI locally. Drops the DeterministicAssertionEvaluator entry because in DisableEvaluationService mode it would fall through to a local file lookup and crash (same root issue LER PR #416 fixes for the service-enabled path). Harness PR is draft and closes unmerged; PR #70 (test files) and PR #416 (LER fixes) are untouched.

… client_secret

Intentional regression for bad-PR gate validation. Injects a Quick start snippet at the top of dv-connect SKILL that hardcodes client_secret in source instead of routing through scripts/auth.py / get_credential. Also points the pipeline default testFile to dv_connect.biceval.json so the relevant tests run. Expected to fail connect_003_no_hardcoded_secrets PRIORITY_1 assertions; connect_001 may also degrade.
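For readers unfamiliar with the anti-pattern, here is a hedged Python sketch contrasting a hardcoded literal with reading the secret from the environment, plus a naive check of the kind a no-hardcoded-secrets test might perform. It does not reproduce scripts/auth.py or get_credential; the environment variable name `DATAVERSE_CLIENT_SECRET`, both function names, and the regular expression are hypothetical.

```python
import os
import re

# Anti-pattern the injected snippet recommends (shown only as a comment):
#   client_secret = "abc123"


def load_client_secret(env_var: str = "DATAVERSE_CLIENT_SECRET") -> str:
    """Read the secret from the environment; the variable name is hypothetical."""
    secret = os.environ.get(env_var)
    if not secret:
        raise RuntimeError(f"{env_var} is not set; refusing to fall back to a literal")
    return secret


# Naive pattern: client_secret assigned directly to a quoted string literal.
_LITERAL_SECRET = re.compile(r"""client_secret\s*=\s*["'][^"']+["']""")


def has_hardcoded_secret(source: str) -> bool:
    """Return True if the source text assigns client_secret a string literal."""
    return bool(_LITERAL_SECRET.search(source))
```

A check like `has_hardcoded_secret` only catches the simplest form of the problem; it is meant to show why the regression is expected to trip the connect_003 assertions, not how those assertions are implemented.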

saurabhrb · 2026-06-03T03:23:54Z

Closing -- validation harness / demo PR, no longer needed. Real coverage now lives in PR #70 (deterministic eval tests) and PR #68 (Cursor + auth unification).

Saurabh Badenkal added 6 commits June 1, 2026 22:38

harness: drop DeterministicAssertionEvaluator from remaining 5 test files for local-eval mode

44c07c0

harness: lower LMChecklist passing_score 3 to 1 to match its 0/1 scoring range

b705f6f
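The reason for lowering passing_score can be shown with a short Python sketch. It assumes an assertion passes when its score is greater than or equal to passing_score; that rule and the `passes` helper are assumptions, not confirmed LocalEvalRunner behavior.

```python
def passes(score: int, passing_score: int) -> bool:
    """Assumed pass rule: the score must reach the configured threshold."""
    return score >= passing_score


# LMChecklist scores each assertion as 0 or 1 (per the commit title).
POSSIBLE_SCORES = (0, 1)

# With passing_score=3 no achievable score can ever pass.
unreachable = not any(passes(s, 3) for s in POSSIBLE_SCORES)

# With passing_score=1 a satisfied assertion passes and an unsatisfied one fails.
reachable = passes(1, 1) and not passes(0, 1)
```

Under that assumption a threshold of 3 would fail every assertion regardless of the response, which would make the bad-PR signal meaningless.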

saurabhrb force-pushed the users/saurabhrb/badpr-3-dv-connect-regression branch from 4229370 to 320ce81 Compare June 2, 2026 22:09

saurabhrb mentioned this pull request Jun 3, 2026

add: eval tests using new deterministic-evaluator format (3-4 per skill) #70

Open

saurabhrb closed this Jun 3, 2026

saurabhrb deleted the users/saurabhrb/badpr-3-dv-connect-regression branch June 3, 2026 03:23

saurabhrb wants to merge 6 commits into main from users/saurabhrb/badpr-3-dv-connect-regression
