feat: first-class Windows support across CLI, runtime, and judges #33
Open: zpzjzj wants to merge 41 commits into main from feat/windows-support
Commits
11093bc ci: run build & test on a windows-latest runner (zpzjzj)
db2a610 feat: make the script judge cross-platform on Windows (zpzjzj)
3c5099a refactor(agent): route quoting through shellquote, fix Check on Windows (zpzjzj)
13d8774 chore: add PowerShell tooling and example script for Windows (zpzjzj)
b303054 chore: pin line endings via .gitattributes (zpzjzj)
bcafdf0 docs: add Windows support guide (zpzjzj)
616f476 docs: clarify OpenSandbox runtime behavior on Windows (zpzjzj)
89a9498 docs: correct OpenSandbox Windows-sandbox availability (zpzjzj)
17ccabb fix(windows): use POSIX quoting and forward-slash paths where targets… (zpzjzj)
73b296a fix(windows): force -1 on ctx-cancel and make tests OS-aware (zpzjzj)
8db5434 fix(windows): use POSIX quoting in git checkout helper and run tests … (zpzjzj)
daa9cdd fix(windows): address PR #33 review nits (zpzjzj)
c38f955 fix(windows): always quote QuoteWindows output and disable cmd AutoRu… (zpzjzj)
0585937 fix(windows): defang MSYS argv conversion and skip stdio-framing test (zpzjzj)
90bcdaa fix(windows): cap NoneRuntime.Exec pipe wait with cmd.WaitDelay (zpzjzj)
2e30708 ci: make Windows job required now that all tests pass (zpzjzj)
8144c27 fix(windows): skip WSL bash even when PATH-discovered (zpzjzj)
6e11453 test(mcp): skip RejectsSymlinkEscape on Windows for the same Node-std… (zpzjzj)
acbf0d4 test(mcp): centralize Windows skip in startMockServer (zpzjzj)
f42dae8 fix(windows): reject WSL bash via SKILL_UP_BASH override too (zpzjzj)
0b9498b ci: add a Windows e2e job (zpzjzj)
614438e fix(e2e): emit skill-up.exe on Windows so TestMain build is executable (zpzjzj)
0e38503 fix(windows): normalize transcript path and honor shebang options for… (zpzjzj)
1298a43 fix(windows): escape bash actives in script paths; align WSL docs (zpzjzj)
74246c9 fix(windows): make cmd fallback reliable, shell-aware quoter, env -S … (zpzjzj)
1b3c51c fix(windows): consume env value-flags and use shell-style env -S toke… (zpzjzj)
abd0a9d fix(windows): bash-safe quoter, restrict shebang routing, /dev/null i… (zpzjzj)
19d7f8f fix(windows): handle env long-flag values and `\_` separator in env -S (zpzjzj)
391a55a refactor(windows): unify shell host, drop dead quote API, require Tar… (zpzjzj)
1b75e55 fix(windows): cache platform.Host() and require bash for agent CLIs (zpzjzj)
501905c fix(judge): clean .sh temp dirs via bash rm on Windows (zpzjzj)
a2ca8f0 chore(windows): post-review cleanups (CI versions, docs, lint dead code) (zpzjzj)
954e79a fix(judge): handle env -a, optional-arg signal flags, NAME=VALUE, and \c (zpzjzj)
0dca8b4 fix(judge): honor pwsh shebang and forward PowerShell shebang flags (zpzjzj)
cf17b53 chore(release): tag this branch's CHANGELOG entry as 0.3.0 (zpzjzj)
d996465 refactor(agent): use platform.GOOSWindows constant in requireBashOnWi… (zpzjzj)
8199082 refactor(judge): defensive copy in .ps1 plan.command to match .sh branch (zpzjzj)
c83477e docs(runtime): clarify execContextGracePeriod applies on POSIX too (zpzjzj)
6ac7525 fix(runtime): correct DockerRuntime.TargetGOOS doc comment (zpzjzj)
d5f1bd0 chore(release): bump 0.3.0 date to 2026-05-27 to follow main's 0.2.3 (zpzjzj)
9a69230 fix(windows): reject rooted POSIX-style paths in workspace/sandbox gu… (zpzjzj)
@@ -0,0 +1,16 @@

    # Keep line endings deterministic regardless of the contributor's OS or
    # core.autocrlf setting.

    # Go source must be LF: gofmt and the toolchain expect Unix line endings.
    *.go text eol=lf

    # Shell scripts must be LF: a CRLF shebang line (e.g. "#!/bin/sh\r") makes
    # the kernel fail to locate the interpreter.
    *.sh text eol=lf

    # PowerShell handles LF on every platform; keep it consistent with .editorconfig.
    *.ps1 text eol=lf

    # Windows batch scripts use CRLF.
    *.cmd text eol=crlf
    *.bat text eol=crlf

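The CRLF shebang failure described in the comment for `*.sh` can be checked for directly. The following is a minimal sketch and not code from this PR; `hasCRLFShebang` is a hypothetical helper name. It only inspects the first line of a script.

```go
package main

import (
	"bytes"
	"fmt"
)

// hasCRLFShebang reports whether the script starts with a shebang line
// that ends in "\r\n". On POSIX systems the trailing '\r' becomes part of
// the interpreter path (e.g. "/bin/sh\r"), which cannot be found.
func hasCRLFShebang(script []byte) bool {
	if !bytes.HasPrefix(script, []byte("#!")) {
		return false
	}
	end := bytes.IndexByte(script, '\n')
	if end < 1 {
		return false // no newline at all, so no line ending to check
	}
	return script[end-1] == '\r'
}

func main() {
	fmt.Println(hasCRLFShebang([]byte("#!/bin/sh\r\necho hi\r\n"))) // true
	fmt.Println(hasCRLFShebang([]byte("#!/bin/sh\necho hi\n")))     // false
}
```

With `*.sh text eol=lf` in place, a checkout should never produce the first case, whatever the contributor's `core.autocrlf` setting is.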
@@ -0,0 +1,109 @@
# Windows Support

skill-up runs natively on Windows. This page covers what works, the current
limitations, and the recommended workflow.

---

## Supported

- **Build and unit tests** — `go build ./...` and `go test ./...` pass on
  Windows. CI exercises a `windows-latest` runner alongside Linux.
- **The `none` runtime** — commands run on the host through `cmd.exe`.
- **The `opensandbox` runtime** — unaffected by the host OS; it always
  executes inside a Linux sandbox.
- **The script judge** — dispatches by file extension (or shebang):

| Script            | Interpreter on Windows                      |
| ----------------- | ------------------------------------------- |
| `.ps1`            | PowerShell                                  |
| `.cmd` / `.bat`   | `cmd.exe`                                   |
| `.sh`             | bash (Git Bash; see below)                  |

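The dispatch table above can be pictured as a lookup keyed on the file extension. This is a minimal sketch under stated assumptions, not the judge's actual code: `interpreterFor` is a hypothetical name, the case-insensitive match is an assumption, and the shebang path mentioned in the guide is left out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// interpreterFor returns the interpreter family from the table for a script
// path, chosen by file extension only. Matching the extension without
// regard to case is an assumption of this sketch.
func interpreterFor(script string) (string, error) {
	switch strings.ToLower(filepath.Ext(script)) {
	case ".ps1":
		return "PowerShell", nil
	case ".cmd", ".bat":
		return "cmd.exe", nil
	case ".sh":
		return "bash", nil
	default:
		return "", fmt.Errorf("unsupported script extension in %q", script)
	}
}

func main() {
	for _, s := range []string{"check.ps1", "check.BAT", "check.sh", "check.py"} {
		name, err := interpreterFor(s)
		fmt.Println(s, name, err)
	}
}
```

Returning an error for unknown extensions keeps the failure explicit instead of silently picking a default shell.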
## Running `.sh` script judges on Windows

A `.sh` script judge needs a `bash` interpreter. skill-up looks for one in
this order:

1. the `SKILL_UP_BASH` environment variable (an explicit path to `bash.exe`);
2. `bash` on `PATH`;
3. well-known Git Bash install locations —
   `C:\Program Files\Git\bin\bash.exe` and
   `C:\Program Files (x86)\Git\bin\bash.exe`.

If none is found the script judge fails with a clear error. Install
[Git for Windows](https://git-scm.com/download/win) or set `SKILL_UP_BASH`.

The WSL shim at `C:\Windows\System32\bash.exe` is intentionally rejected at
all three steps (override, PATH, well-known) because it expects Linux-format
`/mnt/c/...` paths and silently fails on the Windows-style paths skill-up
generates. Users who want to drive script judges through WSL must arrange
path translation upstream and point `SKILL_UP_BASH` at a non-WSL bash — or
simply run skill-up inside WSL itself (see "Recommended workflow" below).

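The lookup order and the WSL rejection described in this section can be sketched as follows. This is an illustration only: `findBash`, `isWSLShim`, and the injected lookup functions are hypothetical names, and the guide does not say whether a rejected `SKILL_UP_BASH` value aborts or falls through to the next step. This sketch assumes it falls through.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// wslShim is the WSL launcher the guide says is rejected at every step.
const wslShim = `C:\Windows\System32\bash.exe`

// gitBashCandidates are the well-known Git Bash locations from the guide.
var gitBashCandidates = []string{
	`C:\Program Files\Git\bin\bash.exe`,
	`C:\Program Files (x86)\Git\bin\bash.exe`,
}

// isWSLShim compares case-insensitively, as Windows paths are.
func isWSLShim(path string) bool {
	return strings.EqualFold(path, wslShim)
}

// findBash walks the three steps in order. The lookups are injected
// (hypothetical seams) so the sketch runs on any OS.
func findBash(
	getenv func(string) string,
	lookPath func(string) (string, error),
	exists func(string) bool,
) (string, error) {
	// Step 1: explicit override. Falling through on rejection is an assumption.
	if p := getenv("SKILL_UP_BASH"); p != "" && !isWSLShim(p) && exists(p) {
		return p, nil
	}
	// Step 2: bash on PATH, unless it resolves to the WSL shim.
	if p, err := lookPath("bash"); err == nil && !isWSLShim(p) {
		return p, nil
	}
	// Step 3: well-known Git Bash install locations.
	for _, p := range gitBashCandidates {
		if exists(p) {
			return p, nil
		}
	}
	return "", errors.New("no usable bash: install Git for Windows or set SKILL_UP_BASH")
}

func main() {
	// Fake host: PATH resolves bash to the WSL shim, Git Bash is installed.
	getenv := func(string) string { return "" }
	lookPath := func(string) (string, error) { return wslShim, nil }
	exists := func(p string) bool { return p == gitBashCandidates[0] }

	p, err := findBash(getenv, lookPath, exists)
	fmt.Println(p, err) // C:\Program Files\Git\bin\bash.exe <nil>
}
```

On a real host the three arguments would typically be `os.Getenv`, `exec.LookPath`, and a small `os.Stat` wrapper.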
## OpenSandbox runtime on Windows

The `opensandbox` runtime talks to a remote OpenSandbox server over HTTP and
never spawns a host shell. Running `skill-up.exe` on native Windows against a
remote sandbox works today: all host-side path handling already crosses the
host→sandbox boundary through `filepath.ToSlash`, and the sandbox itself is a
Linux container, so the script judge and any agent run inside it behave
exactly as they do on Linux.

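To make the path conversion at the host→sandbox boundary concrete, here is a minimal sketch. `toSandboxPath` is a hypothetical helper, not a function from this PR. It uses `strings.ReplaceAll` so the result is the same on every OS, whereas `filepath.ToSlash` only rewrites backslashes when the program runs on Windows.

```go
package main

import (
	"fmt"
	"strings"
)

// toSandboxPath rewrites a Windows-style relative path with forward slashes
// so a Linux sandbox can use it. On a Windows host filepath.ToSlash gives
// the same result; ReplaceAll keeps this sketch OS-independent.
func toSandboxPath(hostPath string) string {
	return strings.ReplaceAll(hostPath, `\`, "/")
}

func main() {
	fmt.Println(toSandboxPath(`skills\demo\judge.sh`)) // skills/demo/judge.sh
}
```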
OpenSandbox also offers a [**Windows guest profile**](https://github.com/alibaba/OpenSandbox/blob/main/docs/windows-sandbox.md):
the server runs `dockur/windows` (Windows in KVM/QEMU inside a Linux container)
and the API accepts `platform: {"os": "windows", "arch": "amd64"}` on create.
At the time of writing the Go SDK does not yet expose the `Platform` field, so
driving a Windows-guest sandbox from skill-up is blocked on an upstream Go SDK
update — tracked separately.

For a Windows machine that needs the full agent workflow **without** a remote
sandbox, run skill-up inside **WSL2**. WSL2 is a Linux environment, so both the
`none` and `opensandbox` runtimes — including the agent Node/nvm bootstrap —
work without limitation.

## Contributor tooling

`make` is not available on Windows by default. Use the PowerShell scripts
under `scripts/windows/` instead:

```powershell
# Install git hooks (equivalent to `make hooks`)
pwsh scripts/windows/hooks.ps1

# Install pinned lint tools into .tools/bin (equivalent to `make lint-tools`)
pwsh scripts/windows/lint-tools.ps1

# fmt-check + vet + revive + golangci-lint (equivalent to `make verify`)
pwsh scripts/windows/verify.ps1
```

Build and test use the standard Go toolchain, which is cross-platform:

```powershell
go build -o bin/skill-up.exe ./cmd/skill-up
go test -race ./...
```

## Known limitations

- **Running real agents natively** — Claude Code / Codex / Qoder CLI are
  launched through a bash-based Node/nvm bootstrap. That bootstrap does not
  run under `cmd.exe`. To run full agent evals on Windows, either install
  Node.js and the agent CLIs yourself beforehand, or use WSL2.
- **`.ps1` script judges require a Windows target** — when the runtime target
  is POSIX (for example the `opensandbox` Linux sandbox), only `.sh` scripts
  are supported.
- **`cmd.exe` expands `%VAR%` inside arguments** — when no bash is discovered
  and the `cmd /d /s /c` fallback shell runs, literal `%NAME%` substrings
  inside command arguments are still expanded by cmd. There is no reliable
  command-line escape for this. Do not interpolate untrusted strings into
  shell commands. Install Git Bash (which skill-up auto-discovers) to avoid
  the cmd fallback entirely.

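Since the `%NAME%` expansion has no reliable escape, one defensive option is to detect such substrings before a command reaches the cmd fallback. The following is a minimal sketch, not something this PR implements. `firstExpandable` is a hypothetical helper, and the pattern assumes variable names made of letters, digits, and underscores, which is narrower than what cmd actually accepts.

```go
package main

import (
	"fmt"
	"regexp"
)

// percentVar matches a %NAME% reference. The character class is a
// simplifying assumption; cmd.exe accepts a wider set of name characters.
var percentVar = regexp.MustCompile(`%[A-Za-z_][A-Za-z0-9_]*%`)

// firstExpandable returns the first argument that contains a %NAME%
// substring, so a caller can refuse to pass it to the cmd fallback.
func firstExpandable(args []string) (string, bool) {
	for _, a := range args {
		if percentVar.MatchString(a) {
			return a, true
		}
	}
	return "", false
}

func main() {
	arg, found := firstExpandable([]string{"echo", "100%", "%PATH%"})
	fmt.Println(arg, found) // %PATH% true
}
```

A check like this only flags risky input; it does not make the argument safe, so rejecting the command or requiring bash remains the safer response.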
## Recommended workflow

- **Authoring and running script-judge evals** — native Windows works well.
  Prefer `.ps1` script judges, or install Git for Windows for `.sh` support.
- **Running full agent evals** — use **WSL2**, so the evaluator and the agent
  CLIs share one POSIX environment and avoid path/credential friction.