A lightweight operations dashboard and CLI for SFT dataset inspection, training-log monitoring, and pass@K candidate workflows.
SFT OPD is a small Go application for operating an SFT workspace. It serves a browser UI for browsing dataset artifacts and training tasks, and it ships a CLI for status checks, log inspection, candidate generation, grading, accepted-output merging, and remaining-ID preparation.
The project is intentionally self-contained: the web UI is embedded into the Go binary, the server uses the standard library HTTP stack, and the core dashboard works without a Node.js build step.
- Dataset cataloging for `jsonl`, `json`, `csv`, `parquet`, and `txt` files under a configured data root.
- JSON and JSONL sample previews with math-friendly rendering in the browser.
- Training task pages backed by log discovery, tail output, progress counters, pass@K blocks, and task detail views.
- HTTP JSON APIs for health checks, catalog refreshes, file previews, training tasks, task events, and Asymptote rendering.
- Go CLI commands for training status, logs, specs, full-coverage launch, grading, merging accepted JSONL, and remaining source ID generation.
- Embedded static assets and templates, so a built binary can run the UI directly.
- Go 1.22 or newer.
- A local or remote SFT workspace with dataset files and training logs.
- Optional: Python and the open-r1 repository when using grading commands that call the math verification filter.
git clone https://github.com/Piping/sftopd.git
cd sftopd
go run ./cmd/sftopd \
--addr 127.0.0.1:6060 \
--data-root /data00/open-r1/data \
--log-root /data00/open-r1/logs \
--work-dir /data00/sftopd

Open http://127.0.0.1:6060.
For a remote host, forward the port first:
ssh -N -L 6060:127.0.0.1:6060 l20

go build -o sftopd ./cmd/sftopd
./sftopd --addr 127.0.0.1:6060

Server flags can also be supplied through environment variables.
| Flag | Environment | Default | Description |
|---|---|---|---|
| `--addr` | `SFTOPD_ADDR` | `127.0.0.1:6060` | HTTP listen address. |
| `--data-root` | `SFTOPD_DATA_ROOT` | `/data00/open-r1/data` | Directory scanned for dataset artifacts. |
| `--log-root` | `SFTOPD_LOG_ROOT` | `/data00/open-r1/logs` | Directory scanned for training logs. |
| `--work-dir` | `SFTOPD_WORK_DIR` | `/data00/sftopd` | Runtime working directory for generated files. |
sftopd training status --json
sftopd training logs --tail 120
sftopd training specs
sftopd training launch-full-coverage --dry-run
sftopd training grade --work-dir /data00/sftopd/run-001
sftopd training merge-accepted --inputs a.jsonl,b.jsonl --output accepted.jsonl
sftopd training remaining-ids --source-ids source_ids.txt --accepted-jsonl accepted.jsonl --output remaining.txt

`sftopd training specs` prints which legacy Python scripts are covered by the Go CLI and which workflows are still pending.
| Method | Path | Purpose |
|---|---|---|
| `GET` | `/api/health` | Runtime health and effective configuration. |
| `GET` | `/api/catalog` | Current catalog snapshot and aggregate stats. |
| `POST` | `/api/catalog/refresh` | Rescan the data root. |
| `GET` | `/api/files/{id}/preview` | Preview JSON or JSONL samples. |
| `GET` | `/api/training/tasks` | List discovered training tasks. |
| `GET` | `/api/training/tasks/{id}` | Read one task detail view. |
| `DELETE` | `/api/training/tasks/{id}` | Delete an inactive task artifact. |
| `GET` | `/api/training/tasks/{id}/events` | Stream task events. |
| `GET` | `/api/training/specs` | List implemented and pending training workflow specs. |
| `POST` | `/api/render/asy` | Render supported Asymptote snippets. |
cmd/sftopd/ CLI entrypoint and command flags
internal/app/ HTTP server, routes, templates, embedded web assets
internal/app/web/static/ Browser JavaScript, CSS, and vendored KaTeX assets
internal/data/ Dataset cataloging and preview logic
internal/training/ Training task discovery, grading, generation, merge helpers
docs/ Design notes, demos, and README screenshots
go test ./...
mkdir -p /tmp/sftopd-demo/data /tmp/sftopd-demo/logs /tmp/sftopd-demo/work
printf '%s\n' '{"source_id":"demo-001","problem":"Find x if x+7=12.","answer":"5"}' \
> /tmp/sftopd-demo/data/demo.jsonl
go run ./cmd/sftopd \
--data-root /tmp/sftopd-demo/data \
--log-root /tmp/sftopd-demo/logs \
--work-dir /tmp/sftopd-demo/work

See CONTRIBUTING.md for the contributor workflow, screenshot refresh steps, and review checklist.
SFT OPD is an operational tool for an active SFT workflow. The core dashboard, data browser, training monitor, and several training CLI replacements are implemented. Dataset preparation and some static repair or analysis flows are still tracked as pending by sftopd training specs.
SFT OPD is licensed under the Apache License 2.0.
