Merged
2 changes: 2 additions & 0 deletions AGENTS.md
@@ -33,6 +33,7 @@ dispatch owns one `codex app-server` subprocess (stdio JSONL, shared `~/.codex`)
- `docs/usage/` — operator docs for the CLI, MCP, triggers, and plugin setup.
- `.agents/plans/v0/` — phased plan (`PLAN.md`) + references (`REFS.md`); tracked.
- `spikes/` — App Server probe scripts; seed of the integration suite.
- `tests/fixtures/` — small named App Server, JSONL, CLI-smoke, and registry fixtures.
- `.agents/notes/` — working notes, session recaps, learnings; **gitignored, local only**.
- `skills/` — first-party Codex skills for operating dispatch (`dispatch`) and dispatch-backed direct messages (`dm`).
- `plugins/dispatch/` — workspace-local Codex plugin bundle exposing the skills and MCP server.
@@ -64,6 +65,7 @@ Use the project language consistently:
- **App Server access only via `client/`.** Never spawn or speak to `codex app-server` outside the client layer. See [client rules](.claude/rules/client.md).
- **Async core, sync CLI.** The daemon is asyncio end-to-end; the CLI is a thin sync client over the control socket. No blocking calls in the loop (use `aiosqlite`, asyncio subprocess, `run_in_executor`). See [python-conventions](.claude/rules/python-conventions.md).
- **Never touch the user's live state in tests.** Integration tests use a real ephemeral app-server with an isolated `CODEX_HOME` and `ephemeral:true` lanes.
- **Fixtures should be exercised.** Add checked-in cases under `tests/fixtures/` only when a test loads them; prefer Python builders over binary SQLite fixtures.
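
As a rough illustration of the "builders over binary SQLite fixtures" rule, a test can construct its registry in code instead of loading a checked-in `.sqlite` file. The sketch below is hypothetical: the `lanes` table, its columns, and the `build_registry` name are assumptions made for illustration, not the project's schema, and it uses the synchronous stdlib `sqlite3` module rather than `aiosqlite` to stay self-contained.

```python
import sqlite3

# Hypothetical schema, for illustration only; the real registry schema
# is defined by the project and is not shown in this diff.
SCHEMA = """
CREATE TABLE lanes (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    status TEXT NOT NULL
);
"""


def build_registry(rows: list[tuple[str, str, str]]) -> sqlite3.Connection:
    """Build an in-memory registry so no binary database is checked in."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO lanes (id, name, status) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


# Usage: each test states the rows it needs, so the data stays reviewable.
registry = build_registry([("lane-1", "demo", "idle")])
print(registry.execute("SELECT name, status FROM lanes").fetchall())
```

A builder like this keeps fixture data visible in review and lets the schema change without regenerating opaque files.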

## Source control

2 changes: 2 additions & 0 deletions docs/development/design.md
@@ -160,6 +160,7 @@ The client supports the full responder loop. v1 surfaces `waiting_on_approval` a
- Async: stdlib **asyncio** (subprocess + streams + unix socket server). DB: **aiosqlite** (hand-written SQL; no ORM). Logging: **structlog** (also feeds the audit log).
- MCP: the official Python **`mcp`** SDK (stdio transport first). Scheduling: small custom asyncio scheduler + `croniter` for cron (interval needs no lib). No `dateutil`/RRULE in v1.
- Tests: **pytest** + **pytest-asyncio**. Hooks: **lefthook** (polyglot; runs ruff/mypy/pytest). Task runner: **just** (justfile) for `test`/`lint`/`typecheck`/`run`. Daemon keep-alive: **launchd** LaunchAgent plist. CI: GitHub Actions + `astral-sh/setup-uv`.
- Fixture corpus: `tests/fixtures/` stores small named App Server payloads, Codex JSONL sync sources, CLI-smoke notes, and registry builders. Every checked-in fixture should be loaded by a test. Prefer builders over binary SQLite files.
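
The tests in this change call `load_json("app_server", "config_read", "current.json")` from `tests.fixtures`, but the helper itself is not visible in this view. A minimal sketch of what such a loader could look like follows; the `FIXTURE_ROOT` default and the `root` keyword are assumptions added so the example can run outside the repository.

```python
import json
import tempfile
from pathlib import Path
from typing import Any

# Assumed default location; the real helper may resolve its root differently.
FIXTURE_ROOT = Path("tests/fixtures")


def load_json(*parts: str, root: Path = FIXTURE_ROOT) -> Any:
    """Load a named JSON fixture addressed by its path segments."""
    with root.joinpath(*parts).open(encoding="utf-8") as handle:
        return json.load(handle)


# Usage against a throwaway fixture tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    fixture = Path(tmp) / "app_server" / "config_read" / "current.json"
    fixture.parent.mkdir(parents=True)
    fixture.write_text(json.dumps({"config": {"model": "gpt-5.5"}}), encoding="utf-8")
    payload = load_json("app_server", "config_read", "current.json", root=Path(tmp))
    print(payload["config"]["model"])  # prints: gpt-5.5
```

Addressing fixtures by path segments keeps call sites short and makes it easy to grep for which tests load a given file.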

## Data model (registry, SQLite)

@@ -185,6 +186,7 @@ The client supports the full responder loop. v1 surfaces `waiting_on_approval` a
- Promote the existing probe scripts (`/tmp/codex_{stdio,dm,lab4,fanout}.py`) into the integration suite, run against a **real ephemeral app-server with an isolated `CODEX_HOME`** (zero pollution; `ephemeral:true` lanes).
- `test_examples(registry)` runs op examples as assertions.
- Unit: message router (canned JSONL), trigger/guard evaluation, registry, error projections.
- Release smoke: `just pypi-smoke -- --package-spec outfitter-dispatch==<version>` installs the published package with `uvx`, uses a temporary `DISPATCH_HOME`, verifies daemon/model/list paths, and shuts down cleanly.

## Rough build slices (detailed by the implementation plan)

15 changes: 15 additions & 0 deletions docs/usage/README.md
@@ -41,6 +41,18 @@ dispatch daemon status
dispatch down --json
```

Maintainers can run the same release smoke from the repository against the
published package:

```bash
just pypi-smoke -- --package-spec outfitter-dispatch==0.5.0
```

The smoke installs with `uvx`, uses a temporary `DISPATCH_HOME`, verifies the
derived `models` schema, starts the daemon, reads the live App Server model
catalog, verifies the cached registry read, checks the empty first-run lane list,
and shuts the daemon down.

If `dispatch doctor` fails before the app-server smoke because the Codex CLI is
not installed or authenticated, fix that first and rerun the doctor. Use
`dispatch doctor --no-app-server` when you only need to inspect package, PATH,
@@ -141,6 +153,9 @@ release, bump `project.version` in `pyproject.toml`, run:
just check
```

After the GitHub Release publishes to PyPI, run `just pypi-smoke -- --package-spec
outfitter-dispatch==<version>` to verify the public install path.

Then create and publish a GitHub Release for the same tag, for example
`v0.1.0`. Do not upload with a long-lived PyPI token unless the trusted
publisher path is unavailable and the maintainer explicitly chooses that
4 changes: 4 additions & 0 deletions justfile
@@ -18,6 +18,10 @@ test *args:
test-int *args:
    uv run pytest -m integration {{args}}

# Smoke-test the published PyPI package from a clean temporary DISPATCH_HOME.
pypi-smoke *args:
    uv run python scripts/check_pypi_smoke.py {{args}}

# Lint with ruff.
lint:
    uv run ruff check .
1 change: 1 addition & 0 deletions pyproject.toml
@@ -52,6 +52,7 @@ exclude = ["/src/**/AGENTS.md"]
include = [
    "/src",
    "/tests",
    "/scripts",
    "/skills",
    "/plugins/dispatch",
    "/README.md",
163 changes: 163 additions & 0 deletions scripts/check_pypi_smoke.py
@@ -0,0 +1,163 @@
"""Smoke-test the published PyPI package from a clean Dispatch home.

This is intentionally not part of ``just check``: it installs from PyPI with
``uvx`` and starts a real daemon/app-server. Run it after publishing or when
validating the clean-install path tracked by GitHub issue #27.
"""

from __future__ import annotations

import argparse
import json
import os
import shutil
import subprocess
import sys
import tempfile
import tomllib
from pathlib import Path
from typing import Any


def main(argv: list[str] | None = None) -> int:
    args = _parse_args(argv)
    package_spec = args.package_spec or f"outfitter-dispatch=={_project_version()}"
    home = Path(tempfile.mkdtemp(prefix="dispatch-pypi-smoke."))
    env = os.environ.copy()
    env["DISPATCH_HOME"] = str(home)
    print(f"DISPATCH_HOME={home}")
    print(f"package={package_spec}")
    try:
        version = _dispatch(package_spec, ["--version"], env)
        _expect(version.stdout.strip().startswith("dispatch "), version.stdout)

        schema = _dispatch_json(package_spec, ["schema", "models"], env)
        _expect(schema.get("op") == "models", "schema op is not models")
        _expect(_path(schema, "input", "properties", "refresh", "type") == "boolean", schema)
        _expect(_path(schema, "output", "properties", "models", "type") == "array", schema)

        up = _dispatch_json(package_spec, ["up", "--json"], env, timeout=args.timeout)
        _expect(up.get("status") in {"started", "running"}, up)

        models = _dispatch_json(package_spec, ["models", "--json"], env, timeout=args.timeout)
        _expect(models.get("source") == "app-server", models)
        _expect(_nonempty_list(models.get("models")), models)
        configured = models.get("configured_default")
        _expect(isinstance(configured, dict), models)
        _expect(isinstance(configured.get("model"), str), models)

        cached = _dispatch_json(
            package_spec, ["models", "--no-refresh", "--json"], env, timeout=args.timeout
        )
        _expect(cached.get("source") == "registry", cached)
        _expect(_nonempty_list(cached.get("models")), cached)

        lanes = _dispatch_json(package_spec, ["list", "--json"], env)
        _expect(isinstance(lanes.get("lanes"), list), lanes)

        down = _dispatch_json(package_spec, ["down", "--json"], env)
        _expect(down.get("status") == "stopped", down)
        print("PyPI clean-install smoke passed")
        return 0
    finally:
        if not args.keep_home:
            _dispatch(package_spec, ["down", "--json"], env, check=False)
            shutil.rmtree(home, ignore_errors=True)


def _parse_args(argv: list[str] | None) -> argparse.Namespace:
    raw_args = sys.argv[1:] if argv is None else argv
    if raw_args and raw_args[0] == "--":
        raw_args = raw_args[1:]
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "--package-spec",
        help="uvx package spec to install, e.g. outfitter-dispatch==0.5.0",
    )
    parser.add_argument(
        "--timeout",
        type=float,
        default=90.0,
        help="seconds to allow each daemon/app-server command",
    )
    parser.add_argument(
        "--keep-home",
        action="store_true",
        help="keep the temporary DISPATCH_HOME for debugging",
    )
    return parser.parse_args(raw_args)


def _project_version() -> str:
    with Path("pyproject.toml").open("rb") as handle:
        project = tomllib.load(handle)["project"]
    version = project["version"]
    if not isinstance(version, str):
        raise SystemExit("pyproject.toml project.version is not a string")
    return version


def _dispatch(
    package_spec: str,
    args: list[str],
    env: dict[str, str],
    *,
    timeout: float = 90.0,
    check: bool = True,
) -> subprocess.CompletedProcess[str]:
    result = subprocess.run(
        ["uvx", "--from", package_spec, "dispatch", *args],
        env=env,
        text=True,
        capture_output=True,
        timeout=timeout,
        check=False,
    )
    if check and result.returncode != 0:
        raise SystemExit(
            f"dispatch {' '.join(args)} failed with {result.returncode}\n"
            f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
        )
    return result


def _dispatch_json(
    package_spec: str,
    args: list[str],
    env: dict[str, str],
    *,
    timeout: float = 90.0,
) -> dict[str, Any]:
    result = _dispatch(package_spec, args, env, timeout=timeout)
    try:
        parsed = json.loads(result.stdout)
    except json.JSONDecodeError as exc:
        raise SystemExit(
            f"dispatch {' '.join(args)} did not produce JSON\n"
            f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
        ) from exc
    if not isinstance(parsed, dict):
        raise SystemExit(f"dispatch {' '.join(args)} produced non-object JSON")
    return parsed


def _path(data: dict[str, Any], *parts: str) -> Any:
    value: Any = data
    for part in parts:
        if not isinstance(value, dict):
            return None
        value = value.get(part)
    return value


def _nonempty_list(value: object) -> bool:
    return isinstance(value, list) and len(value) > 0


def _expect(condition: bool, detail: object) -> None:
    if not condition:
        raise SystemExit(f"PyPI smoke assertion failed: {detail!r}")


if __name__ == "__main__":
    sys.exit(main())
36 changes: 5 additions & 31 deletions tests/client/test_client.py
@@ -9,6 +9,7 @@
 from outfitter.dispatch.client.client import AppServerClient
 from outfitter.dispatch.client.errors import AppServerError, ProtocolError, TransportError
 from outfitter.dispatch.client.events import TurnCompleted
+from tests.fixtures import load_json

 from .conftest import FakeTransport, Responder

@@ -220,45 +221,18 @@ async def test_config_read_and_model_list_parse_current_catalog_shape(
     client: tuple[AppServerClient, FakeTransport],
 ) -> None:
     c, fake = client
-    fake.auto = _result_for(
-        "config/read",
-        {
-            "config": {
-                "model": "gpt-5.5",
-                "modelProvider": "openai",
-                "serviceTier": "priority",
-                "modelReasoningEffort": "xhigh",
-            }
-        },
-    )
+    fake.auto = _result_for("config/read", load_json("app_server", "config_read", "current.json"))
     config = await c.config_read()
     assert config.model == "gpt-5.5"
     assert config.model_provider == "openai"
-    assert config.service_tier == "priority"
+    assert config.service_tier == "fast"
     assert config.model_reasoning_effort == "xhigh"
     assert fake.sent[-1] == {"id": 1, "method": "config/read", "params": {}}

-    fake.auto = _result_for(
-        "model/list",
-        {
-            "data": [
-                {
-                    "id": "gpt-5.5",
-                    "defaultReasoningEffort": "xhigh",
-                    "supportedReasoningEfforts": ["low", "xhigh"],
-                    "serviceTiers": [
-                        {
-                            "id": "priority",
-                            "name": "Fast",
-                            "description": "1.5x speed, increased usage",
-                        }
-                    ],
-                }
-            ]
-        },
-    )
+    fake.auto = _result_for("model/list", load_json("app_server", "model_list", "current.json"))
     models = await c.model_list()
     assert models[0].id == "gpt-5.5"
+    assert models[0].supported_reasoning_efforts == ["low", "medium", "high", "xhigh"]
     assert models[0].service_tiers[0].id == "priority"
     assert models[0].service_tiers[0].name == "Fast"
     assert fake.sent[-1] == {"id": 2, "method": "model/list", "params": {}}
71 changes: 31 additions & 40 deletions tests/client/test_models.py
@@ -24,6 +24,7 @@
     TurnStartParams,
     TurnSteerParams,
 )
+from tests.fixtures import load_json


 def test_thread_start_sandbox_is_string_enum() -> None:
@@ -185,52 +186,42 @@ def test_thread_info_keeps_observed_model_service_tier() -> None:


 def test_config_and_model_catalog_wire_models_accept_camel_case() -> None:
-    config = ConfigInfo.model_validate(
-        {
-            "model": "gpt-5.5",
-            "modelProvider": "openai",
-            "serviceTier": "priority",
-            "modelReasoningEffort": "xhigh",
-        }
+    config_payload = load_json("app_server", "config_read", "current.json")["config"]
+    config = ConfigInfo.model_validate(config_payload)
+    catalog = ModelListResult.model_validate(load_json("app_server", "model_list", "current.json"))
+
+    assert config.model_provider == "openai"
+    assert config.service_tier == "fast"
+    assert catalog.data[0] == AppModel(
+        id="gpt-5.5",
+        model="gpt-5.5",
+        display_name="GPT-5.5",
+        description="Frontier model for complex coding, research, and real-world work.",
+        is_default=True,
+        hidden=False,
+        default_reasoning_effort="medium",
+        supported_reasoning_efforts=["low", "medium", "high", "xhigh"],
+        service_tiers=[
+            ModelServiceTier(
+                id="priority",
+                name="Fast",
+                description="1.5x speed, increased usage",
+            )
+        ],
     )
+
+
+def test_legacy_model_catalog_fixture_keeps_speed_tier_fallback() -> None:
     catalog = ModelListResult.model_validate(
-        {
-            "data": [
-                {
-                    "id": "gpt-5.5",
-                    "displayName": "GPT-5.5",
-                    "defaultReasoningEffort": "xhigh",
-                    "supportedReasoningEfforts": [
-                        {"reasoningEffort": "low", "description": "faster"},
-                        {"reasoningEffort": "xhigh", "description": "deeper"},
-                    ],
-                    "serviceTiers": [
-                        {
-                            "id": "priority",
-                            "name": "Fast",
-                            "description": "1.5x speed, increased usage",
-                        }
-                    ],
-                    "additionalSpeedTiers": ["fast"],
-                }
-            ]
-        }
+        load_json("app_server", "model_list", "legacy_additional_speed_tiers.json")
     )

-    assert config.model_provider == "openai"
     assert catalog.data == [
         AppModel(
-            id="gpt-5.5",
-            display_name="GPT-5.5",
-            default_reasoning_effort="xhigh",
-            supported_reasoning_efforts=["low", "xhigh"],
-            service_tiers=[
-                ModelServiceTier(
-                    id="priority",
-                    name="Fast",
-                    description="1.5x speed, increased usage",
-                )
-            ],
+            id="legacy-fast-model",
+            display_name="Legacy Fast Model",
+            default_reasoning_effort="medium",
+            supported_reasoning_efforts=["low", "medium"],
             additional_speed_tiers=["fast"],
         )
     ]