feat(dev): align local dev stack with AI Gateway v0.5 + key-manager UI dev mode #115
dcmcand wants to merge 4 commits
…ager UI dev mode
Bump the dev/Makefile dependency stack to versions compatible with the bundled Envoy AI Gateway v0.5.0 (Envoy Gateway v1.6.7, Gateway API v1.4.0) and wire the AI Gateway ext_proc extension into Envoy Gateway at install time. On the previous versions (EG v1.3.0) a PassthroughModel reconciled to Ready but its upstream TLS was never programmed, so provider inference returned 503.
Extend the dev manifests with the PassthroughModel RBAC, validating webhook, and shared-TLS issuance via the local self-signed ClusterIssuer, plus Makefile targets and an example model for the OpenRouter passthrough.
Add an off-by-default dev mode to the key-manager: LLM_DEV_MODE bypasses auth and injects a fixed identity so the UI runs on a local cluster with no Keycloak. Exposed via keyManager.devMode in the Helm chart and enabled in the dev manifest, with a `make ui` port-forward target.
Refs #113, #114
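For readers who want a picture of the dev-mode bypass, here is a minimal stdlib-only sketch, not the key-manager's actual middleware. `LLM_DEV_MODE`, `LLM_DEV_USER` and `LLM_DEV_GROUPS` come from this PR; the `Identity` type, the `"true"` switch value, the `dev` fallback user, the comma-separated group format, and the always-401 production placeholder are assumptions made for illustration.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Identity is a hypothetical stand-in for the per-request identity.
type Identity struct {
	User   string
	Groups []string
}

type identityKey struct{}

// devIdentity reads the dev-mode settings. Assumption: "true" enables the
// mode and groups are comma-separated; the real parsing may differ.
func devIdentity(getenv func(string) string) (Identity, bool) {
	if getenv("LLM_DEV_MODE") != "true" {
		return Identity{}, false
	}
	user := getenv("LLM_DEV_USER")
	if user == "" {
		user = "dev" // assumed fallback
	}
	var groups []string
	for _, g := range strings.Split(getenv("LLM_DEV_GROUPS"), ",") {
		if g = strings.TrimSpace(g); g != "" {
			groups = append(groups, g)
		}
	}
	return Identity{User: user, Groups: groups}, true
}

// AuthMiddleware injects the fixed identity when dev mode is on. The
// production branch is only a placeholder that rejects every request;
// real JWT validation is out of scope for this sketch.
func AuthMiddleware(getenv func(string) string, next http.Handler) http.Handler {
	id, devMode := devIdentity(getenv)
	if devMode {
		log.Printf("WARNING: LLM_DEV_MODE is on, authentication is bypassed (user=%q)", id.User)
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !devMode {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		ctx := context.WithValue(r.Context(), identityKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// serveOnce sends one GET /api/me through the middleware and reports the
// status code plus the user the downstream handler saw.
func serveOnce(env map[string]string) (int, string) {
	var seen string
	next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if id, ok := r.Context().Value(identityKey{}).(Identity); ok {
			seen = id.User
		}
		w.WriteHeader(http.StatusOK)
	})
	h := AuthMiddleware(func(k string) string { return env[k] }, next)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/me", nil))
	return rec.Code, seen
}

func main() {
	code, user := serveOnce(map[string]string{"LLM_DEV_MODE": "true", "LLM_DEV_USER": "dev"})
	fmt.Println(code, user)
	code, _ = serveOnce(map[string]string{})
	fmt.Println(code)
}
```

Reading the environment through an injected `getenv` function keeps the sketch testable without mutating process state; a real service would likely pass `os.Getenv`.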
…pstream
Gateway API v1.4.0 (required by the bundled Envoy AI Gateway v0.5) graduates
BackendTLSPolicy to v1 and no longer serves v1alpha3, so the operator's
hardcoded v1alpha3 failed to apply on a version-aligned stack ("no matches
for kind BackendTLSPolicy in gateway.networking.k8s.io/v1alpha3") and the
passthrough upstream never got a TLS transport socket. Emit v1, which is the
same spec shape.
Refs #113
The key-manager watches PassthroughModels as well as LLMModels, but the dev
manifest's llm-key-manager-models ClusterRole only granted llmmodels, so model
sync failed ("cannot list passthroughmodels") and passthrough models never
appeared in the UI. Matches the chart's key-manager role.
Refs #114
… reload
Frontend devs working on the key-manager UI now need only an OpenRouter key in dev/.env and `make run-dev`. The target idempotently brings up the kind cluster, operator, dev-mode key-manager, and three OpenRouter passthrough models, then port-forwards the key-manager and starts a hot-reloading UI dev server.
- dev/uidev: a zero-dependency (stdlib-only) Go dev server that serves the UI static files from disk, proxies /api/* to the port-forwarded key-manager, and live-reloads the browser on file edits. The UI is plain static files, so no build step or npm is involved.
- dev/run-dev.sh + `make run-dev`: orchestrates cluster/deploy/models/port-forward/UI server, loading OPENROUTER_API_KEY from a gitignored dev/.env.
- dev/manifests/dev-models.yaml: three passthrough models so the UI list is populated.
- docs/ui-development.md: frontend-dev guide (setup, editing, dev-mode auth, shipping changes, API table, troubleshooting), linked from getting-started.
Refs #114
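The core of such a dev server can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in dev/uidev: the `newDevHandler` and `fetch` names are hypothetical, live reload is left out, and `main` exercises the handler against a throwaway backend instead of listening on a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"os"
	"path/filepath"
)

// newDevHandler serves files from staticDir and forwards /api/* to
// apiTarget (for example the port-forwarded key-manager).
func newDevHandler(staticDir, apiTarget string) (http.Handler, error) {
	target, err := url.Parse(apiTarget)
	if err != nil {
		return nil, err
	}
	mux := http.NewServeMux()
	mux.Handle("/api/", httputil.NewSingleHostReverseProxy(target))
	mux.Handle("/", http.FileServer(http.Dir(staticDir)))
	return mux, nil
}

// fetch builds the handler over a temporary static dir and a fake API
// backend, then requests path and returns the status code and body.
func fetch(path string) (int, string) {
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "backend saw %s", r.URL.Path)
	}))
	defer backend.Close()

	dir, err := os.MkdirTemp("", "uidev-sketch")
	if err != nil {
		return 0, err.Error()
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "app.js"), []byte("console.log('ui')"), 0o644); err != nil {
		return 0, err.Error()
	}

	h, err := newDevHandler(dir, backend.URL)
	if err != nil {
		return 0, err.Error()
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	// A real dev server would instead call http.ListenAndServe(addr, handler).
	fmt.Println(fetch("/app.js"))
	fmt.Println(fetch("/api/me"))
}
```

Because files are read from disk on every request, edits show up on the next browser refresh even without a live-reload channel.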
jbouder
left a comment
Reviewed by actually running make run-dev on a fresh kind cluster (Helm 4). Two blockers stopped the bring-up (Gateway API CRD conflict; operator webhook startup race), plus a few smaller things. Details inline.
    # Envoy AI Gateway
    helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm --version v0.5.0 -n envoy-ai-gateway-system --create-namespace
    helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm --version v0.5.0 -n envoy-ai-gateway-system
    kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/$(GATEWAY_API_VERSION)/standard-install.yaml
🔴 Blocker on Helm 4: Gateway API CRD ownership conflict.
On Helm 4 (server-side apply is the default), the eg install at L39 aborts because the chart re-applies these same Gateway API CRDs via SSA and collides with this client-side kubectl apply:
    Error: failed to install CRD crds/gatewayapi-crds.yaml: conflict occurred while applying object
    ... conflicts with "kubectl-client-side-apply"
This blocks make setup / make run-dev on a fresh cluster for anyone on Helm 4. Both install the same pinned version, so taking ownership is safe. Switch this to server-side apply:
    - kubectl apply -f https://.../$(GATEWAY_API_VERSION)/standard-install.yaml
    + kubectl apply --server-side --force-conflicts -f https://.../$(GATEWAY_API_VERSION)/standard-install.yaml
and add --force-conflicts to the eg helm install (see comment on L39).
    # Envoy Gateway, wired with the AI Gateway ext_proc extension (enableBackend,
    # extensionManager, backendResources). Without this the per-model routing
    # layer 404s and passthrough upstreams never get a TLS transport socket.
    helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace -f eg-extension-values.yaml
Part of the Helm 4 CRD fix (see L27): add --force-conflicts so the chart's bundled Gateway API CRDs cleanly take ownership instead of erroring.
    - helm upgrade -i eg oci://.../gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace -f eg-extension-values.yaml
    + helm upgrade -i eg oci://.../gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace --force-conflicts -f eg-extension-values.yaml

    	./run-dev.sh

    setup: ## Create kind cluster and install dependencies
    	kind create cluster --name $(CLUSTER_NAME)
🟡 make setup isn't resumable after a partial failure. kind create cluster errors hard (node(s) already exist for a cluster with the name ...) when the cluster exists, aborting the whole target. Since run-dev.sh only calls make setup when the cluster is absent, a setup that dies midway (e.g. the CRD conflict above) can't be recovered with make setup or make run-dev; you have to make teardown first. Guarding the create makes it idempotent:
    @kind get clusters | grep -qx $(CLUSTER_NAME) || kind create cluster --name $(CLUSTER_NAME)

    kubectl -n "$NS" create secret generic openrouter-api-key \
      --from-literal=apiKey="$OPENROUTER_API_KEY" \
      --dry-run=client -o yaml | kubectl apply -f - >/dev/null
    kubectl apply -f manifests/dev-models.yaml >/dev/null
🔴 Webhook startup race: this apply fails intermittently and set -euo pipefail kills the run. The operator's validating webhook gates PassthroughModel creates, but it isn't serving yet right after make deploy:
    Error from server (InternalError): ... failed calling webhook "vpassthroughmodel-v1alpha1.kb.io":
    Post "https://llm-operator-webhook-service.../validate-...": dial tcp ...:443: connect: connection refused
Root cause is in operator.yaml (no readiness probe; see that comment). A bounded retry here makes it reliable regardless:
    - kubectl apply -f manifests/dev-models.yaml >/dev/null
    + # The operator's validating webhook gates PassthroughModel creates, and isn't
    + # serving the instant `make deploy`'s rollout returns. Retry until it accepts.
    + for attempt in $(seq 1 30); do
    +   kubectl apply -f manifests/dev-models.yaml >/dev/null 2>&1 && break
    +   if [[ $attempt -eq 30 ]]; then
    +     echo "ERROR: operator webhook never became ready" >&2
    +     kubectl apply -f manifests/dev-models.yaml >&2 || true
    +     exit 1
    +   fi
    +   echo "==> operator webhook not ready yet, retrying ($attempt)..."
    +   sleep 2
    + done

    # Foreground: exits on Ctrl-C, which triggers cleanup of the port-forward.
    ( cd uidev && go run . \
        -static ../../key-manager/internal/ui/static \
🟢 Nit: /tmp/km-portforward.log is a fixed path. A stale file from a crashed prior run can make the grep -q "Forwarding from" readiness check below pass instantly against old output. mktemp would avoid that.
      value: "https://keycloak.local/realms/nebari"
    - name: LLM_OIDC_GROUPS_CLAIM
      value: "groups"
    - name: ENABLE_WEBHOOKS
🟡 Root cause of the run-dev.sh webhook race lives here. Webhooks are enabled, and the container exposes containerPort: 9443, but the Deployment declares no readinessProbe, so kubectl rollout status in make deploy returns the moment the process launches, before the webhook binds its port (it waits on the cert-manager cert mount). Anything that applies a webhook-gated CR right after make deploy races the webhook.
Gating readiness on the webhook port makes rollout status mean "webhook ready" for every consumer of make deploy, not just run-dev.sh:
    readinessProbe:
      tcpSocket: { port: 9443 }
      initialDelaySeconds: 2
      periodSeconds: 2
(A controller-runtime readyz check wired to the webhook server is even better.)
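To make the readiness idea concrete, here is a small stdlib-only sketch of a check that passes only once a TCP port accepts connections, which is roughly what the `tcpSocket` probe asks of the kubelet. The `portChecker` and `demo` names are hypothetical. The returned function has the `func(*http.Request) error` shape that, as an assumption about controller-runtime, matches its `healthz.Checker` type; how the operator would actually register it is not shown.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// portChecker returns a readiness check that succeeds only when addr
// accepts a TCP connection within timeout.
func portChecker(addr string, timeout time.Duration) func(*http.Request) error {
	return func(_ *http.Request) error {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return fmt.Errorf("webhook port not accepting connections: %w", err)
		}
		return conn.Close()
	}
}

// demo opens a listener on a free local port, runs the check, closes the
// listener, and runs the check again. It reports (ready while listening,
// not ready after close).
func demo() (bool, bool) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, false
	}
	check := portChecker(ln.Addr().String(), time.Second)
	readyWhileUp := check(nil) == nil
	ln.Close()
	notReadyAfterClose := check(nil) != nil
	return readyWhileUp, notReadyAfterClose
}

func main() {
	fmt.Println(demo())
}
```

A plain TCP dial only proves the port is bound; it does not prove the TLS certificate is loaded, so a check that completes a TLS handshake would be a stricter variant.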

What and why
Makes the local `kind` dev path actually run an external-provider `PassthroughModel` end to end, and lets the key-manager UI run without Keycloak.
Closes #113
Closes #114

#113 - dev stack version mismatch
`dev/Makefile` installed Envoy Gateway v1.3.0 and Gateway API v1.2.1 alongside Envoy AI Gateway v0.5.0, which needs Envoy Gateway v1.6.x and Gateway API v1.4.0 (compatibility matrix). On v1.3.0 a `PassthroughModel` reconciled to `Ready` but the operator's `BackendTLSPolicy` was never translated into an upstream TLS socket, so Envoy dialed the provider in plaintext and inference returned `503 UC`.
- … `dev/Makefile` and move them as a set; bump to Envoy Gateway v1.6.7 and Gateway API v1.4.0.
- … `ext_proc` wiring (`dev/eg-extension-values.yaml`) and bring the AI Gateway up first so the extension server exists.
- … `PassthroughModel` CRD in `make setup`.
- … `dev/manifests/` so the operator can serve passthrough: RBAC for `passthroughmodels`, `aiservicebackends`/`backendsecuritypolicies`/`backends`/`backendtlspolicies`, and the shared-TLS reconciler (`certificates`, `gateways`); the passthrough validating webhook; and shared-TLS issuance through the existing local `selfsigned-issuer` (`LLM_CLUSTER_ISSUER_NAME`), so no ACME or hand-made cert is needed.
- `make` targets: `create-openrouter-secret`, `apply-passthrough-model`, `ui`.
- Emit the `PassthroughModel` `BackendTLSPolicy` as `gateway.networking.k8s.io/v1` instead of `v1alpha3`. Gateway API v1.4.0 graduates the policy to `v1` and no longer serves `v1alpha3`, so on the version-aligned stack the old apiVersion failed to apply and the upstream never got a TLS socket. This raises the effective floor for passthrough to Gateway API v1.4 / Envoy Gateway v1.6, which is what AI Gateway v0.5 already requires.

#114 - key-manager UI dev mode
The UI could not run locally: the gateway enforces OIDC before forwarding, and the key-manager 401s without a JWT.
- `LLM_DEV_MODE` (off by default) makes the auth middleware skip token handling and inject a fixed identity (`LLM_DEV_USER`, `LLM_DEV_GROUPS`). The production path is unchanged when unset, and a warning is logged when it is on.
- Exposed via `keyManager.devMode` in the Helm chart (default disabled) and enabled in the dev manifest.
- `make ui` port-forwards the Service so the gateway OIDC layer is bypassed too.

One-command UI dev environment (`make run-dev`)
For frontend work on the key-manager UI, `make run-dev` is the whole setup: drop an OpenRouter key in `dev/.env` and run it. It idempotently brings up the cluster, operator, dev-mode key-manager, and three passthrough models, then port-forwards the key-manager and starts a hot-reloading UI dev server.
- `dev/uidev/` is a zero-dependency (stdlib-only) Go dev server: it serves the UI static files from disk, proxies `/api/*` to the port-forwarded key-manager, and live-reloads the browser on edits. The UI is plain static files, so there is no build step or npm.
- `dev/manifests/dev-models.yaml` gives the UI a populated three-model list.
- `docs/ui-development.md` documents the workflow; `dev/.env` is gitignored.

Verification
- key-manager: `go vet` clean, full `go test ./...` passes, including a new table-driven `TestAuthMiddlewareDevMode`.
- … `devMode.enabled=true` and omits it by default.
- `go build`, `go vet`, and the `Passthrough` reconciler tests pass with the `v1` assertion.
- `make teardown && make setup && make build-images && make load-images && make deploy && make create-openrouter-secret && make apply-passthrough-model` brings the `PassthroughModel` to `Ready`, the operator patches the `llm-https` listener, and a chat completion through the gateway returns a real OpenRouter response (`200`). The key-manager logs the dev-mode warning and `GET /api/me` returns the injected `dev` identity (`200`, not `401`).

Notes
- … `ClusterIssuer/selfsigned-issuer` it does not create; the dev path supplies one via `cert-manager-config.yaml`. Whether the chart should ship a dev issuer is out of scope here.
- … (`llm-internal.<domain>`) still requires a real Keycloak JWT even when access is public, so only the external endpoint is reachable on kind.