feat(dev): align local dev stack with AI Gateway v0.5 + key-manager UI dev mode #115

Open: dcmcand wants to merge 4 commits into main from feat/local-dev-passthrough-and-ui-devmode

dcmcand (Contributor) commented Jun 26, 2026

What and why

Makes the local kind dev path actually run an external-provider PassthroughModel end to end, and lets the key-manager UI run without Keycloak.

Closes #113
Closes #114

#113 - dev stack version mismatch

dev/Makefile installed Envoy Gateway v1.3.0 and Gateway API v1.2.1 alongside Envoy AI Gateway v0.5.0, which needs Envoy Gateway v1.6.x and Gateway API v1.4.0 (compatibility matrix). On v1.3.0 a PassthroughModel reconciled to Ready but the operator's BackendTLSPolicy was never translated into an upstream TLS socket, so Envoy dialed the provider in plaintext and inference returned 503 UC.

  • Pin the dependency versions at the top of dev/Makefile and move them as a set; bump to Envoy Gateway v1.6.7 and Gateway API v1.4.0.
  • Install Envoy Gateway with the AI Gateway ext_proc wiring (dev/eg-extension-values.yaml) and bring the AI Gateway up first so the extension server exists.
  • Apply the PassthroughModel CRD in make setup.
  • Extend dev/manifests/ so the operator can serve passthrough: RBAC for passthroughmodels, aiservicebackends/backendsecuritypolicies/backends/backendtlspolicies, and the shared-TLS reconciler (certificates, gateways); the passthrough validating webhook; and shared-TLS issuance through the existing local selfsigned-issuer (LLM_CLUSTER_ISSUER_NAME), so no ACME or hand-made cert is needed.
  • New make targets: create-openrouter-secret, apply-passthrough-model, ui.
  • Operator: emit the PassthroughModel BackendTLSPolicy as gateway.networking.k8s.io/v1 instead of v1alpha3. Gateway API v1.4.0 graduates the policy to v1 and no longer serves v1alpha3, so on the version-aligned stack the old apiVersion failed to apply and the upstream never got a TLS socket. This raises the effective floor for passthrough to Gateway API v1.4 / Envoy Gateway v1.6, which is what AI Gateway v0.5 already requires.

#114 - key-manager UI dev mode

The UI could not run locally: the gateway enforces OIDC before forwarding, and the key-manager 401s without a JWT.

  • LLM_DEV_MODE (off by default) makes the auth middleware skip token handling and inject a fixed identity (LLM_DEV_USER, LLM_DEV_GROUPS). The production path is unchanged when unset, and a warning is logged when it is on.
  • Exposed as keyManager.devMode in the Helm chart (default disabled) and enabled in the dev manifest. make ui port-forwards the Service so the gateway OIDC layer is bypassed too.
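
The dev-mode behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the key-manager's actual middleware: the `Identity` type, the `devIdentity` and `withAuth` names, the `verify` callback, the `"true"` value for `LLM_DEV_MODE`, and the comma-separated format of `LLM_DEV_GROUPS` are all assumptions. Only the three environment variable names come from this PR.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"strings"
)

// Identity is a hypothetical stand-in for whatever the key-manager
// attaches to an authenticated request.
type Identity struct {
	User   string
	Groups []string
}

type ctxKey struct{}

// devIdentity returns the fixed identity when LLM_DEV_MODE is on.
// getenv is injected so the logic can be tested without touching the
// process environment. Assumption: "true" enables dev mode and
// LLM_DEV_GROUPS is comma-separated; the real parsing may differ.
func devIdentity(getenv func(string) string) (Identity, bool) {
	if getenv("LLM_DEV_MODE") != "true" {
		return Identity{}, false
	}
	id := Identity{User: getenv("LLM_DEV_USER")}
	for _, g := range strings.Split(getenv("LLM_DEV_GROUPS"), ",") {
		if g = strings.TrimSpace(g); g != "" {
			id.Groups = append(id.Groups, g)
		}
	}
	return id, true
}

// withAuth wraps next. With dev mode off it calls verify (the normal
// token path, passed in here as a hypothetical callback) and returns 401
// on failure. With dev mode on it skips verify and injects the fixed
// identity, logging a warning once at construction time.
func withAuth(getenv func(string) string, verify func(*http.Request) (Identity, error), next http.Handler) http.Handler {
	dev, devOn := devIdentity(getenv)
	if devOn {
		log.Printf("WARNING: LLM_DEV_MODE is on; auth is bypassed, all requests run as %q", dev.User)
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := dev
		if !devOn {
			var err error
			if id, err = verify(r); err != nil {
				http.Error(w, "unauthorized", http.StatusUnauthorized)
				return
			}
		}
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

func main() {
	// Print what the current environment would resolve to.
	id, on := devIdentity(os.Getenv)
	log.Printf("dev mode on=%v identity=%+v", on, id)
}
```

Reading the environment once, when the middleware is built, keeps the per-request path free of environment lookups and makes the unset case take exactly the same branch as production.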

One-command UI dev environment (make run-dev)

For frontend work on the key-manager UI, make run-dev is the whole setup: drop an OpenRouter key in dev/.env and run it. It idempotently brings up the cluster, operator, dev-mode key-manager, and three passthrough models, then port-forwards the key-manager and starts a hot-reloading UI dev server.

  • dev/uidev/ is a zero-dependency (stdlib-only) Go dev server: it serves the UI static files from disk, proxies /api/* to the port-forwarded key-manager, and live-reloads the browser on edits. The UI is plain static files, so there is no build step or npm.
  • dev/manifests/dev-models.yaml gives the UI a populated three-model list.
  • docs/ui-development.md documents the workflow; dev/.env is gitignored.
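
For readers unfamiliar with the pattern, a stdlib-only dev server of the kind described above might look something like this. It is a sketch under assumptions, not the contents of dev/uidev: the `-api` and `-addr` flags, their default addresses, and the helper names are hypothetical (only `-static` appears in run-dev.sh), and the live-reload part is left out to keep the example short.

```go
package main

import (
	"flag"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// isAPIPath reports whether a request path belongs to the key-manager API
// rather than to the static UI.
func isAPIPath(p string) bool {
	return p == "/api" || strings.HasPrefix(p, "/api/")
}

// newDevHandler serves static files from staticDir and forwards /api/*
// to upstream (the port-forwarded key-manager).
func newDevHandler(staticDir string, upstream *url.URL) http.Handler {
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	files := http.FileServer(http.Dir(staticDir))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if isAPIPath(r.URL.Path) {
			proxy.ServeHTTP(w, r)
			return
		}
		// Ask the browser not to cache, so edits on disk show up on refresh.
		w.Header().Set("Cache-Control", "no-store")
		files.ServeHTTP(w, r)
	})
}

func main() {
	static := flag.String("static", "../../key-manager/internal/ui/static", "UI static directory")
	// The two flags below and their defaults are assumptions for this sketch.
	api := flag.String("api", "http://127.0.0.1:8080", "port-forwarded key-manager address")
	addr := flag.String("addr", "127.0.0.1:5173", "listen address")
	flag.Parse()

	upstream, err := url.Parse(*api)
	if err != nil {
		log.Fatalf("invalid -api URL: %v", err)
	}
	log.Printf("serving %s on http://%s, proxying /api/* to %s", *static, *addr, upstream)
	log.Fatal(http.ListenAndServe(*addr, newDevHandler(*static, upstream)))
}
```

Because the files are read from disk on every request, there is nothing to rebuild between edits; a live-reload channel would only add the automatic browser refresh on top of this.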

Verification

  • key-manager: go vet clean, full go test ./... passes, including a new table-driven TestAuthMiddlewareDevMode.
  • Chart lints; renders the dev-mode env when devMode.enabled=true and omits it by default.
  • Operator: go build, go vet, and the Passthrough reconciler tests pass with the v1 assertion.
  • Clean end-to-end on a fresh cluster: make teardown && make setup && make build-images && make load-images && make deploy && make create-openrouter-secret && make apply-passthrough-model brings the PassthroughModel to Ready, the operator patches the llm-https listener, and a chat completion through the gateway returns a real OpenRouter response (200). The key-manager logs the dev-mode warning and GET /api/me returns the injected dev identity (200, not 401).

Notes

  • The chart references a ClusterIssuer/selfsigned-issuer it does not create; the dev path supplies one via cert-manager-config.yaml. Whether the chart should ship a dev issuer is out of scope here.
  • The internal endpoint (llm-internal.<domain>) still requires a real Keycloak JWT even when access is public, so only the external endpoint is reachable on kind.

…ager UI dev mode

Bump the dev/Makefile dependency stack to versions compatible with the
bundled Envoy AI Gateway v0.5.0 (Envoy Gateway v1.6.7, Gateway API v1.4.0)
and wire the AI Gateway ext_proc extension into Envoy Gateway at install
time. On the previous versions (EG v1.3.0) a PassthroughModel reconciled to
Ready but its upstream TLS was never programmed, so provider inference
returned 503. Extend the dev manifests with the PassthroughModel RBAC,
validating webhook, and shared-TLS issuance via the local self-signed
ClusterIssuer, plus Makefile targets and an example model for the OpenRouter
passthrough.

Add an off-by-default dev mode to the key-manager: LLM_DEV_MODE bypasses auth
and injects a fixed identity so the UI runs on a local cluster with no
Keycloak. Exposed via keyManager.devMode in the Helm chart and enabled in the
dev manifest, with a `make ui` port-forward target.

Refs #113, #114
dcmcand added 3 commits June 26, 2026 10:46
…pstream

Gateway API v1.4.0 (required by the bundled Envoy AI Gateway v0.5) graduates
BackendTLSPolicy to v1 and no longer serves v1alpha3, so the operator's
hardcoded v1alpha3 failed to apply on a version-aligned stack ("no matches
for kind BackendTLSPolicy in gateway.networking.k8s.io/v1alpha3") and the
passthrough upstream never got a TLS transport socket. Emit v1, which is the
same spec shape.

Refs #113

The key-manager watches PassthroughModels as well as LLMModels, but the dev
manifest's llm-key-manager-models ClusterRole only granted llmmodels, so model
sync failed ("cannot list passthroughmodels") and passthrough models never
appeared in the UI. Matches the chart's key-manager role.

Refs #114
… reload

Frontend devs working on the key-manager UI now need only an OpenRouter key in
dev/.env and `make run-dev`. The target idempotently brings up the kind cluster,
operator, dev-mode key-manager, and three OpenRouter passthrough models, then
port-forwards the key-manager and starts a hot-reloading UI dev server.

- dev/uidev: a zero-dependency (stdlib-only) Go dev server that serves the UI
  static files from disk, proxies /api/* to the port-forwarded key-manager, and
  live-reloads the browser on file edits. The UI is plain static files, so no
  build step or npm is involved.
- dev/run-dev.sh + `make run-dev`: orchestrates cluster/deploy/models/port-forward
  /UI server, loading OPENROUTER_API_KEY from a gitignored dev/.env.
- dev/manifests/dev-models.yaml: three passthrough models so the UI list is
  populated.
- docs/ui-development.md: frontend-dev guide (setup, editing, dev-mode auth,
  shipping changes, API table, troubleshooting), linked from getting-started.

Refs #114

jbouder (Contributor) left a comment

Reviewed by actually running make run-dev on a fresh kind cluster (Helm 4). Two blockers stopped the bring-up (Gateway API CRD conflict; operator webhook startup race), plus a few smaller things. Details inline.

Comment thread dev/Makefile
# Envoy AI Gateway
helm upgrade -i aieg-crd oci://docker.io/envoyproxy/ai-gateway-crds-helm --version v0.5.0 -n envoy-ai-gateway-system --create-namespace
helm upgrade -i aieg oci://docker.io/envoyproxy/ai-gateway-helm --version v0.5.0 -n envoy-ai-gateway-system
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/$(GATEWAY_API_VERSION)/standard-install.yaml

🔴 Blocker on Helm 4: Gateway API CRD ownership conflict.

On Helm 4 (server-side apply is the default), the eg install at L39 aborts because the chart re-applies these same Gateway API CRDs via SSA and collides with this client-side kubectl apply:

Error: failed to install CRD crds/gatewayapi-crds.yaml: conflict occurred while applying object
... conflicts with "kubectl-client-side-apply"

This blocks make setup / make run-dev on a fresh cluster for anyone on Helm 4. Both install the same pinned version, so taking ownership is safe — switch this to server-side apply:

-	kubectl apply -f https://.../$(GATEWAY_API_VERSION)/standard-install.yaml
+	kubectl apply --server-side --force-conflicts -f https://.../$(GATEWAY_API_VERSION)/standard-install.yaml

and add --force-conflicts to the eg helm install (see comment on L39).

Comment thread dev/Makefile
# Envoy Gateway, wired with the AI Gateway ext_proc extension (enableBackend,
# extensionManager, backendResources). Without this the per-model routing
# layer 404s and passthrough upstreams never get a TLS transport socket.
helm upgrade -i eg oci://docker.io/envoyproxy/gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace -f eg-extension-values.yaml

Part of the Helm 4 CRD fix (see L27): add --force-conflicts so the chart's bundled Gateway API CRDs cleanly take ownership instead of erroring.

-	helm upgrade -i eg oci://.../gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace -f eg-extension-values.yaml
+	helm upgrade -i eg oci://.../gateway-helm --version $(ENVOY_GATEWAY_VERSION) -n envoy-gateway-system --create-namespace --force-conflicts -f eg-extension-values.yaml

Comment thread dev/Makefile
./run-dev.sh

setup: ## Create kind cluster and install dependencies
kind create cluster --name $(CLUSTER_NAME)

🟡 make setup isn't resumable after a partial failure. kind create cluster errors hard (node(s) already exist for a cluster with the name ...) when the cluster exists, aborting the whole target. Since run-dev.sh only calls make setup when the cluster is absent, a setup that dies midway (e.g. the CRD conflict above) can't be recovered with make setup or make run-dev — you have to make teardown first. Guarding the create makes it idempotent:

@kind get clusters | grep -qx $(CLUSTER_NAME) || kind create cluster --name $(CLUSTER_NAME)

Comment thread dev/run-dev.sh
kubectl -n "$NS" create secret generic openrouter-api-key \
--from-literal=apiKey="$OPENROUTER_API_KEY" \
--dry-run=client -o yaml | kubectl apply -f - >/dev/null
kubectl apply -f manifests/dev-models.yaml >/dev/null

🔴 Webhook startup race — this apply fails intermittently and set -euo pipefail kills the run. The operator's validating webhook gates PassthroughModel creates, but it isn't serving yet right after make deploy:

Error from server (InternalError): ... failed calling webhook "vpassthroughmodel-v1alpha1.kb.io":
Post "https://llm-operator-webhook-service.../validate-...": dial tcp ...:443: connect: connection refused

Root cause is in operator.yaml (no readiness probe — see that comment); a bounded retry here makes it reliable regardless:

Suggested change
-kubectl apply -f manifests/dev-models.yaml >/dev/null
+# The operator's validating webhook gates PassthroughModel creates, and isn't
+# serving the instant `make deploy`'s rollout returns. Retry until it accepts.
+for attempt in $(seq 1 30); do
+  kubectl apply -f manifests/dev-models.yaml >/dev/null 2>&1 && break
+  if [[ $attempt -eq 30 ]]; then
+    echo "ERROR: operator webhook never became ready" >&2
+    kubectl apply -f manifests/dev-models.yaml >&2 || true
+    exit 1
+  fi
+  echo "==> operator webhook not ready yet, retrying ($attempt)..."
+  sleep 2
+done

Comment thread dev/run-dev.sh

# Foreground: exits on Ctrl-C, which triggers cleanup of the port-forward.
( cd uidev && go run . \
-static ../../key-manager/internal/ui/static \

🟢 Nit: /tmp/km-portforward.log is a fixed path. A stale file from a crashed prior run can make the grep -q "Forwarding from" readiness check below pass instantly against old output. mktemp would avoid that.

  value: "https://keycloak.local/realms/nebari"
- name: LLM_OIDC_GROUPS_CLAIM
  value: "groups"
- name: ENABLE_WEBHOOKS

🟡 Root cause of the run-dev.sh webhook race lives here. Webhooks are enabled, and the container exposes containerPort: 9443, but the Deployment declares no readinessProbe — so kubectl rollout status in make deploy returns the moment the process launches, before the webhook binds its port (it waits on the cert-manager cert mount). Anything that applies a webhook-gated CR right after make deploy races the webhook.

Gating readiness on the webhook port makes rollout status mean "webhook ready" for every consumer of make deploy, not just run-dev.sh:

readinessProbe:
  tcpSocket: { port: 9443 }
  initialDelaySeconds: 2
  periodSeconds: 2

(A controller-runtime readyz check wired to the webhook server is even better.)
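
As a rough illustration of what such a readiness check does, here is a stdlib-only sketch. It is not the operator's code: `webhookDialCheck`, the `:8081` probe address, and the local `/readyz` wiring are hypothetical. The returned function has the shape `func(*http.Request) error`, which, as far as I know, is also the shape of controller-runtime's `healthz.Checker`, so something similar could presumably be registered through the manager's `AddReadyzCheck`.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"time"
)

// webhookDialCheck returns a readiness check that succeeds only once
// something accepts TCP connections on addr. This mirrors what the
// tcpSocket probe above does, but from inside the process.
func webhookDialCheck(addr string, timeout time.Duration) func(*http.Request) error {
	return func(_ *http.Request) error {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return fmt.Errorf("webhook server not accepting connections on %s: %w", addr, err)
		}
		return conn.Close()
	}
}

func main() {
	// 9443 is the webhook containerPort mentioned above; 8081 is an
	// assumed probe port for this sketch.
	check := webhookDialCheck("127.0.0.1:9443", time.Second)
	http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
		if err := check(r); err != nil {
			http.Error(w, err.Error(), http.StatusServiceUnavailable)
			return
		}
		fmt.Fprintln(w, "ok")
	})
	log.Fatal(http.ListenAndServe(":8081", nil))
}
```

Either approach gives `kubectl rollout status` a signal tied to the webhook port rather than to process start, which is the property the retry loop in run-dev.sh currently has to compensate for.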
