diff --git a/specs/GH4687/product.md b/specs/GH4687/product.md
new file mode 100644
index 000000000..fe0ca3f24
--- /dev/null
+++ b/specs/GH4687/product.md
@@ -0,0 +1,90 @@
+# Product Spec: OpenAI-compatible BYOK endpoints
+
+Issue: https://github.com/warpdotdev/warp/issues/4687
+Related PR: https://github.com/warpdotdev/warp/pull/9253
+Figma: none provided
+
+## Summary
+
+Users should be able to configure a custom OpenAI-compatible model endpoint for Warp Agent by entering a provider label, base URL, API key, and model ID. This gives users a narrow, predictable way to use backend-reachable HTTPS endpoints such as OpenRouter, hosted LiteLLM gateways, and company-managed OpenAI-compatible gateways, without requiring Warp to add a bespoke provider integration for each one.
+
+This spec is intentionally scoped to one generic "OpenAI-compatible" provider surface. It does not replace Warp-provided models, existing BYOK provider keys, enterprise model configuration, or future first-class provider integrations.
+
+## Problem
+
+Warp currently exposes BYOK for a fixed set of providers and limits model selection to server-provided model choices. That works when the desired provider and model are already known to Warp, but it leaves no self-service path for users who already have access to an OpenAI-compatible endpoint.
+
+The common desired workflow is:
+
+1. Use Warp's existing agent UX, permissions, execution profiles, and terminal context.
+2. Point model requests at a compatible endpoint such as `https://openrouter.ai/api/v1`.
+3. Provide an API key for that endpoint.
+4. Enter the model ID used by that endpoint, such as `anthropic/claude-sonnet-4.5`.
+
+PR #9253 is useful partial progress for OpenRouter-specific BYOK, but it still depends on Warp-approved model choices. The remaining user need is the generic endpoint contract from issue #4687.
+
+## Goals
+
+1. A user can add one custom OpenAI-compatible endpoint configuration from Settings > AI.
+2. The configuration includes:
+   - provider label
+   - base URL
+   - API key
+   - model ID
+3. The configured model appears in the existing model picker as a selectable BYOK model.
+4. Selecting the model causes Warp Agent requests to use the configured endpoint and model ID.
+5. The feature works for OpenRouter and other backend-reachable HTTPS OpenAI-compatible endpoints that support Warp's required chat/agent request shape.
+6. Existing OpenAI, Anthropic, Google, OpenRouter, AWS Bedrock, and Warp-credit model behavior remains unchanged.
+
+## Non-goals
+
+1. Fetching arbitrary provider model catalogs client-side.
+2. Adding separate first-class UI for OpenRouter, LiteLLM, Ollama, Azure, or any other provider in this initial flow.
+3. Supporting `localhost`, private-network, link-local, or otherwise client-local model endpoints in the first version. Those require a separate client-side/local routing design because backend-routed requests cannot reach the user's machine as `localhost`.
+4. Supporting non-OpenAI-compatible protocols in this feature.
+5. Guaranteeing that every model behind a compatible endpoint supports all Warp Agent tools.
+6. Changing the paid plan or workspace policy that gates BYOK access.
+7. Changing where Warp Agent requests are executed or proxied beyond the existing BYOK request architecture.
+8. Supporting multiple custom endpoint profiles in the first version. The UI and data model should not preclude this as a follow-up.
+
+## Behavior
+
+1. When BYOK is available for the current user or workspace, Settings > AI shows a section for "OpenAI-compatible endpoint" in addition to the existing provider API key inputs.
+2. The section contains four inputs:
+   - Label, defaulting to `OpenAI-compatible`
+   - Base URL, for example `https://openrouter.ai/api/v1`
+   - API key
+   - Model ID, for example `anthropic/claude-sonnet-4.5`
+3. An empty label falls back to the default label. An empty base URL, API key, or model ID means the custom endpoint is incomplete and should not appear as a selectable model.
+4. Base URL validation is lightweight and surfaced in the client: the value must parse as an absolute `https` URL and must not use obvious local/private hosts such as `localhost` or loopback IPs. Warp does not perform a network validation request when the user saves the setting; the backend performs the authoritative egress validation before routing any request.
+5. The API key input uses the same password-style treatment as existing BYOK provider key fields.
+6. The saved custom model appears in the existing model picker using the configured label and model ID. A key icon or equivalent BYOK affordance should make clear that it is billed to the user's endpoint credentials.
+7. Selecting the custom model persists through the same execution profile mechanism as other model choices.
+8. Agent requests for the custom model include a distinct `custom_model_endpoint` settings payload that Warp's backend needs to route the request:
+   - base URL
+   - API key
+   - model ID
+   - provider label for display/diagnostics
+9. If the endpoint returns an authentication error, the user sees an invalid-API-key error state that names the configured provider label when available.
+10. If the endpoint or model is unsupported by Warp's agent backend, the error should explain that the custom endpoint could not satisfy the request rather than asking the user to upgrade Warp credits.
+11. Existing fixed-provider BYOK fields continue to work. Adding a custom endpoint does not clear or override OpenAI, Anthropic, Google, OpenRouter, or AWS Bedrock credentials.
+12. Disabling BYOK at the workspace/plan level preserves any stored custom endpoint config but disables custom endpoint editing and selection, using the same gating behavior as existing BYOK fields. The stored config is cleared only if the user explicitly deletes it.
+
+## Success criteria
+
+1. A user can configure OpenRouter with:
+   - Base URL: `https://openrouter.ai/api/v1`
+   - Model ID: an OpenRouter model slug
+   - API key: an OpenRouter key
+2. The configured model can be selected in the model picker.
+3. Agent requests with that model carry the custom endpoint config to the request layer.
+4. An incomplete custom endpoint config does not create a broken model-picker entry.
+5. Existing BYOK provider-key behavior is unchanged.
+6. Existing Warp-credit model behavior is unchanged.
+7. Localhost/private-network endpoints are rejected or kept out of the selectable V1 flow rather than silently routing to Warp backend-local addresses.
+
+## Open questions
+
+1. Should the first implementation support exactly one custom endpoint, or should the persistence shape support a list immediately while the UI initially exposes one?
+2. Should the backend allow custom endpoint routing for all agent features immediately, or gate specific tool-heavy flows until compatibility is proven?
+3. Should Warp add a preset button for OpenRouter after the generic flow lands, or keep the first version purely generic?
diff --git a/specs/GH4687/tech.md b/specs/GH4687/tech.md
new file mode 100644
index 000000000..9ca98e209
--- /dev/null
+++ b/specs/GH4687/tech.md
@@ -0,0 +1,261 @@
+# Tech Spec: OpenAI-compatible BYOK endpoints
+
+Issue: https://github.com/warpdotdev/warp/issues/4687
+Product spec: `specs/GH4687/product.md`
+
+## Context
+
+Warp already has the main pieces needed for provider-specific BYOK:
+
+- secure local credential storage
+- Settings > AI provider key inputs
+- server-provided model choices
+- model-picker BYOK affordances
+- request-time API key payloads sent with Warp Agent requests
+
+The missing piece for issue #4687 is a custom endpoint model entry that carries a user-provided base URL and model ID, instead of requiring the model to exist in Warp's server-approved model list. V1 is limited to backend-reachable HTTPS endpoints because this spec preserves Warp's current backend-routed BYOK request architecture.
+
+Relevant code in the current client:
+
+- `crates/ai/src/api_keys.rs:20` defines the locally persisted BYOK key shape. It currently contains fixed provider slots, including `google`, `anthropic`, `openai`, and `open_router`.
+- `crates/ai/src/api_keys.rs:120` builds the `warp_multi_agent_api::request::settings::ApiKeys` payload for agent requests and returns `None` when no request credentials are present.
+- `app/src/settings_view/ai_page.rs:6274` defines `ApiKeysWidget`, the Settings > AI widget that renders fixed provider key editors.
+- `app/src/settings_view/ai_page.rs:6417` renders each API key input, and `app/src/settings_view/ai_page.rs:6454` adds the existing OpenAI, Anthropic, and Google inputs.
+- `app/src/ai/llms.rs:28` marks a model as using a user API key when BYOK is enabled and the model provider has a matching stored key.
+- `app/src/ai/llms.rs:87` defines the fixed client-side `LLMProvider` variants.
+- `app/src/terminal/input/models/data_source.rs:224` builds model picker rows, clears upgrade disablement for BYOK-capable provider models, and renders BYOK/manage affordances.
+- `app/src/terminal/input/models/data_source.rs:494` limits the "bring your own key" upsell to fixed providers.
+- `app/src/ai/agent/api.rs:156` creates `RequestParams` for Warp Agent requests.
+- `app/src/ai/agent/api.rs:237` pulls BYOK request credentials from `ApiKeyManager`.
+- `app/src/ai/agent/api/impl.rs:59` serializes the final `warp_multi_agent_api::Request`, including selected model IDs and request API keys.
+- `crates/warp_graphql_schema/api/schema.graphql:1913` lists the server `LlmProvider` enum values used for server-provided models.
+- `crates/warp_graphql_schema/api/schema.graphql:1927` already has `LlmSettingsInput` fields such as `apiKey` and `baseUrl` for workspace-level LLM settings, but the client workspace conversion currently stores only host-level enablement in `app/src/workspaces/workspace.rs:619`.
+
+## Proposed changes
+
+### 1. Add a custom endpoint config type
+
+Add a client-owned settings type to `crates/ai/src/api_keys.rs`:
+
+```rust
+pub struct OpenAICompatibleEndpoint {
+    pub label: Option<String>,
+    pub base_url: Option<String>,
+    pub api_key: Option<String>,
+    pub model_id: Option<String>,
+}
+```
+
+Store it alongside the existing provider keys in `ApiKeys`:
+
+```rust
+pub openai_compatible_endpoint: Option<OpenAICompatibleEndpoint>
+```
+
+Implementation notes:
+
+- Keep this in secure storage with the other BYOK credentials because it includes an API key.
+- Treat `label` as optional; the display layer can default to `OpenAI-compatible`.
+- Add helper methods such as `is_complete()` and `display_label()` to centralize validation (see the sketch below).
+- Consider a vector shape later, but keep the initial UI and request payload to one endpoint unless maintainers prefer a list immediately.
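+
+A minimal sketch of those helpers, assuming the `OpenAICompatibleEndpoint` shape above (the method names follow the notes; nothing here exists in the codebase yet):
+
+```rust
+impl OpenAICompatibleEndpoint {
+    /// Complete means selectable: base URL, API key, and model ID are all
+    /// non-empty. The label is cosmetic and never gates completeness.
+    pub fn is_complete(&self) -> bool {
+        fn filled(value: &Option<String>) -> bool {
+            value.as_deref().is_some_and(|s| !s.trim().is_empty())
+        }
+        filled(&self.base_url) && filled(&self.api_key) && filled(&self.model_id)
+    }
+
+    /// Label shown in the model picker and in error states, defaulting to
+    /// `OpenAI-compatible` when the user leaves the label empty.
+    pub fn display_label(&self) -> &str {
+        self.label
+            .as_deref()
+            .filter(|s| !s.trim().is_empty())
+            .unwrap_or("OpenAI-compatible")
+    }
+}
+```
+
+Centralizing completeness in one helper keeps the Settings editor, the model picker, and the request layer from disagreeing about whether the endpoint is usable.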
+
+### 2. Render the endpoint editor in Settings > AI
+
+Extend `ApiKeysWidget` in `app/src/settings_view/ai_page.rs` with editors for:
+
+- label
+- base URL
+- API key
+- model ID
+
+Use the existing single-line editor pattern from `create_api_key_editor!` for consistency. The API key field should stay password-style; label, base URL, and model ID should be plain text inputs.
+
+Save behavior:
+
+- On blur or Enter, persist the full config via `ApiKeyManager`.
+- An empty base URL, API key, or model ID keeps the config incomplete.
+- If BYOK is disabled for the workspace, preserve the stored endpoint config but disable endpoint editing and model selection, using the same `UserWorkspacesEvent::TeamsChanged` handling as the fixed provider key fields. Do not clear the stored config unless the user explicitly deletes it.
+
+Validation:
+
+- Parse the base URL with `url::Url`.
+- Accept only absolute `https` URLs in V1.
+- Reject obvious local/private hosts client-side, including `localhost`, loopback IPs, and missing hosts.
+- Do not send a validation request to the provider while saving settings.
+
+### 3. Add a synthetic custom model choice
+
+Warp's current model picker is built around `LLMInfo` choices. Add a synthetic `LLMInfo` when the custom endpoint config is complete.
+
+Recommended shape:
+
+- `id`: the configured model ID, preferably with a stable custom prefix if the backend needs to distinguish custom endpoint models from server-known models.
+- `display_name`: the configured label plus model ID, for example `OpenRouter: anthropic/claude-sonnet-4.5`.
+- `base_model_name`: the configured model ID.
+- `provider`: either a new `LLMProvider::OpenAICompatible` client variant or `LLMProvider::Unknown` plus a separate custom-model marker.
+- `disable_reason`: `None` when BYOK is enabled and the config is complete.
+- `host_configs`: the default/direct host configuration unless the backend requires a distinct custom host.
+
+The least surprising model-picker behavior is to inject the synthetic choice at the `LLMPreferences` boundary, where server-provided model choices are already cached and exposed to the UI. If maintainers prefer to keep `LLMPreferences` server-only, the alternative is to append the custom row in `app/src/terminal/input/models/data_source.rs`, but that risks duplicating model-selection behavior across surfaces.
+
+### 4. Extend request payloads with endpoint metadata
+
+The selected model ID alone is not enough; the request layer must also know the base URL and API key. Use a distinct request settings field instead of overloading the fixed provider API keys.
+
+Add an optional `custom_model_endpoint` payload to the agent request settings:
+
+```protobuf
+message CustomModelEndpoint {
+  string provider_label = 1;
+  string base_url = 2;
+  string api_key = 3;
+  string model_id = 4;
+}
+
+message Settings {
+  // Existing fields...
+  optional CustomModelEndpoint custom_model_endpoint = <next free field number>;
+}
+```
+
+This is clearer than extending `ApiKeys` because the custom endpoint is routing metadata, not only a credential. It also avoids overloading fixed provider-key semantics.
+
+Versioning and compatibility:
+
+- The field is optional and absent for all existing request paths.
+- Older clients continue sending the existing settings payload without `custom_model_endpoint`.
+- The backend must deploy support for the optional field before the client starts sending it.
+- The client only sends `custom_model_endpoint` when the selected active model is the synthetic custom endpoint model and BYOK is enabled for the workspace.
+- If the backend does not support the field or rejects the endpoint, it should return a user-facing custom endpoint error instead of falling back silently to Warp credits.
+
+In the client:
+
+- Extend `RequestParams` in `app/src/ai/agent/api.rs` with optional custom endpoint metadata.
+- Populate it from `ApiKeyManager` only when the active model is the custom endpoint model.
+- Serialize it in `app/src/ai/agent/api/impl.rs` alongside the existing `settings.model_config` and `settings.api_keys`, as sketched below.
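+
+A sketch of that client-side gating, reusing the Section 1 type and helpers. `CustomModelEndpoint` here is a hypothetical Rust mirror of the proto message, not an existing request type, and the function name is illustrative:
+
+```rust
+/// Illustrative Rust mirror of the `CustomModelEndpoint` proto message.
+pub struct CustomModelEndpoint {
+    pub provider_label: String,
+    pub base_url: String,
+    pub api_key: String,
+    pub model_id: String,
+}
+
+/// Attach the payload only when BYOK is enabled, the active model is the
+/// synthetic custom endpoint model, and the stored config is complete.
+/// Every other request path keeps the field absent (`None`).
+pub fn custom_endpoint_payload(
+    byok_enabled: bool,
+    active_model_is_custom_endpoint: bool,
+    config: Option<&OpenAICompatibleEndpoint>,
+) -> Option<CustomModelEndpoint> {
+    if !byok_enabled || !active_model_is_custom_endpoint {
+        return None;
+    }
+    // `is_complete()` guarantees the three required fields are present,
+    // so the `?` extractions below never fire in practice.
+    let config = config.filter(|c| c.is_complete())?;
+    Some(CustomModelEndpoint {
+        provider_label: config.display_label().to_string(),
+        base_url: config.base_url.clone()?,
+        api_key: config.api_key.clone()?,
+        model_id: config.model_id.clone()?,
+    })
+}
+```
+
+Returning `None` on every non-custom path keeps the versioning guarantee above: existing requests serialize byte-for-byte as they do today.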
+
+### 5. Add backend egress and logging safeguards
+
+Because Warp's backend would route requests to a user-provided URL, syntax-only client validation is insufficient. Backend validation must run before every outbound request, including redirects.
+
+Required backend safeguards:
+
+- Allow only `https` endpoints in V1.
+- Reject URLs with embedded credentials, fragments, or query parameters in the configured base URL.
+- Resolve the hostname server-side and reject private, loopback, link-local, multicast, carrier-grade NAT, documentation/test, and otherwise non-public IP ranges for both IPv4 and IPv6.
+- Explicitly reject cloud metadata endpoints such as `169.254.169.254` and their IPv6 link-local equivalents.
+- Reject `localhost`, `.local`, and direct IP literals that resolve to non-public ranges.
+- Prevent DNS rebinding by pinning the validated address for the outbound connection, or by re-validating the resolved address immediately before connecting.
+- Disable redirects by default, or re-run the full validation policy on every redirect target before following it.
+- Enforce short connection and total request timeouts, response-size limits, and streaming idle timeouts.
+- Redact API keys, authorization headers, and any configured endpoint URL from logs, traces, telemetry, error reporting, and Oz/agent-visible debug output. If an error needs to name the endpoint, use the provider label and model ID instead of the full URL.
+- Emit structured, non-secret error categories for invalid endpoint, blocked endpoint, authentication failure, endpoint timeout, and unsupported response shape.
+
+### 6. Error handling and display
+
+Existing invalid-key handling maps provider names for fixed providers in `app/src/ai/blocklist/controller.rs`. Add a custom endpoint path that can show the configured label when available.
+
+Expected user-facing behavior:
+
+- Authentication failure: "Invalid API key for <provider label>."