**New file: `specs/GH4687/product.md` (+89 lines)**
# Product Spec: OpenAI-compatible BYOK endpoints

Issue: https://github.com/warpdotdev/warp/issues/4687
Related PR: https://github.com/warpdotdev/warp/pull/9253
Figma: none provided

## Summary

Users should be able to configure a custom OpenAI-compatible model endpoint for Warp Agent by entering a provider label, base URL, API key, and model ID. This gives users a narrow, predictable way to use OpenRouter, LiteLLM, internal OpenAI-compatible gateways, and local OpenAI-compatible servers without requiring Warp to add a bespoke provider integration for each one.
> **Reviewer comment:** 🚨 [CRITICAL] This promises local OpenAI-compatible servers, but the tech flow sends endpoint metadata to Warp's backend; localhost would resolve from the backend, not the user's machine. Specify client-side/local routing for local endpoints or remove local servers from the first-version scope.


This spec is intentionally scoped to one generic "OpenAI-compatible" provider surface. It does not replace Warp-provided models, existing BYOK provider keys, enterprise model configuration, or future first-class provider integrations.

## Problem

Warp currently exposes BYOK for fixed providers and chooses models from server-provided model choices. That works when the desired provider and model are already known to Warp, but it leaves no self-service path for users who already have access to an OpenAI-compatible endpoint.

The common desired workflow is:

1. Use Warp's existing agent UX, permissions, execution profiles, and terminal context.
2. Point model requests at a compatible endpoint such as `https://openrouter.ai/api/v1`.
3. Provide an API key for that endpoint.
4. Enter the model ID used by that endpoint, such as `anthropic/claude-sonnet-4.5`.

PR #9253 is useful partial progress for OpenRouter-specific BYOK, but it still depends on Warp-approved model choices. The remaining user need is the generic endpoint contract from issue #4687.

## Goals

1. A user can add one custom OpenAI-compatible endpoint configuration from Settings > AI.
2. The configuration includes:
- provider label
- base URL
- API key
- model ID
3. The configured model appears in the existing model picker as a selectable BYOK model.
4. Selecting the model causes Warp Agent requests to use the configured endpoint and model ID.
5. The feature works for OpenRouter and other OpenAI-compatible endpoints that support Warp's required chat/agent request shape.
6. Existing OpenAI, Anthropic, Google, OpenRouter, AWS Bedrock, and Warp-credit model behavior remains unchanged.

## Non-goals

1. Fetching arbitrary provider model catalogs client-side.
2. Adding separate first-class UI for OpenRouter, LiteLLM, Ollama, Azure, or any other provider in this initial flow.
3. Supporting non-OpenAI-compatible protocols in this feature.
4. Guaranteeing every model behind a compatible endpoint supports all Warp Agent tools.
5. Changing the paid plan or workspace policy that gates BYOK access.
6. Changing where Warp Agent requests are executed or proxied beyond the existing BYOK request architecture.
7. Supporting multiple custom endpoint profiles in the first version. The UI and data model should not preclude this as a follow-up.

## Behavior

1. When BYOK is available for the current user or workspace, Settings > AI shows a section for "OpenAI-compatible endpoint" in addition to the existing provider API key inputs.
2. The section contains four inputs:
- Label, defaulting to `OpenAI-compatible`
- Base URL, for example `https://openrouter.ai/api/v1`
- API key
- Model ID, for example `anthropic/claude-sonnet-4.5`
3. Empty label uses the default label. Empty base URL, API key, or model ID means the custom endpoint is incomplete and should not appear as a selectable model.
4. Base URL validation is lightweight and user-facing: the value must parse as an absolute `http` or `https` URL. Warp does not perform a network validation request when the user saves the setting.
5. API key input uses the same password-style treatment as existing BYOK provider key fields.
6. The saved custom model appears in the existing model picker using the configured label and model ID. A key icon or equivalent BYOK affordance should make clear that it is billed to the user's endpoint credentials.
7. Selecting the custom model persists through the same execution profile mechanism as other model choices.
> **Reviewer comment:** ⚠️ [IMPORTANT] Define what happens when a selected custom endpoint is edited, made incomplete, or deleted; otherwise execution profiles can persist a synthetic model selection whose routing behavior is unspecified.

8. Agent requests for the custom model include the custom endpoint metadata needed by Warp's backend to route the request:
- base URL
- API key
- model ID
- provider label for display/diagnostics
9. If the endpoint returns an authentication error, the user sees an invalid API key/error state that names the configured provider label when available.
10. If the endpoint or model is unsupported by Warp's agent backend, the error should explain that the custom endpoint could not satisfy the request rather than asking the user to upgrade Warp credits.
11. Existing fixed-provider BYOK fields continue to work. Adding a custom endpoint does not clear or override OpenAI, Anthropic, Google, OpenRouter, or AWS Bedrock credentials.
12. Disabling BYOK at the workspace/plan level disables custom endpoint editing and selection using the same gating behavior as existing BYOK fields.

## Success criteria

1. A user can configure OpenRouter with:
- Base URL: `https://openrouter.ai/api/v1`
- Model ID: an OpenRouter model slug
- API key: an OpenRouter key
2. The configured model can be selected in the model picker.
3. Agent requests with that model carry the custom endpoint config to the request layer.
4. Incomplete custom endpoint config does not create a broken model picker entry.
5. Existing BYOK provider-key behavior is unchanged.
6. Existing Warp-credit model behavior is unchanged.

## Open questions

1. Should the first implementation support exactly one custom endpoint, or should the persistence shape support a list immediately while the UI initially exposes one?
2. Should custom endpoint metadata live in the existing `api_keys` request payload or a distinct `custom_model_endpoint` request field?
> **Reviewer comment:** ⚠️ [IMPORTANT] The request contract is still an open question even though implementation depends on it. Resolve whether endpoint metadata is a distinct request field or part of ApiKeys, including versioning/backward compatibility with the backend.

3. Should the backend allow custom endpoint routing for all agent features immediately, or gate specific tool-heavy flows until compatibility is proven?
4. Should Warp add a preset button for OpenRouter after the generic flow lands, or keep the first version purely generic?
**New file: `specs/GH4687/tech.md` (+232 lines)**
# Tech Spec: OpenAI-compatible BYOK endpoints

Issue: https://github.com/warpdotdev/warp/issues/4687
Product spec: `specs/GH4687/product.md`

## Context

Warp already has the main pieces needed for provider-specific BYOK:

- secure local credential storage
- Settings > AI provider key inputs
- server-provided model choices
- model-picker BYOK affordances
- request-time API key payloads sent with Warp Agent requests

The missing piece for issue #4687 is a custom endpoint model entry that carries a user-provided base URL and model ID, instead of requiring the model to exist in Warp's server-approved model list.

Relevant code in the current client:

- `crates/ai/src/api_keys.rs:20` defines the locally persisted BYOK key shape. It currently contains fixed provider slots, including `google`, `anthropic`, `openai`, and `open_router`.
- `crates/ai/src/api_keys.rs:120` builds the `warp_multi_agent_api::request::settings::ApiKeys` payload for agent requests and returns `None` when no request credentials are present.
- `app/src/settings_view/ai_page.rs:6274` defines `ApiKeysWidget`, the Settings > AI widget that renders fixed provider key editors.
- `app/src/settings_view/ai_page.rs:6417` renders each API key input, and `app/src/settings_view/ai_page.rs:6454` adds the existing OpenAI, Anthropic, and Google inputs.
- `app/src/ai/llms.rs:28` marks a model as using a user API key when BYOK is enabled and the model provider has a matching stored key.
- `app/src/ai/llms.rs:87` defines fixed client-side `LLMProvider` variants.
- `app/src/terminal/input/models/data_source.rs:224` builds model picker rows, clears upgrade disablement for BYOK-capable provider models, and renders BYOK/manage affordances.
- `app/src/terminal/input/models/data_source.rs:494` limits the "bring your own key" upsell to fixed providers.
- `app/src/ai/agent/api.rs:156` creates `RequestParams` for Warp Agent requests.
- `app/src/ai/agent/api.rs:237` pulls BYOK request credentials from `ApiKeyManager`.
- `app/src/ai/agent/api/impl.rs:59` serializes the final `warp_multi_agent_api::Request`, including selected model IDs and request API keys.
- `crates/warp_graphql_schema/api/schema.graphql:1913` lists server `LlmProvider` enum values used for server-provided models.
- `crates/warp_graphql_schema/api/schema.graphql:1927` already has `LlmSettingsInput` fields such as `apiKey` and `baseUrl` for workspace-level LLM settings, but the client workspace conversion currently stores only host-level enablement in `app/src/workspaces/workspace.rs:619`.

## Proposed changes

### 1. Add a custom endpoint config type

Add a client-owned settings type to `crates/ai/src/api_keys.rs`:

```rust
pub struct OpenAICompatibleEndpoint {
    pub label: Option<String>,
    pub base_url: Option<String>,
    pub api_key: Option<String>,
    pub model_id: Option<String>,
}
```

Store it alongside existing provider keys in `ApiKeys`:

```rust
pub openai_compatible_endpoint: Option<OpenAICompatibleEndpoint>
```

Implementation notes:

- Keep this in secure storage with the other BYOK credentials because it includes an API key.
- Treat `label` as optional; the display layer can default to `OpenAI-compatible`.
- Add helper methods such as `is_complete()` and `display_label()` to centralize validation.
- Consider a vector shape later, but keep the initial UI and request payload to one endpoint unless maintainers prefer a list immediately.
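The helper methods suggested above can be sketched as follows. The struct is repeated here for self-containment, and the exact semantics (label optional, the three routing-critical fields required and non-empty) follow the notes above; everything else is an assumption, not the shipped API.

```rust
#[derive(Clone, Debug, Default)]
pub struct OpenAICompatibleEndpoint {
    pub label: Option<String>,
    pub base_url: Option<String>,
    pub api_key: Option<String>,
    pub model_id: Option<String>,
}

impl OpenAICompatibleEndpoint {
    /// Complete means every routing-critical field is present and non-empty.
    /// The label is optional by design.
    pub fn is_complete(&self) -> bool {
        [&self.base_url, &self.api_key, &self.model_id]
            .iter()
            .all(|f| f.as_deref().map_or(false, |s| !s.trim().is_empty()))
    }

    /// Display label, falling back to the generic default when unset.
    pub fn display_label(&self) -> &str {
        self.label
            .as_deref()
            .filter(|s| !s.trim().is_empty())
            .unwrap_or("OpenAI-compatible")
    }
}
```

Centralizing these checks keeps the settings editor, model picker, and request layer agreeing on what "incomplete" means.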

### 2. Render the endpoint editor in Settings > AI

Extend `ApiKeysWidget` in `app/src/settings_view/ai_page.rs` with editors for:

- label
- base URL
- API key
- model ID

Use the existing single-line editor pattern from `create_api_key_editor!` for consistency. The API key field should stay password-style; label, base URL, and model ID should be plain text inputs.

Save behavior:

- On blur or Enter, persist the full config via `ApiKeyManager`.
- Empty base URL, API key, or model ID keeps the config incomplete.
- If BYOK is disabled for the workspace, clear/disable the endpoint editors using the same `UserWorkspacesEvent::TeamsChanged` handling as the fixed provider key fields.
> **Reviewer comment:** ⚠️ [IMPORTANT] clear/disable conflicts with the product spec's non-destructive gating. Specify that BYOK-disabled workspaces preserve the stored custom endpoint but disable editing/selection unless the user explicitly deletes it.


Validation:

- Parse base URL with `url::Url`.
> **Reviewer comment:** 🚨 [CRITICAL] Syntax-only URL validation is not enough for a backend-routed arbitrary endpoint. The spec needs server-side egress rules for private/link-local/metadata IPs, redirects, DNS rebinding, timeouts, and redaction of the URL/API key from logs before implementation.

- Accept only `http` and `https` schemes.
- Do not send a validation request to the provider while saving settings.
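A standard-library-only sketch of the save-time check; the real implementation would use the `url` crate's full parser as noted above, and the function name is a placeholder.

```rust
/// Lightweight client-side validation: absolute http/https URL with a
/// non-empty host. This is a sketch; `url::Url::parse` would replace it.
fn is_valid_base_url(input: &str) -> bool {
    let trimmed = input.trim();
    // Accept only absolute http/https URLs.
    let rest = if let Some(r) = trimmed.strip_prefix("https://") {
        r
    } else if let Some(r) = trimmed.strip_prefix("http://") {
        r
    } else {
        return false;
    };
    // Require a non-empty, whitespace-free host segment.
    let host = rest.split('/').next().unwrap_or("");
    !host.is_empty() && !host.contains(char::is_whitespace)
}
```

This intentionally stays syntax-only per the spec; any network-reachability or egress policy belongs server-side.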

### 3. Add a synthetic custom model choice

Warp's current model picker is built around `LLMInfo` choices. Add a synthetic `LLMInfo` when the custom endpoint config is complete.

Recommended shape:

- `id`: the configured model ID, preferably with a stable custom prefix if the backend needs to distinguish custom endpoint models from server-known models.
> **Reviewer comment:** ⚠️ [IMPORTANT] Make the custom model identity requirement deterministic rather than optional; without a required stable prefix or separate custom marker, model IDs can collide with server-provided IDs and route requests through the wrong path.

- `display_name`: configured label plus model ID, for example `OpenRouter: anthropic/claude-sonnet-4.5`.
- `base_model_name`: configured model ID.
- `provider`: either a new `LLMProvider::OpenAICompatible` client variant or `LLMProvider::Unknown` plus a separate custom-model marker.
- `disable_reason`: `None` when BYOK is enabled and the config is complete.
- `host_configs`: default/direct host configuration unless the backend requires a distinct custom host.

The least surprising model-picker behavior is to inject the synthetic choice at the `LLMPreferences` boundary where server-provided model choices are already cached and exposed to the UI. If maintainers prefer to keep `LLMPreferences` server-only, the alternative is to append the custom row in `app/src/terminal/input/models/data_source.rs`, but that risks duplicating model-selection behavior across surfaces.
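The identity scheme can be made deterministic with a fixed prefix. A hypothetical sketch, where the `custom:` prefix and the helper names are assumptions rather than the shipped API:

```rust
/// Assumed stable prefix; any server-unknown marker works, but it must be
/// required, not optional, so custom ids never collide with server ids.
const CUSTOM_MODEL_PREFIX: &str = "custom:";

/// Synthetic model id carried through the picker and execution profiles.
fn synthetic_model_id(model_id: &str) -> String {
    format!("{CUSTOM_MODEL_PREFIX}{model_id}")
}

/// Display name per the recommendation above, e.g.
/// "OpenRouter: anthropic/claude-sonnet-4.5".
fn synthetic_display_name(label: &str, model_id: &str) -> String {
    format!("{label}: {model_id}")
}

/// Request-time check for whether a selection routes via the custom path.
fn is_custom_model(selected_id: &str) -> bool {
    selected_id.starts_with(CUSTOM_MODEL_PREFIX)
}
```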

### 4. Extend request payloads with endpoint metadata

The selected model ID alone is not enough; the request layer must also know base URL and API key. There are two viable server contracts:

Option A: extend `warp_multi_agent_api::request::settings::ApiKeys`:

```protobuf
message OpenAICompatibleEndpoint {
  string label = 1;
  string base_url = 2;
  string api_key = 3;
  string model_id = 4;
}
```

and include it as an optional field under `ApiKeys`.

Option B: add a distinct request settings field:

```protobuf
message CustomModelEndpoint {
  string provider_label = 1;
  string base_url = 2;
  string api_key = 3;
  string model_id = 4;
}
```

and send it independently from fixed provider keys.

Option B is clearer because this is not only an API key; it is routing metadata. It also avoids overloading fixed provider-key semantics.

In the client:

- Extend `RequestParams` in `app/src/ai/agent/api.rs` with optional custom endpoint metadata.
- Populate it from `ApiKeyManager` only when the active model is the custom endpoint model.
- Serialize it in `app/src/ai/agent/api/impl.rs` alongside existing `settings.model_config` and `settings.api_keys`.
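The "only when the active model is the custom endpoint model" rule can be sketched as below. The struct is a stand-in for the Option B message, the `custom:` id prefix assumes the identity scheme from section 3, and all names are illustrative, not the real `RequestParams` API.

```rust
// Stand-in for the Option B CustomModelEndpoint request message.
#[derive(Clone, Debug, PartialEq)]
struct CustomModelEndpoint {
    provider_label: String,
    base_url: String,
    api_key: String,
    model_id: String,
}

/// Decide whether to attach endpoint metadata to an agent request.
/// Fixed-provider selections keep using ApiKeys exactly as today.
fn endpoint_for_request(
    selected_model_id: &str,
    stored: Option<&CustomModelEndpoint>,
) -> Option<CustomModelEndpoint> {
    stored
        .filter(|e| selected_model_id == format!("custom:{}", e.model_id))
        .cloned()
}
```

Keeping this decision in one helper makes it easy to unit test (see Testing and validation, item 4) and prevents custom metadata from leaking onto fixed-provider requests.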

### 5. Error handling and display

Existing invalid-key handling maps provider names for fixed providers in `app/src/ai/blocklist/controller.rs`. Add a custom endpoint path that can show the configured label when available.

Expected user-facing behavior:

- Authentication failure: "Invalid API key for <label>".
- Provider/model failure: endpoint-specific error that names the configured label/model ID when possible.
- Plan/BYOK disabled: existing BYOK upgrade/disabled behavior.
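A minimal sketch of the display mapping above, with hypothetical error variants (the real controller in `app/src/ai/blocklist/controller.rs` will have its own types):

```rust
// Hypothetical error classification for custom endpoint failures.
enum CustomEndpointError {
    AuthenticationFailed,
    // Provider-supplied detail, already sanitized for display.
    ProviderFailure(String),
}

/// Map an endpoint failure to user-facing copy that names the configured
/// label, per the expected behavior above.
fn user_facing_message(err: &CustomEndpointError, label: &str) -> String {
    match err {
        CustomEndpointError::AuthenticationFailed => {
            format!("Invalid API key for {label}")
        }
        CustomEndpointError::ProviderFailure(detail) => {
            format!("{label} could not satisfy the request: {detail}")
        }
    }
}
```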

### 6. Keep #9253 compatible

If #9253 lands first, keep its OpenRouter fixed-provider support as a separate convenience path. The generic endpoint should not depend on `LLMProvider::OpenRouter` or server-approved OpenRouter model IDs.

If maintainers prefer to avoid both surfaces, #9253 can become a preset over the generic endpoint:

- label: `OpenRouter`
- base URL: `https://openrouter.ai/api/v1`
- model ID: user-entered
- API key: OpenRouter key

## End-to-end flow

```mermaid
sequenceDiagram
    participant User
    participant Settings as Settings > AI
    participant Keys as ApiKeyManager
    participant Picker as Model picker
    participant Agent as Agent request builder
    participant Server as Warp backend
    participant Provider as Compatible endpoint

    User->>Settings: Enter label, base URL, API key, model ID
    Settings->>Keys: Persist secure custom endpoint config
    Picker->>Keys: Read complete endpoint config
    Picker-->>User: Show custom BYOK model row
    User->>Picker: Select custom model
    Agent->>Keys: Read endpoint config for active model
    Agent->>Server: Send model ID + endpoint metadata
    Server->>Provider: Route OpenAI-compatible request
    Provider-->>Server: Model response
    Server-->>Agent: Stream Warp Agent response events
```

## Testing and validation

Product behavior mapping:

1. Settings editor renders the four custom endpoint fields when BYOK is available.
- Add settings-view tests if an existing harness covers `ApiKeysWidget`; otherwise validate manually in Settings > AI.
2. Incomplete config does not produce a model picker entry.
- Unit test `OpenAICompatibleEndpoint::is_complete()`.
- Unit test synthetic model injection/filtering.
3. Complete config produces one model picker entry with the expected display label and model ID.
- Unit test the model source or `LLMPreferences` injection point.
4. Selecting the custom model causes request params to include endpoint metadata.
- Unit test `RequestParams::new` or a focused helper that decides whether a selected model matches the custom endpoint.
5. Existing provider-key behavior is unchanged.
- Existing `ApiKeyManager::api_keys_for_request` behavior should keep returning fixed provider keys.
- Add regression coverage that OpenAI/Anthropic/Google/OpenRouter keys are not cleared when custom endpoint config changes.
6. Base URL validation rejects invalid or non-HTTP(S) values.
- Unit test accepted examples:
- `https://openrouter.ai/api/v1`
- `http://localhost:11434/v1`
- Unit test rejected examples:
- `not a url`
- `file:///tmp/model`

Manual validation after implementation:

- Configure OpenRouter with `https://openrouter.ai/api/v1`, an OpenRouter API key, and a known model ID.
- Select the custom model in an execution profile.
- Send a simple Warp Agent prompt.
- Confirm the request uses the custom endpoint path and does not consume Warp credits unless the configured fallback explicitly does so.

## Risks and mitigations

- Risk: custom endpoints may not support the complete Warp Agent protocol or tool expectations.
Mitigation: scope the first version to OpenAI-compatible chat/model routing and return clear provider/model errors when the endpoint cannot satisfy a request.
- Risk: users may expect local-only execution.
Mitigation: copy should say this follows Warp's BYOK request flow and should not promise local-only routing unless the backend/client architecture changes.
- Risk: model IDs collide with server-provided IDs.
Mitigation: use an internal custom model prefix or a separate marker to distinguish custom endpoint selections.
- Risk: storing routing metadata in `ApiKeys` overloads a fixed-provider key store.
Mitigation: keep the initial secure-storage location for safety, but model the endpoint as a distinct typed config and prefer a distinct request field.
- Risk: #9253 and the generic endpoint surface duplicate OpenRouter UX.
Mitigation: keep the generic flow as the base capability and treat OpenRouter-specific UI as a preset or convenience layer.

## Follow-ups

- Multiple saved custom endpoint profiles.
- Optional OpenRouter preset that pre-fills the base URL.
- Optional model catalog import for endpoints that expose a compatible `/models` endpoint.
- Per-profile custom endpoint selection if execution profiles need separate endpoint configs.