feat(backend): support a custom OpenAI-compatible provider in BYOK (#6878)#8466
feat(backend): support a custom OpenAI-compatible provider in BYOK (#6878)#8466ZachL111 wants to merge 11 commits into
Conversation
…t gating, and routing
There was a problem hiding this comment.
2 issues found across 4 files
Confidence score: 2/5
- In
backend/utils/byok.py, the SSRF check only inspects the input hostname/IP string and does not validate the DNS-resolved destination, so a benign-looking host could resolve to private/internal or metadata addresses and still be requested, creating a real network-exposure path if merged as-is — resolve and validate final IPs (including DNS rebinding protections) before allowing custom provider calls. - In
backend/utils/llm/clients.py(get_llm), the custom BYOK early return skips prompt-cache binding and the downstream Anthropic/Perplexity provider guard, which can cause inconsistent behavior and bypass intended safety checks for those providers — route BYOK through the same post-resolution guard/cache path (or apply equivalent checks before return) before merging.
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
|
Thanks @ZachL111 — nice, well-structured addition, and the test coverage for the new SSRF guard is genuinely good. I'm not approving this one yet because it needs human maintainer review before it can merge, and there are a couple of concrete things worth addressing first. Why this needs human reviewThis introduces a new provider type to BYOK and a new outbound-call surface (the backend calling a user-supplied endpoint with a user key). That touches model/provider routing and security-sensitive surfaces, so I'd want a maintainer to weigh in on the product direction and the security posture rather than rubber-stamp it. Security — SSRF guard has a DNS-resolution gap
|
|
Thanks @Git-on-my-level and cubic, both findings were valid. Pushed fixes:
Appreciate the careful security read. |
There was a problem hiding this comment.
2 issues found across 3 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/utils/byok.py">
<violation number="1" location="backend/utils/byok.py:163">
P2: Resolver exceptions other than `socket.gaierror` are not normalized, so malformed/IDNA hostnames can raise `UnicodeError` and bypass the graceful degradation in `get_byok_custom_provider()`, surfacing as a 500 error instead of falling back to the default provider.</violation>
</file>
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
| # public-looking name pointing at a private/metadata IP cannot slip through. | ||
| try: | ||
| infos = socket.getaddrinfo(host, parsed.port or 443, type=socket.SOCK_STREAM) | ||
| except socket.gaierror as e: |
There was a problem hiding this comment.
P2: Resolver exceptions other than socket.gaierror are not normalized, so malformed/IDNA hostnames can raise UnicodeError and bypass the graceful degradation in get_byok_custom_provider(), surfacing as a 500 error instead of falling back to the default provider.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/utils/byok.py, line 163:
<comment>Resolver exceptions other than `socket.gaierror` are not normalized, so malformed/IDNA hostnames can raise `UnicodeError` and bypass the graceful degradation in `get_byok_custom_provider()`, surfacing as a 500 error instead of falling back to the default provider.</comment>
<file context>
@@ -133,14 +146,30 @@ def validate_custom_base_url(base_url: str) -> str:
+ # public-looking name pointing at a private/metadata IP cannot slip through.
+ try:
+ infos = socket.getaddrinfo(host, parsed.port or 443, type=socket.SOCK_STREAM)
+ except socket.gaierror as e:
+ raise ValueError(f'custom base URL host did not resolve: {e}')
+ for info in infos:
</file context>
| except socket.gaierror as e: | |
| + except (socket.gaierror, UnicodeError) as e: |
|
Thanks cubic. On the synchronous getaddrinfo: That said, I added a per-host TTL cache (5 minutes) around the resolution, so a base URL is resolved at most once per host per window instead of on every |
There was a problem hiding this comment.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/utils/byok.py">
<violation number="1" location="backend/utils/byok.py:163">
P2: Resolver exceptions other than `socket.gaierror` are not normalized, so malformed/IDNA hostnames can raise `UnicodeError` and bypass the graceful degradation in `get_byok_custom_provider()`, surfacing as a 500 error instead of falling back to the default provider.</violation>
</file>
Tip: Review your code locally with the cubic CLI to iterate faster.
Re-trigger cubic
|
Follow-up review after the latest commits — thanks for addressing the earlier feedback, @ZachL111. What was addressed
Residual item worth noting before mergeThe SSRF validation happens at call time but does not pin the resolved IP through to the LLM client connection. Combined with the 5-minute positive DNS cache, there is a DNS-rebinding window: a host that validates as public can re-resolve to an internal/metadata address between this check and the actual outbound request. The code documents this honestly (connect-time pinning via a dedicated transport is called out as a follow-up). For a brand-new outbound-call surface carrying a user key, I would want a maintainer to decide whether connect-time IP pinning (or a shorter/no positive cache) should land in this PR or as an immediate follow-up before broad rollout. RecommendationStill needs human security + maintainer review before merge — this introduces a new provider type, a new network-exposure surface, and touches model/provider routing, all of which warrant a product and security judgment call. Not approving. The implementation quality and test coverage are genuinely good. |
|
Thanks @ZachL111 — nice work on this. Adding a custom OpenAI-compatible provider to BYOK is a coherent extension of the existing surface, and the hardening around the new SSRF exposure is thoughtful: HTTPS-only, full internal/loopback/link-local/private/reserved/metadata blocking, hostname resolution that rejects any internal resolved address, and the deliberate choice to cache only rejections (re-resolving successes on every call to limit DNS-rebinding windows). The SSRF test coverage is solid, and gating the custom provider behind the existing fingerprint enrollment check is the right call. A few things that need human maintainer eyes before merge:
Keeping |
|
Thanks for the careful security pass, and for separating the engineering hardening from the product-direction call. Your characterization of the residual TOCTOU is exactly right. The proper fix is connect-time IP pinning: resolve and validate once, then bind the outbound connection to that exact validated IP while preserving the original hostname for TLS SNI and certificate verification, via a custom httpx transport passed to the OpenAI client on both the sync and async paths. That removes the second resolution entirely, so the connection cannot be redirected after validation. I am glad to land that in this PR, or split it out as the immediate prerequisite follow-up, whichever you and the maintainer prefer. On points 2 and 3, agreed those are maintainer calls. Custom-provider BYOK expands the trust and abuse surface beyond the four major-lab providers and changes the |
kodjima33
left a comment
There was a problem hiding this comment.
Backend feature: custom OpenAI-compatible BYOK provider — approve only.
Summary
BYOK currently only accepts keys from the major labs (OpenAI, Anthropic, Gemini, Deepgram). This adds a fifth option: a user-hosted, OpenAI-compatible provider, so a BYOK user can run Omi's LLM work on OpenRouter, Together, Groq, a local server, or any endpoint that speaks the OpenAI chat-completions API. Refs #6878.
The provider is configured per-request, alongside the existing BYOK keys:
X-BYOK-Custom: the API key (fingerprinted on enrollment like the other providers)X-BYOK-Custom-Base-Url: the endpoint, e.g.https://openrouter.ai/api/v1X-BYOK-Custom-Model: the model to call, e.g.qwen/qwen-2.5-72b-instructWhen all three are present for an enrolled, BYOK-active user, that user's
get_llmcalls (conversation structuring, memory extraction, summaries, and the other OpenAI-compatible features) run on their endpoint, with their model and their key.How it works
utils/byok.py):customjoins the existing per-request header set; the base URL and model travel as companion config headers and are stashed in the same request contextvar under reserved names, so they never collide with a provider key and are never fingerprint-validated._check_byok_validitydrops the custom key and config unlesscustomis actually enrolled, so an arbitraryX-BYOK-Customheader can never route a user's traffic to an unvalidated endpoint. Non-BYOK users' headers are ignored as before.validate_custom_base_url): the backend makes outbound calls to this URL, so it must be https and must not point at a loopback, link-local, private, or reserved address (including the cloud metadata endpoint169.254.169.254). A bad URL degrades to the default provider instead of failing the request.utils/llm/clients.py):get_llmshort-circuits to aChatOpenAI(base_url, api_key, model)for the custom provider when one is configured. Clients are cached by hashed key plus base URL, so a raw key is never part of a cache key and different endpoints never collide.routers/users.py):customis allowed as an optional additional provider. The required set (openai/anthropic/gemini/deepgram) and the quota model are unchanged.Scope, and one decision for a maintainer
This is intentionally additive and fully opt-in: with no custom headers, behavior is byte-for-byte unchanged.
Two things I scoped deliberately and would like a maintainer to weigh in on, since this is your BYOK feature @kodjima33:
The main agentic chat runs on the Anthropic client path, which does not go through
get_llm, so the custom provider currently powers theget_llmLLM features (extraction, structuring, summaries, and so on) and not the streaming chat agent. Routing the agent to an arbitrary OpenAI-compatible endpoint is a larger change (tool-format differences) and is a sensible follow-up.Today a user must still be BYOK-active (enroll the standard four) to also enroll
custom. That keeps the free-plan and quota model untouched, but it does not yet fully serve the original ask in Add third party LLM providers/gateways other than OpenAI, Anthropic, Google and Deepgram #6878 (people who want to use only a non-lab provider). Decouplingcustomfrom the four-provider requirement is a product call on the free-plan model, so I left it to you rather than changing who qualifies for the BYOK plan.Testing
tests/unit/test_byok_security.pywith aTestBYOKCustomProvidersuite: base-URL validation (https only; rejects localhost,.local, private, and metadata IPs), the per-request getter (full config, missing field, invalid-URL degradation, config-alone is not a usable request), header intake, enrollment accepting an optional custom provider, the enrollment gate dropping un-enrolled custom config while honoring enrolled, client caching by base URL, andget_llmrouting to the custom client.black --line-length 120 --skip-string-normalizationis clean andscripts/lint_async_blockers.pyreports no violations. The activation-endpoint tests that importrouters.usersneed GCP credentials to import and so run green only in CI, the same as the existing BYOK activation tests.