Routing UI persists fallback entries as bare model names; proxy resolves them to the wrong provider #1757

@el-analista

Summary

When a fallback row is added in the Routing UI from a specific provider's catalog row (e.g., the Ollama Cloud kimi-k2.6 entry, as distinct from the opencode-go Kimi K2.6 subscription row), the entry is persisted into tier_assignments.fallback_models as a bare model name instead of <provider>/<model>. At runtime, ProxyFallbackService resolves bare names through the global model→provider catalog and can route to a different provider than the one the UI implied. The failure is silent either way: if there is no key for the resolved provider, the fallback is skipped; if there is a key, the request goes to a working but wrong upstream.

Reproduction

  1. Connect two providers that both serve kimi-k2.6: opencode-go (subscription) and ollama-cloud (subscription). Do not register a Moonshot API key.
  2. Routing → Default → Complex tier → primary MiMo-V2.5-Pro. Add fallback Kimi K2.6 (opencode-go subscription row). Add fallback again, pick the catalog kimi-k2.6 row (the lower one, NOT marked "Included in subscription").
  3. Query the DB:
    SELECT tier, override_model, fallback_models
      FROM tier_assignments WHERE tier = 'complex';
    Result:
    complex | opencode-go/mimo-v2.5-pro | ["opencode-go/kimi-k2.6","kimi-k2.6"]
    
    The second fallback entry has no provider prefix.
  4. Make opencode-go return a fallback-triggering error (e.g., CreditsError 401). Container logs show:
    Fallback 0: trying model=opencode-go/kimi-k2.6 provider=opencode-go ... → fails
    Fallback 1: skipping model=kimi-k2.6 provider=Moonshot (no API key)
    Fallback chain exhausted: CreditsError
    
    The Ollama Cloud row the user picked is never tried.
  5. Manually rewrite the bare entry in fallback_models to ollama-cloud/kimi-k2.6 and restart manifest. Logs:
    Fallback 1: trying model=ollama-cloud/kimi-k2.6 provider=ollama-cloud auth_type=subscription
    Forwarding to ollama-cloud: https://ollama.com/v1/chat/completions → 200
    

Latent variant (silent wrong-routing instead of silent skip)

For simple/standard tiers with a bare minimax-m2.7 fallback, the resolver maps the entry to the direct MiniMax provider (api_key auth), not the UI-implied ollama-cloud/minimax-m2.7. The fallback fires successfully, but against a different upstream. If the user later removes their MiniMax key, the chain dies silently: the same failure mode as above, just delayed.
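
Both variants can be reproduced on paper with a small model of the resolution step. The sketch below is an illustration only, not Manifest's actual code: `globalCatalog`, `resolveProvider`, and `planFallbacks` are hypothetical names, and the catalog contents are assumed from the logs above.

```typescript
// Hypothetical model of the behaviour described in this issue.
// None of these names are taken from the Manifest codebase.
type Attempt = { entry: string; provider: string; action: "try" | "skip" };

// Assumed global catalog: bare model name -> default provider.
const globalCatalog: Record<string, string> = {
  "kimi-k2.6": "Moonshot",
  "minimax-m2.7": "MiniMax",
};

function resolveProvider(entry: string): string {
  const slash = entry.indexOf("/");
  if (slash > 0) return entry.slice(0, slash); // explicit prefix wins
  return globalCatalog[entry] ?? "unknown"; // bare name: global lookup
}

// Decide, per fallback entry, whether the proxy would try it or skip it,
// given the set of providers the user holds credentials for.
function planFallbacks(entries: string[], credentialed: Set<string>): Attempt[] {
  return entries.map((entry): Attempt => {
    const provider = resolveProvider(entry);
    return { entry, provider, action: credentialed.has(provider) ? "try" : "skip" };
  });
}

// Silent skip: the bare entry resolves to Moonshot, for which no key exists.
const skipPlan = planFallbacks(
  ["opencode-go/kimi-k2.6", "kimi-k2.6"],
  new Set(["opencode-go", "ollama-cloud"]),
);

// Silent wrong routing: the bare entry resolves to MiniMax, which has a key.
const wrongPlan = planFallbacks(
  ["minimax-m2.7"],
  new Set(["ollama-cloud", "MiniMax"]),
);
```

In both plans the provider the user picked in the UI (ollama-cloud) never appears for the bare entry, which matches the log output in step 4.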

Expected behavior

  • UI should always persist fallback entries as <providerId>/<modelId> (or structured {providerId, modelId}) so the resolver never disambiguates by global catalog.
  • ProxyFallbackService should treat bare entries as a configuration error (warn loudly) rather than silently re-resolving via global catalog.
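
A minimal sketch of the strict handling proposed above, assuming entries are stored as `<providerId>/<modelId>` strings. `parseFallbackEntry`, `FallbackRef`, and `ParseResult` are hypothetical names, not part of Manifest's API; the caller would log the returned reason as a warning instead of falling back to the global catalog.

```typescript
// Hypothetical helper; names and shapes are assumptions for illustration.
type FallbackRef = { providerId: string; modelId: string };
type ParseResult =
  | { ok: true; ref: FallbackRef }
  | { ok: false; reason: string };

function parseFallbackEntry(entry: string, knownProviders: Set<string>): ParseResult {
  const slash = entry.indexOf("/");
  // No slash, leading slash, or trailing slash: not <providerId>/<modelId>.
  if (slash <= 0 || slash === entry.length - 1) {
    return { ok: false, reason: `bare or malformed fallback entry "${entry}"` };
  }
  const providerId = entry.slice(0, slash);
  // A slash alone is not enough: some model IDs contain one themselves,
  // so the prefix must match a provider the agent actually has connected.
  if (!knownProviders.has(providerId)) {
    return { ok: false, reason: `unknown provider prefix "${providerId}" in "${entry}"` };
  }
  return { ok: true, ref: { providerId, modelId: entry.slice(slash + 1) } };
}

const providers = new Set(["opencode-go", "ollama-cloud"]);
const good = parseFallbackEntry("ollama-cloud/kimi-k2.6", providers);
const bare = parseFallbackEntry("kimi-k2.6", providers);
```

Checking the prefix against the connected providers, rather than only checking for a slash, is a design choice in this sketch: it keeps a bare model ID that happens to contain a slash from being mistaken for a prefixed entry.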

Candidate fix areas

  • Frontend tier-assignments save handler — bind the chosen providerId to each row before persisting.
  • TierAssignmentService.setFallbacks(...) — validate that every fallback_models[i] is <providerId>/<modelId>; reject/normalize bare entries.
  • One-time migration to backfill prefixes for existing bare rows by looking up the model's owning provider via user_providers for that agent.
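
The backfill step could look roughly like the following. This is a sketch under stated assumptions: `ConnectedProvider` stands in for whatever user_providers exposes per agent, and `backfillPrefixes` is a hypothetical name. It prefixes a bare entry only when exactly one connected provider serves that model; anything ambiguous is left untouched and reported, since guessing would recreate the bug.

```typescript
// Hypothetical shapes; the real user_providers schema may differ.
type ConnectedProvider = { providerId: string; models: string[] };
type BackfillResult = { models: string[]; unresolved: string[] };

function backfillPrefixes(fallbacks: string[], connected: ConnectedProvider[]): BackfillResult {
  const knownIds = new Set(connected.map((p) => p.providerId));
  const models: string[] = [];
  const unresolved: string[] = [];
  for (const entry of fallbacks) {
    const slash = entry.indexOf("/");
    // Already prefixed with a connected provider: keep as-is.
    if (slash > 0 && knownIds.has(entry.slice(0, slash))) {
      models.push(entry);
      continue;
    }
    const owners = connected.filter((p) => p.models.includes(entry));
    const [owner] = owners;
    if (owners.length === 1 && owner) {
      models.push(`${owner.providerId}/${entry}`); // unique owner: safe to prefix
    } else {
      models.push(entry); // zero or several owners: leave for manual review
      unresolved.push(entry);
    }
  }
  return { models, unresolved };
}

// Illustrative data only, loosely based on the reproduction above.
const result = backfillPrefixes(
  ["opencode-go/kimi-k2.6", "kimi-k2.6", "minimax-m2.7"],
  [
    { providerId: "opencode-go", models: ["kimi-k2.6", "mimo-v2.5-pro"] },
    { providerId: "ollama-cloud", models: ["kimi-k2.6", "minimax-m2.7"] },
  ],
);
```

With this sample data the bare kimi-k2.6 entry stays unresolved because both opencode-go and ollama-cloud serve it, which is exactly the situation in the reproduction; such rows would need the user to re-pick the provider.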

Environment

  • manifestdotbuild/manifest:latest
  • Postgres 16-alpine
  • Providers connected: opencode-go (subscription), ollama-cloud (subscription), MiniMax (api_key + subscription), anthropic (subscription), openai (subscription), openrouter (api_key)
