Skip to content

Port OllamaService to the mesh router (OpenAI endpoint) instead of Ollama-native API #164

Description

@jordanpartridge

OllamaService speaks Ollama's native /api/generate, which forces a hardcoded backend host. The mesh has a LiteLLM router (Odin :8101) whose whole job is backend selection with fallback (loki-gpu → odin-cpu → cloud), but it only serves OpenAI/Anthropic-shaped endpoints — so know can't use it today.

Interim state (2026-07-04): repo .env pins OLLAMA_HOST to Loki's GPU directly with OLLAMA_MODEL=llama3.1:8b, OLLAMA_TIMEOUT=120, because the previous default (localhost = Odin CPU, 30s) blew the timeout on every enhancement call — the enhancement queue accumulated thousands of 'empty response' failures partly from this.

Proper fix: switch OllamaService to OpenAI /v1/chat/completions against a configurable base URL + model alias. Then know inherits the router's fallback chain like the rest of the mesh (hird already works this way), and the direct-host pin disappears.

Notes for whoever implements:

  • isAvailable() probes /api/tags — needs an OpenAI-shaped equivalent (/v1/models).
  • enhance() extracts a JSON object from the completion text (see the parse at the end of the service) — unchanged.
  • Router auth: Authorization bearer key, host-side config, never baked into anything.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions