feat: ollama executor for local skill generation #78
Merged
Add `ollama:<name>` models as a generation backend: one-shot local completions via Ollama's /api/chat (no agentic tool loop, free, offline). pi-ai has no first-class Ollama provider and its OpenAI-compat path can't set num_ctx, the knob that prevents silent prompt truncation, so we talk to /api/chat directly.

- NDJSON streaming with onProgress deltas + token-count capture
- discovery via /api/tags, capability-filtered via /api/show to drop embedding-only models (they 500 on /api/chat); appended to getAvailableModels and selectable in config / -m ollama:<name>
- num_ctx defaults to 32768 (override OLLAMA_NUM_CTX), host via OLLAMA_HOST
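A rough sketch of how the request body and the NDJSON stream handling could look. This is not the code from the PR: `buildChatBody`, `createNdjsonReader`, `ChatChunk`, and `ChatResult` are hypothetical names, and only the JSON field names (`options.num_ctx`, `message.content`, `done`, `prompt_eval_count`, `eval_count`) are taken from Ollama's `/api/chat` API.

```typescript
// Hypothetical helper names; only the JSON field names follow Ollama's /api/chat.
interface ChatChunk {
  message?: { role: string, content: string }
  done?: boolean
  prompt_eval_count?: number
  eval_count?: number
}

interface ChatResult {
  text: string
  inputTokens?: number
  outputTokens?: number
}

// Build the POST body. num_ctx is passed under `options`, which is the
// setting the OpenAI-compatible path cannot reach.
function buildChatBody(model: string, prompt: string, numCtx = 32768) {
  return {
    model,
    messages: [{ role: 'user', content: prompt }],
    stream: true,
    options: { num_ctx: numCtx },
  }
}

// Incremental NDJSON reader: network chunks may split a JSON line in two,
// so incomplete tails are buffered until the next push.
function createNdjsonReader(onProgress: (delta: string) => void) {
  let buffer = ''
  const result: ChatResult = { text: '' }

  function handleLine(line: string): void {
    if (!line.trim())
      return
    const chunk = JSON.parse(line) as ChatChunk
    const delta = chunk.message?.content ?? ''
    if (delta) {
      result.text += delta
      onProgress(delta)
    }
    // The final object carries the token counts.
    if (chunk.done) {
      result.inputTokens = chunk.prompt_eval_count
      result.outputTokens = chunk.eval_count
    }
  }

  return {
    push(text: string): void {
      buffer += text
      const lines = buffer.split('\n')
      buffer = lines.pop() ?? '' // keep the unfinished tail
      lines.forEach(handleLine)
    },
    finish(): ChatResult {
      handleLine(buffer)
      buffer = ''
      return result
    },
  }
}
```

In a real executor the strings passed to `push` would presumably come from decoding the `fetch` response body with a `TextDecoder`; that wiring is left out here so the sketch stays free of network calls.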
🔗 Linked issue
No linked issue.
❓ Type of change
📚 Description
Skilld could only generate skills through a CLI (claude/gemini/codex) or a pi-ai API key. This adds `ollama:<name>` as a third backend: one-shot local completions via Ollama's `/api/chat`, so generation is free, offline, and needs no key.

pi-ai has no Ollama provider, and its OpenAI-compatible path can't set `num_ctx`. That matters because Ollama defaults to a ~4k context and silently truncates our ~16k-token synthesis prompts, so the executor calls `/api/chat` directly and sets `num_ctx` to 32768 (override with `OLLAMA_NUM_CTX`, host with `OLLAMA_HOST`).

Discovery lists locally-pulled models via `/api/tags`, then filters out embedding-only models through `/api/show` since those 500 on `/api/chat`. Discovered models show up in `skilld config` and `-m ollama:<name>`. The run streams NDJSON deltas to the progress UI and captures token counts.

Verified against a live Ollama: discovery listed local models, `embeddinggemma` was filtered out, and a streamed completion returned correct text with token usage. Typecheck and lint clean, 866 unit tests pass.
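For reviewers unfamiliar with the discovery flow, here is a minimal sketch of the tags-then-show filtering, not the PR's actual implementation. `discoverChatModels` and `FetchLike` are hypothetical names, the fetch function is injected so the logic can be exercised without a running server, and it assumes `/api/show` reports a `capabilities` array. Keeping models when that array is absent (older servers) is also an assumption.

```typescript
// Hypothetical names; a minimal fetch-like signature so a stub can be injected.
type FetchLike = (
  url: string,
  init?: { method?: string, headers?: Record<string, string>, body?: string },
) => Promise<{ ok: boolean, json: () => Promise<unknown> }>

async function discoverChatModels(host: string, fetchFn: FetchLike): Promise<string[]> {
  // Step 1: list locally pulled models.
  const tagsRes = await fetchFn(`${host}/api/tags`)
  if (!tagsRes.ok)
    return []
  const tags = await tagsRes.json() as { models?: { name: string }[] }
  const names = (tags.models ?? []).map(m => m.name)

  // Step 2: ask /api/show about each model and keep chat-capable ones.
  const keep = await Promise.all(names.map(async (name) => {
    const res = await fetchFn(`${host}/api/show`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: name }),
    })
    if (!res.ok)
      return false
    const info = await res.json() as { capabilities?: string[] }
    // Assumption: if the server does not report capabilities, keep the model.
    return info.capabilities ? info.capabilities.includes('completion') : true
  }))

  // Prefix with `ollama:` so the ids match the `-m ollama:<name>` form.
  return names.filter((_, i) => keep[i]).map(name => `ollama:${name}`)
}
```

Against a live server the second argument could simply be the global `fetch`, with the host read from `OLLAMA_HOST`.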