A lightweight CLI that runs a LiteLLM proxy and wires Claude Code to it.
LiteLLM is launched on-demand via uvx (no global installation needed).
- 🚀 Zero-config start: run `trifle-proxy start` and Claude Code is instantly wired to your local proxy
- 🔌 Multi-provider: OpenAI, Anthropic, Moonshot, DeepSeek, Google Gemini, and more via LiteLLM
- 🏠 Per-project or global: wire Claude Code globally, per-project via direnv, or via shell env
- ⚙️ Interactive init: `trifle-proxy init` walks you through creating `litellm.yaml`
- 🪟 Cross-platform: works on macOS, Linux, and Windows
# Install trifle-proxy
pipx install trifle-proxy
# or
uv tool install trifle-proxy
# Create config
trifle-proxy init
# Set API keys
export MOONSHOT_API_KEY=your_key
export LITELLM_MASTER_KEY=sk-local-claude-code
# Start proxy + wire Claude Code
trifle-proxy start
# Done! Now run `claude` anywhere and it will use your local proxy
# Stop when done
trifle-proxy stop

pipx install trifle-proxy

uv tool install trifle-proxy

pipx install git+https://github.com/madlexa/trifle-proxy.git

trifle-proxy --help # Show help
trifle-proxy --version # Show version
trifle-proxy init # Interactive config wizard
trifle-proxy init --template # Create from template without questions
trifle-proxy start # Start proxy + wire Claude Code globally
trifle-proxy start --mode envrc # Wire via .envrc (direnv) — per-project
trifle-proxy start --mode shell # Wire via ~/.zshenv — global shell env
trifle-proxy stop # Stop proxy (graceful) + unwire everything
trifle-proxy status # Check proxy status
trifle-proxy health # Probe health; auto-rollback if crashed
trifle-proxy logs # Tail proxy logs
trifle-proxy metrics # Show Prometheus metrics
trifle-proxy validate # Validate litellm.yaml

These apply to any command:
-v, --verbose # Debug-level logging
--log-level {debug,info,warning,error,critical}
--log-json # Emit logs as JSON lines

trifle-proxy health # status: healthy | unhealthy | stopped
trifle-proxy metrics --local # render trifle-proxy's in-process registry
trifle-proxy metrics --url URL # scrape a Prometheus endpoint (e.g. LiteLLM's /metrics)

trifle-proxy tracks process-lifecycle counters in an in-process registry
(`--local`). Live per-request metrics come from LiteLLM itself: point
`--url` at LiteLLM's /metrics endpoint to scrape them.
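If you want to post-process scraped metrics yourself, the Prometheus text format is simple enough to read with a few lines of Python. The sketch below is only an illustration: the function name and the metric names in `SAMPLE` are made up, and it assumes sample lines carry no trailing timestamp.

```python
def parse_prometheus_text(text: str) -> dict[str, float]:
    """Parse 'series value' sample lines into a dict keyed by series name.

    Simplification: lines with a trailing timestamp are not handled.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, _, value = line.rpartition(" ")
        if not series:
            continue
        try:
            samples[series] = float(value)
        except ValueError:
            continue  # Ignore lines whose value is not numeric.
    return samples


# Hypothetical metric names, for illustration only.
SAMPLE = """\
# HELP example_requests_total Hypothetical counter.
# TYPE example_requests_total counter
example_requests_total{model="kimi-k2.5"} 12
example_requests_total{model="deepseek-chat"} 3
example_up 1
"""

if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE).items():
        print(series, value)
```

For anything beyond a quick look, a real Prometheus client or server is a better fit than hand-rolled parsing.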
health combines a process-liveness check with an HTTP liveliness probe. If the
proxy process is gone but Claude Code is still wired to it, health restores
your ~/.claude/settings.json from backup so you are never left pointed at a
dead endpoint.
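The decision described above can be pictured as a small pure function. This is a sketch under assumptions, not trifle-proxy's actual code: `HealthReport` and `evaluate_health` are hypothetical names, and the mapping of inputs to the three status values is inferred from the descriptions in this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HealthReport:
    status: str      # "healthy", "unhealthy", or "stopped"
    rollback: bool   # whether settings should be restored from backup


def evaluate_health(process_alive: bool, http_ok: bool, wired: bool) -> HealthReport:
    """Combine a process-liveness check with an HTTP probe result.

    Assumed mapping (not confirmed by the project):
      process gone                -> "stopped", roll back only if still wired
      process up, probe succeeds  -> "healthy"
      process up, probe fails     -> "unhealthy" (e.g. still loading)
    """
    if not process_alive:
        return HealthReport("stopped", rollback=wired)
    if http_ok:
        return HealthReport("healthy", rollback=False)
    return HealthReport("unhealthy", rollback=False)


if __name__ == "__main__":
    # Proxy crashed while Claude Code was still pointed at it.
    print(evaluate_health(process_alive=False, http_ok=False, wired=True))
```

Keeping the decision separate from the actual process and HTTP checks makes it easy to test every combination without starting a proxy.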
trifle-proxy init creates a litellm.yaml in the current directory. Example:
model_list:
  - model_name: kimi-k2.5
    litellm_params:
      model: moonshot/kimi-k2.5
      api_key: os.environ/MOONSHOT_API_KEY
      api_base: https://api.moonshot.ai/v1
    model_info:
      claude_role: sonnet
  - model_name: deepseek-chat
    litellm_params:
      model: deepseek/deepseek-chat
      api_key: os.environ/DEEPSEEK_API_KEY
      api_base: https://api.deepseek.com
    model_info:
      claude_role: haiku
litellm_settings:
  drop_params: true
  master_key: os.environ/LITELLM_MASTER_KEY

Runtime resilience for upstream providers (retries, provider cooldown as a
form of circuit breaking, and model fallbacks) is handled by LiteLLM's
`router_settings` block, which the init template emits and LiteLLM consumes.

The separate `resilience:` section below configures trifle-proxy's own library
primitives (`proxy.start_resilient`, `resilience.CircuitBreaker`, etc.). These
are provided and unit-tested but not yet wired into the default `start`
path; see docs/architecture.md. Setting `resilience:` has no
effect on the current CLI; it is reserved for callers that invoke those
primitives directly. Any missing key falls back to its default value.
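The retry keys in the `resilience:` section (`max_attempts`, `base_delay`, `max_delay`, `multiplier`, `jitter`) read like a capped exponential backoff schedule. The sketch below shows one common interpretation. It is an assumption about the semantics, not the library's implementation: `backoff_delays` is a hypothetical helper, and `jitter` is treated here as a fraction of the delay added at random.

```python
import random
from typing import Optional


def backoff_delays(
    max_attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    multiplier: float = 2.0,
    jitter: float = 0.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Return the wait (in seconds) before each retry.

    With N attempts there are N - 1 waits. Each wait grows by `multiplier`
    and is capped at `max_delay`.
    """
    if rng is None:
        rng = random.Random()
    delays: list[float] = []
    for retry in range(max_attempts - 1):
        delay = min(base_delay * multiplier ** retry, max_delay)
        if jitter:
            # Assumed meaning: add up to `jitter` * delay of random extra wait.
            delay = min(delay + delay * jitter * rng.random(), max_delay)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    print(backoff_delays())                 # defaults from the example config
    print(backoff_delays(max_attempts=8))   # shows the max_delay cap kicking in
```

With the default values shown in the config, three attempts imply two waits, and the cap only matters once the schedule runs past 16 seconds.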
resilience:
  retry:
    max_attempts: 3
    base_delay: 0.5
    max_delay: 30.0
    multiplier: 2.0
    jitter: 0.0
  circuit_breaker:
    failure_threshold: 5
    recovery_timeout: 30.0
    success_threshold: 1
  fallback_models:
    - deepseek-chat

`model_info.claude_role` tells trifle-proxy which Claude Code model tier to map to:
| claude_role | Claude Code env var |
|---|---|
| opus | ANTHROPIC_DEFAULT_OPUS_MODEL |
| sonnet | ANTHROPIC_MODEL + ANTHROPIC_DEFAULT_SONNET_MODEL |
| haiku | ANTHROPIC_DEFAULT_HAIKU_MODEL |
| subagent | CLAUDE_CODE_SUBAGENT_MODEL |
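To make the table concrete, here is a minimal sketch of how a parsed `litellm.yaml` could be turned into Claude Code environment variables. The names `ROLE_ENV_VARS` and `env_for_models` are hypothetical, and the sketch assumes each variable is set to the entry's `model_name`; the real tool may differ.

```python
# Mapping taken from the table above.
ROLE_ENV_VARS = {
    "opus": ("ANTHROPIC_DEFAULT_OPUS_MODEL",),
    "sonnet": ("ANTHROPIC_MODEL", "ANTHROPIC_DEFAULT_SONNET_MODEL"),
    "haiku": ("ANTHROPIC_DEFAULT_HAIKU_MODEL",),
    "subagent": ("CLAUDE_CODE_SUBAGENT_MODEL",),
}


def env_for_models(config: dict) -> dict[str, str]:
    """Build env vars from a config dict shaped like the litellm.yaml example.

    Assumption: the value written to each variable is the entry's model_name.
    Entries without a recognised claude_role are skipped.
    """
    env: dict[str, str] = {}
    for entry in config.get("model_list", []):
        role = (entry.get("model_info") or {}).get("claude_role")
        for var in ROLE_ENV_VARS.get(role, ()):
            env[var] = entry["model_name"]
    return env


if __name__ == "__main__":
    example = {
        "model_list": [
            {"model_name": "kimi-k2.5", "model_info": {"claude_role": "sonnet"}},
            {"model_name": "deepseek-chat", "model_info": {"claude_role": "haiku"}},
        ]
    }
    for name, value in env_for_models(example).items():
        print(f"{name}={value}")
```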
By default, `trifle-proxy start` modifies ~/.claude/settings.json:
- Backs up your existing settings
- Injects `env.ANTHROPIC_BASE_URL` and model mappings
- Claude Code reads these on startup, so no shell env is needed
- On `stop`, the backup is restored and `env` is removed
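The backup, inject, and restore cycle in the list above can be sketched as follows. This is an illustration under assumptions rather than the tool's real code: `wire` and `unwire` are hypothetical names, the timestamp format in the backup filename is a guess, and the base URL in the usage example is a placeholder.

```python
import json
import shutil
import time
from pathlib import Path


def wire(settings: Path, backup_dir: Path, env: dict[str, str]) -> Path:
    """Back up settings.json, then merge `env` into its "env" object."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical timestamp format; the real filename layout may differ.
    backup = backup_dir / f"settings.backup.{time.time_ns()}.json"
    if settings.exists():
        shutil.copy2(settings, backup)
        data = json.loads(settings.read_text(encoding="utf-8"))
    else:
        # Simplification: treat a missing file as empty settings.
        backup.write_text("{}", encoding="utf-8")
        data = {}
    data.setdefault("env", {}).update(env)
    settings.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return backup


def unwire(settings: Path, backup: Path) -> None:
    """Restore the pre-wire settings, which also drops the injected env."""
    shutil.copy2(backup, settings)


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        settings = Path(tmp) / "settings.json"
        settings.write_text('{"theme": "dark"}', encoding="utf-8")
        # Placeholder URL; use whatever address your proxy listens on.
        backup = wire(settings, Path(tmp) / "backups",
                      {"ANTHROPIC_BASE_URL": "http://localhost:4000"})
        print(settings.read_text(encoding="utf-8"))
        unwire(settings, backup)
        print(settings.read_text(encoding="utf-8"))
```

Restoring the whole file from backup, instead of deleting individual keys, keeps the unwire step correct even if the injected keys change between versions.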
`trifle-proxy start --mode envrc` creates a .envrc file for direnv, which suits per-project isolation.
# Install direnv
brew install direnv
# Add to ~/.zshrc
eval "$(direnv hook zsh)"
# Now `cd` into your project activates the proxy env automatically

`trifle-proxy start --mode shell` appends export statements to ~/.zshenv, so the wiring applies in any new terminal window.
- Python 3.10+
- One of: `pipx`, `uv`, or `pip`
- `uv` (recommended): LiteLLM will be launched via `uvx` without global installation
- API keys for the LLM providers you want to use
export LITELLM_MASTER_KEY=sk-local-claude-code

This can be any string; it is used to authenticate local requests.
trifle-proxy validate # Check your litellm.yaml
trifle-proxy logs # See error messages

Make sure Claude Code is not running when you run `trifle-proxy start`. It reads ~/.claude/settings.json on startup, so restart Claude Code after starting the proxy.
You can confirm the proxy is reachable:
trifle-proxy health # should report status: healthy

A status of unhealthy means the process is up but not yet serving. LiteLLM can take a few seconds to load on
first run (especially when fetched via uvx). Wait and re-run `trifle-proxy health`, and check `trifle-proxy logs` for upstream errors (bad API key, wrong
api_base).
If Claude Code is still wired to a proxy that has died, run `trifle-proxy health`.
When the process is gone but the wiring remains, it restores your settings from
backup automatically. You can also force cleanup with `trifle-proxy stop`.
If the port is already in use, another proxy (or a stale process) holds it. Stop it, or start on a different port:
trifle-proxy stop
trifle-proxy start --port 4100

Install direnv and hook it into your shell:
brew install direnv
echo 'eval "$(direnv hook zsh)"' >> ~/.zshrc # then restart the shell

`trifle-proxy start --mode envrc` runs `direnv allow` for you if direnv is
installed.
Backups are written to ~/.claude/backups/settings.backup.<timestamp>.json
before each wire. If anything goes wrong, the most recent backup is your
pre-wire state.
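If you ever need to locate that backup by hand, a short script can pick it out. The sketch below is a hypothetical helper, not part of trifle-proxy. Because the exact `<timestamp>` format is not documented here, it orders files by modification time instead of parsing the filename.

```python
from pathlib import Path
from typing import Optional


def latest_backup(backup_dir: Path) -> Optional[Path]:
    """Return the most recently modified settings backup, or None if absent."""
    candidates = list(backup_dir.glob("settings.backup.*.json"))
    if not candidates:
        return None
    # Modification time is used because the filename timestamp format is unknown.
    return max(candidates, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    found = latest_backup(Path.home() / ".claude" / "backups")
    print(found if found else "no backups found")
```

Copying the returned file over ~/.claude/settings.json would restore the pre-wire state manually; `trifle-proxy stop` remains the intended way to do this.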
- Architecture — components, data flow, failure handling
- Contributing — local setup, quality gates, PR process
- Security policy — reporting vulnerabilities, security model
- Changelog — release history
make install # editable install with dev dependencies
make check # lint + typecheck + security + test (all CI gates)
make test # pytest with coverage

See CONTRIBUTING.md for details.
MIT