Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen 3.5 122B, Llama 3.3 70B, Gemma 4 31B. Private, offline, airgap-ready. Built for NDA / legal / healthcare workflows.
Reproducible vLLM recipe for shawnw3i/Huihui-Qwen3.6-27B-abliterated-AWQ-MTP on 2× RTX 3090 in a Proxmox LXC. MTP n=3, 256K context, full vision+tool-calling+reasoning. Silent 24/7 operation at 250W per card. Companion to the base-model recipe.