Note
Status: Public Beta. AetherForge runs entirely on-device (Apple Silicon optimised). Zero cloud dependencies. Zero data exfiltration.
AetherForge is a Sovereign Intelligence Layer: a desktop-native AI system that learns, reasons, and calculates entirely on your hardware with no internet required. It is designed for professionals in high-stakes domains where cloud dependency and data leakage are unacceptable risks.
Unlike standard RAG frameworks, AetherForge implements:
- Native Apple Silicon Inference: zero-copy UMA access via MLX for Gemma 4.
- Stateless Causal Observability: enforced structured tool selection via GBNF grammars.
- Closed-Loop Perpetual Learning: learns from interactions without forgetting previous knowledge using orthogonal projections.
- Deterministic Calculation: numeric queries bypass the LLM and use verified table interpolation.
- Glass-Box Reasoning: every decision is auditable with full reasoning traces exposed.
- Air-Gapped Security: encrypted storage, local telemetry, no external API calls.
AetherForge addresses the "Black Box" problem by exposing internal reasoning traces in real time. Every decision is auditable, traceable, and governed by deterministic policies. The system enforces a hard rule: LLMs explain; they never calculate.
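The "LLMs explain; they never calculate" rule can be illustrated with a minimal sketch of table interpolation done in plain code. The function name `interpolate` and the sample table are assumptions made for illustration; this README does not describe the actual lookup implementation.

```python
from bisect import bisect_right


def interpolate(table, x):
    """Linearly interpolate y at x from (x, y) rows sorted by ascending x."""
    xs = [row[0] for row in table]
    if not xs[0] <= x <= xs[-1]:
        # Refuse to extrapolate: a deterministic path should fail loudly.
        raise ValueError(f"{x} is outside the table range [{xs[0]}, {xs[-1]}]")
    i = bisect_right(xs, x)
    if i == len(xs):  # x equals the last breakpoint
        return table[-1][1]
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Illustrative data only, not taken from any real reference table.
TABLE = [(0.0, 10.0), (10.0, 20.0), (20.0, 40.0)]
value = interpolate(TABLE, 15.0)  # halfway between 20.0 and 40.0
```

In a design like this, the LLM would only receive `value` and be asked to explain it, never to compute it.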
| Layer | Technology | Purpose |
|---|---|---|
| Desktop Shell | Tauri 2.1 (Rust) | Native desktop app with IPC bridge to the backend |
| Frontend | React 18 + TypeScript 5.5 | HUD interface with X-Ray tracing, ThinkingBlock |
| Backend | Python 3.12 + FastAPI | REST API, async document processing, dependency injection |
| Orchestration | ForensicOrchestrator | Stateless modular agent execution with GBNF grammar constraints |
| LLM Inference | Apple MLX (MLXEngine) | Native Gemma 4 E4B-IT 4-bit inference via zero-copy UMA |
| Vector Store | RuVector GNN-HNSW | Graph Neural Network + Hierarchical Navigable Small World index |
| Sparse Search | SQLite FTS5 | BM25 keyword search for hybrid retrieval |
| Reasoning & Synthesis | InsightForge & RCA | TF-IDF/DBSCAN novelty synthesis, 5-Whys causal reasoning |
| Learning | BitNetTrainer + OPLoRAManager | Orthogonal Projection LoRA on BitNet architectures + SONA |
| Guardrails | Silicon Colosseum + SAMR-lite | OPA/Rego deterministic policies, local semantic faithfulness scorer (threshold 0.55) |
| Telemetry | LangfuseExporter | Local-only observability (localhost:3000 via Docker) |
| Encryption | SQLCipher | AES-256 encrypted session storage |
graph TD
%% Frontend Layer
subgraph "Visual Interface (HUD)"
UI[Tauri/React/Vite]
TuneLab[TuneLab: Learning Monitor]
TraceHUD[X-Ray: Causal Trace]
end
%% Orchestration & Reasoning Layer
subgraph "Cognitive Engine"
Orchestrator[ForensicOrchestrator]
Router[QueryRouter: Intent Classifier]
RagForge[CognitiveRAG™ Pipeline]
Insight[InsightForge: Novelty Synthesis]
RCA[RootCauseAgent: 5-Whys]
end
%% Inference & Learning Layer
subgraph "Inference & Learning (Apple Silicon)"
MLX[MLXEngine: Gemma 4 Native UMA]
BitNet[BitNetTrainer: OPLoRA Weights]
SONA[SONA: 3-Tier Real-Time Learning]
RB[Replay Buffer: Parquet/Fernet]
end
%% Trust, Governance & Telemetry Layer
subgraph "Silicon Colosseum & Telemetry"
OPA[OPA: Rego Policies]
SAMR[SAMR-lite: Faithfulness Scorer]
Telemetry[LangfuseExporter: Local Telemetry]
end
%% Storage Layer
subgraph "Storage"
RuVector[RuVector GNN-HNSW]
SQLite[SQLite: FTS5 & Structured Tables]
DocReg[Document Registry + Boot-Sweep]
end
%% Communication
UI <-->|IPC / WebSockets| Orchestrator
Orchestrator --> Router
Orchestrator --> RCA
Orchestrator --> Telemetry
Router --> RagForge
Router --> Insight
RagForge --> RuVector
RagForge --> SQLite
RagForge --> SAMR
SAMR --> OPA
OPA --> RB
RB --> BitNet
SONA -->|Sub-50ms Injection| BitNet
BitNet --> MLX
AetherForge runs Gemma 4 E4B-IT-4bit via a custom MLXEngine for Apple Silicon. This enables zero-copy Unified Memory Architecture (UMA) access, entirely replacing the legacy HTTP round-trip inference. It supports sub-50ms MicroLoRA injection for real-time adaptation.
The ForensicOrchestrator replaces the earlier generic meta-agents and provides stateless, causal observability. It uses GBNF grammar constraints to force structured tool selection, so every action follows a deterministic, traceable execution path.
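As a rough sketch of grammar-constrained tool selection, the snippet below builds a small GBNF grammar (the grammar format used by llama.cpp) that only admits a JSON object naming one tool from a fixed set, plus a Python check that mirrors the same shape. The tool names and the `parse_tool_choice` helper are hypothetical; how `grammar.py` and MLXEngine actually apply the grammar is not described here.

```python
import json

# Hypothetical tool names; the real set is not listed in this README.
TOOLS = ("rag_search", "table_lookup", "root_cause")

# GBNF grammar: the only valid output is {"tool": "<one of TOOLS>"}.
TOOL_GRAMMAR = "\n".join([
    r'root ::= "{" ws "\"tool\"" ws ":" ws tool ws "}"',
    "tool ::= " + " | ".join(rf'"\"{name}\""' for name in TOOLS),
    r"ws   ::= [ \t\n]*",
])


def parse_tool_choice(raw: str) -> str:
    """Validate model output against the same shape the grammar enforces."""
    obj = json.loads(raw)
    if not isinstance(obj, dict) or set(obj) != {"tool"} or obj["tool"] not in TOOLS:
        raise ValueError(f"not a valid tool selection: {raw!r}")
    return obj["tool"]


choice = parse_tool_choice('{"tool": "table_lookup"}')
```

Checking the output again after decoding is a defensive choice: the grammar restricts what the model can emit, and the parser makes the same contract explicit for the trace.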
- InsightForge: Runs a weekly cycle that uses TF-IDF and DBSCAN for novelty detection and cross-document synthesis.
- RootCauseAgent (RCA): Implements an iterative 5-Whys causal reasoning chain for deep analytical queries.
To prevent Catastrophic Forgetting, BitNetTrainer and OPLoRAManager manage nightly learning cycles using Orthogonal Projection LoRA (OPLoRA). Gradient updates are projected onto the orthogonal complement of existing knowledge subspaces, ensuring new learning doesn't overwrite past intelligence.
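The projection step can be illustrated on plain vectors. This is a simplified sketch, not the OPLoRAManager implementation: it assumes the "existing knowledge subspace" is given as a list of spanning vectors, and the helper names `orthonormalize` and `project_out` are invented for the example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: build an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip vectors already inside the span
            basis.append([wi / norm for wi in w])
    return basis


def project_out(grad, basis):
    """Remove from `grad` every component lying in the span of `basis`."""
    g = list(grad)
    for b in basis:
        c = dot(g, b)
        g = [gi - c * bi for gi, bi in zip(g, b)]
    return g


# Illustrative subspace: the x-y plane. Only the z component should survive.
knowledge = orthonormalize([[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
update = project_out([3.0, 4.0, 5.0], knowledge)
```

Because the projected update is orthogonal to every basis vector, applying it leaves the protected directions unchanged, which is the intuition behind avoiding catastrophic forgetting.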
AetherForge rejects probabilistic safety filters in favor of deterministic alignment:
- OPA (Open Policy Agent): Rego policies enforce absolute behavioral boundaries.
- SAMR-lite: A local semantic faithfulness scorer validates responses against grounded evidence. Any response scoring below the 0.55 threshold is automatically blocked.
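The threshold gate can be sketched as follows. The scorer here is a deliberately crude stand-in (token overlap) used only to make the example runnable; SAMR-lite is described as a semantic scorer, and its real scoring method is not shown in this README. The function names are hypothetical.

```python
import os
import re


def min_faithfulness() -> float:
    """Read the threshold from the environment, defaulting to 0.55."""
    return float(os.environ.get("SILICON_COLOSSEUM_MIN_FAITHFULNESS", "0.55"))


def overlap_score(response: str, evidence: str) -> float:
    """Stand-in scorer: share of response tokens that occur in the evidence."""
    tokens = re.findall(r"\w+", response.lower())
    if not tokens:
        return 0.0
    evidence_tokens = set(re.findall(r"\w+", evidence.lower()))
    return sum(t in evidence_tokens for t in tokens) / len(tokens)


def gate(response: str, evidence: str, threshold=None):
    """Return (allowed, score); responses scoring below the threshold are blocked."""
    threshold = min_faithfulness() if threshold is None else threshold
    score = overlap_score(response, evidence)
    return score >= threshold, score


evidence = "The pump failed at 80 bar"
allowed, score = gate("The pump failed", evidence, threshold=0.55)
```

The gate itself is deterministic: given a score and a threshold, the outcome is a simple comparison that can be logged and audited.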
The system implements a mandatory "Boot-Sweep" on every startup, synchronizing the document_registry.db directly with physical disk storage in the data/ directory to prevent "ghost" documents.
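A boot-time reconciliation along these lines could look like the sketch below. The table name `documents`, its single `path` column, and the choice to register untracked files as well as delete ghosts are all assumptions; the actual schema of document_registry.db is not given here.

```python
import sqlite3
from pathlib import Path


def boot_sweep(db_path: str, data_dir: str) -> tuple[int, int]:
    """Reconcile the registry with disk; return (ghosts removed, files added)."""
    root = Path(data_dir)
    on_disk = {str(p.relative_to(root)) for p in root.rglob("*") if p.is_file()}
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS documents (path TEXT PRIMARY KEY)"
            )
            registered = {row[0] for row in conn.execute("SELECT path FROM documents")}
            ghosts = registered - on_disk      # in the registry, missing on disk
            untracked = on_disk - registered   # on disk, missing in the registry
            conn.executemany(
                "DELETE FROM documents WHERE path = ?", [(p,) for p in ghosts]
            )
            conn.executemany(
                "INSERT INTO documents (path) VALUES (?)", [(p,) for p in untracked]
            )
    finally:
        conn.close()
    return len(ghosts), len(untracked)
```

Note that this sketch treats every file under the directory as a document, so it assumes the registry database itself is stored outside the swept directory or filtered out beforehand.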
Optional local-only telemetry is provided via LangfuseExporter, designed to run entirely on localhost:3000 via Docker, ensuring zero data exfiltration while maintaining deep observability.
AtherForge/
├── src/ # Python backend (FastAPI)
│   ├── core/ # Container, Orchestrator, MLX Engine
│   │   ├── container.py # Dependency injection
│   │   ├── mlX_engine.py # Native Apple Silicon inference
│   │   ├── orchestrator.py # ForensicOrchestrator
│   │   └── grammar.py # GBNF grammar constraints
│   ├── guardrails/ # Silicon Colosseum
│   │   └── silicon_colosseum.py # OPA/Rego policy enforcement
│   ├── learning/ # Continual Learning
│   │   ├── bitnet_trainer.py # Manages OPLoRA weight updates
│   │   ├── oplora_manager.py # Orthogonal Projection LoRA
│   │   └── sona_adapter.py # SONA 3-tier real-time learning
│   ├── insights/ # Novelty Synthesis
│   │   └── insight_forge.py # TF-IDF/DBSCAN synthesis
│   ├── rca/ # Causal Reasoning
│   │   └── root_cause_agent.py # 5-Whys iterative chain
│   ├── telemetry/ # Observability
│   │   └── langfuse_exporter.py # Local Langfuse integration
│   ├── modules/ # Plugin Modules
│   │   ├── ragforge/ # CognitiveRAG pipeline
│   │   │   ├── samr_lite.py # Faithfulness scorer
│   │   │   └── ruvector_store.py # RuVector CLI bridge
│   │   └── document_registry.py # SQLite metadata + Boot-sweep
│   └── main.py # Application entry point
├── frontend/ # React/Vite/TypeScript HUD
├── crates/ # Rust components ecosystem
│   ├── ruvllm/ # Legacy GGUF inference (fallback)
│   └── ruvector-core/ # Vector storage core
├── data/ # Persistent Storage (encrypted)
└── .env # Environment configuration
- Apple Silicon Mac (M1/M2/M3)
- Python 3.12+
- Node.js 20+
- Rust toolchain (for Tauri)
- Docker (optional, for Langfuse Telemetry)
chmod +x install.sh && ./install.sh
# Full development stack (backend + frontend + Tauri)
./run_dev.sh
# Backend only
.venv/bin/python -m uvicorn src.main:app --host 127.0.0.1 --port 8765 --reload
Key environment variables (.env):
| Variable | Default | Purpose |
|---|---|---|
| MODEL_PATH | /Volumes/Apple/AI Model/gemma-4-e4b-it-4bit | Path to Gemma 4 / MLX weights |
| DATA_DIR | data/ | Persistent storage root |
| SQLCIPHER_KEY_FILE | data/.sqlcipher_key | Encryption key for sessions |
| SILICON_COLOSSEUM_MIN_FAITHFULNESS | 0.55 | Minimum faithfulness score for SAMR-lite |
| LANGFUSE_PUBLIC_KEY | (empty) | Local Langfuse public key |
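
For reference, the variables above could be read into a typed settings object roughly as follows. The `Settings` class and `load_settings` function are hypothetical names, and the sketch assumes the variables have already been exported into the process environment; it does not parse the .env file itself.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    model_path: str
    data_dir: str
    sqlcipher_key_file: str
    min_faithfulness: float
    langfuse_public_key: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        model_path=env.get("MODEL_PATH", "/Volumes/Apple/AI Model/gemma-4-e4b-it-4bit"),
        data_dir=env.get("DATA_DIR", "data/"),
        sqlcipher_key_file=env.get("SQLCIPHER_KEY_FILE", "data/.sqlcipher_key"),
        min_faithfulness=float(env.get("SILICON_COLOSSEUM_MIN_FAITHFULNESS", "0.55")),
        langfuse_public_key=env.get("LANGFUSE_PUBLIC_KEY", ""),
    )


defaults = load_settings({})  # pass a dict to test without touching os.environ
```

Converting the faithfulness threshold to a float at load time means a malformed value fails at startup rather than during a guardrail check.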
MIT License | Built for the Era of Sovereign Intelligence. Runs on your Mac. Learns from your context. Forgets nothing important. No loops, no leaks.