Local-Code-Guardian

A local, offline-first code auditing system designed for consumer GPUs. It combines incremental indexing (MD5-based change detection) with in-memory retrieval.

Hardware target

GPU: NVIDIA RTX 3060 Laptop (6GB VRAM)
RAM: 32GB

Current implementation status

FastAPI backend
- POST /audit: audits a code snippet with optional repository context (incremental RAG)
- GET /health: runtime metrics (CPU/RAM/process + GPU/VRAM when NVML is available)
Incremental indexing
- MD5 change detection
- only new/modified files are re-embedded
- manifest persisted at data/manifests/manifest.json
Vector store
- ChromaDB PersistentClient persists to disk (data/chroma/)
- runtime uses an in-memory cache (VectorCache) loaded from Chroma for low-latency retrieval
Embeddings
- sentence-transformers/all-MiniLM-L6-v2
- GPU preferred when torch.cuda.is_available()
UI
- Streamlit frontend
- shows audit report + PCA(2D) Plotly scatter
- shows live CPU/RAM/GPU usage via /health and request latency/timings
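
The MD5 change detection listed under incremental indexing can be illustrated with a minimal sketch. This is not the project's actual code: the function names (`md5_of_file`, `diff_against_manifest`) and the manifest shape (a flat mapping from path to hex digest) are assumptions made for illustration, and the real manifest.json may be laid out differently.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def diff_against_manifest(files, manifest):
    """Return (changed, new_manifest) for the given files.

    `manifest` maps path string -> MD5 hex digest from the previous run
    (hypothetical layout). A file counts as changed when it is missing
    from the manifest or its digest differs.
    """
    changed = []
    new_manifest = {}
    for path in files:
        key = str(path)
        current = md5_of_file(path)
        new_manifest[key] = current
        if manifest.get(key) != current:
            changed.append(path)
    return changed, new_manifest
```

Only the paths returned in `changed` would need to be re-embedded; `new_manifest` would then be written back to disk for the next run.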

Project layout

Local-Code-Guardian/
  backend/
    app/
      main.py
      api/
        routes_audit.py
        routes_health.py
      rag/
        embedder.py
        chroma_store.py
        retriever.py
      indexing/
        incremental_indexer.py
        manifest.py
        file_hashing.py
      analysis/
        pca.py
  frontend/
    streamlit_app.py
  data/
    chroma/
    manifests/
      manifest.json
  requirements.txt

Prerequisites

Conda environment (example name: local-code-guardian)
Python 3.10
CUDA 12.1
PyTorch 2.5.1 (CUDA build) already installed in the environment
Ollama installed (Windows supported)

Install Python dependencies

In your conda env:

python -m pip install -r requirements.txt

Note: torch is intentionally NOT pinned in requirements.txt to avoid overwriting your CUDA-enabled PyTorch install.

Download the LLM model (Ollama)

ollama pull llama3:8b-instruct-q4_K_M

You can override the model name using OLLAMA_MODEL.

Run

Start backend

python -m uvicorn backend.app.main:app --host 127.0.0.1 --port 8000

Start frontend

streamlit run frontend/streamlit_app.py

Open the Streamlit URL shown in the terminal.

Usage

In the Streamlit sidebar:
- set Backend URL (default http://localhost:8000)
- optionally set Git repo path to enable incremental indexing + retrieval
- paste code and click Audit
The UI will show:
- audit report
- timings (index/retrieve/llm/total + HTTP RTT)
- PCA scatter plot of embeddings (new/updated vectors are highlighted)
- live CPU/RAM/GPU metrics (auto-refresh)

Backend endpoints

POST /audit
- body: { "code": "...", "prompt": "...", "repo_path": "...", "top_k": 5 }
- returns: { report, retrieved, points, timings }
GET /health
- returns: { status, cpu, ram, process, gpu }
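
For scripted use without the Streamlit UI, the POST /audit body above can be sent from Python. The sketch below uses only the standard library. The helper names (`build_audit_request`, `post_audit`) are hypothetical, and it assumes that `repo_path` may be left out when no repository context is wanted, which the endpoint description suggests but does not spell out.

```python
import json
import urllib.request


def build_audit_request(code, prompt, repo_path=None, top_k=5,
                        base_url="http://127.0.0.1:8000"):
    """Build a POST /audit request carrying the documented JSON body fields."""
    payload = {"code": code, "prompt": prompt, "top_k": top_k}
    if repo_path is not None:
        # Assumption: repo_path is optional and can be omitted entirely.
        payload["repo_path"] = repo_path
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_audit(request, timeout_s=600):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running, `post_audit(build_audit_request("print(1)", "Find bugs"))` should return a dictionary with the `report`, `retrieved`, `points`, and `timings` keys listed above.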

Configuration

Environment variables:

OLLAMA_BASE_URL (default http://localhost:11434/api)
OLLAMA_MODEL (default llama3:8b-instruct-q4_K_M)
OLLAMA_TIMEOUT_S (default 600)
OLLAMA_NUM_GPU (optional, passed to Ollama options when set)
CHROMA_PERSIST_DIR (default data/chroma)
CHROMA_COLLECTION (default code)
MANIFEST_PATH (default data/manifests/manifest.json)
EMBEDDING_MODEL_NAME (default sentence-transformers/all-MiniLM-L6-v2)
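
One way these variables and their defaults could be read is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical names, not the backend's actual configuration code. Parsing OLLAMA_TIMEOUT_S as a float and OLLAMA_NUM_GPU as an integer is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    ollama_model: str
    ollama_timeout_s: float
    ollama_num_gpu: Optional[int]  # None means "do not pass the option"
    chroma_persist_dir: str
    chroma_collection: str
    manifest_path: str
    embedding_model_name: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default), applying the documented defaults."""
    env = os.environ if env is None else env
    num_gpu = env.get("OLLAMA_NUM_GPU")
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434/api"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3:8b-instruct-q4_K_M"),
        ollama_timeout_s=float(env.get("OLLAMA_TIMEOUT_S", "600")),
        # Assumption: the value is an integer; unset or empty leaves it as None.
        ollama_num_gpu=int(num_gpu) if num_gpu else None,
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "data/chroma"),
        chroma_collection=env.get("CHROMA_COLLECTION", "code"),
        manifest_path=env.get("MANIFEST_PATH", "data/manifests/manifest.json"),
        embedding_model_name=env.get(
            "EMBEDDING_MODEL_NAME", "sentence-transformers/all-MiniLM-L6-v2"
        ),
    )
```

Accepting an explicit mapping keeps the function easy to test without touching the real process environment.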

Known limitations (current)

Chroma is used as persistent storage; retrieval is performed from an in-memory cache loaded from Chroma.
Indexing currently embeds whole files as single vectors (no chunking yet).
Ollama GPU layer control is version-dependent; OLLAMA_NUM_GPU is kept as an optional knob.
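
To make the first two points concrete, here is a rough sketch of retrieval from an in-memory cache that holds one vector per file. `TinyVectorCache` is a hypothetical stand-in, not the project's `VectorCache`. It uses plain Python lists and a brute-force cosine similarity scan, and the real implementation may use a different similarity measure or data layout.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorCache:
    """Hypothetical in-memory cache: one embedding per document id."""

    def __init__(self):
        self._items = {}  # doc_id -> embedding (list of floats)

    def upsert(self, doc_id, embedding):
        """Insert or overwrite the embedding stored for doc_id."""
        self._items[doc_id] = list(embedding)

    def query(self, embedding, top_k=5):
        """Return up to top_k (doc_id, score) pairs, best match first."""
        scored = [(cosine(embedding, vec), doc_id)
                  for doc_id, vec in self._items.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:top_k]]
```

Because each file is a single vector, a query can only surface whole files. Chunking would let the same scan return finer-grained matches.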
