joker4002/LocalCodeGuardian

Local-Code-Guardian

English | 简体中文 | Français

A local, offline-first code auditing system with incremental indexing (MD5-based) and in-memory retrieval, designed for consumer GPUs.

Hardware target

  • GPU: NVIDIA RTX 3060 Laptop (6GB VRAM)
  • RAM: 32GB

Current implementation status

  • FastAPI backend
    • POST /audit: audits a code snippet with optional repository context (incremental RAG)
    • GET /health: runtime metrics (CPU/RAM/process + GPU/VRAM when NVML is available)
  • Incremental indexing
    • MD5 change detection
    • only new/modified files are re-embedded
    • manifest persisted at data/manifests/manifest.json
  • Vector store
    • ChromaDB PersistentClient persists to disk (data/chroma/)
    • runtime uses an in-memory cache (VectorCache) loaded from Chroma for low-latency retrieval
  • Embeddings
    • sentence-transformers/all-MiniLM-L6-v2
    • GPU preferred when torch.cuda.is_available()
  • UI
    • Streamlit frontend
    • shows audit report + PCA(2D) Plotly scatter
    • shows live CPU/RAM/GPU usage via /health and request latency/timings
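The MD5 change detection listed above could be sketched roughly as follows. This is a minimal illustration, not the code in incremental_indexer.py: the function names, the `.py` suffix filter, and the manifest format (a JSON object mapping relative paths to hex digests) are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    # MD5 is used here only for change detection, not for security.
    digest = hashlib.md5(usedforsecurity=False)
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def load_manifest(path: Path) -> dict[str, str]:
    """Load the stored path -> digest mapping, or an empty one on first run."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def save_manifest(path: Path, manifest: dict[str, str]) -> None:
    """Persist the mapping, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8")


def diff_against_manifest(
    repo_root: Path,
    manifest: dict[str, str],
    suffixes: tuple[str, ...] = (".py",),  # assumed filter, for illustration only
):
    """Compare current file hashes with a stored manifest.

    Returns (changed, removed, current): `changed` lists new or modified
    files that would need re-embedding, `removed` lists files that are in
    the manifest but no longer on disk, `current` is the fresh mapping.
    """
    current: dict[str, str] = {}
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            current[path.relative_to(repo_root).as_posix()] = md5_of_file(path)
    changed = [p for p, digest in current.items() if manifest.get(p) != digest]
    removed = [p for p in manifest if p not in current]
    return changed, removed, current
```

In this sketch only the paths in `changed` would be passed to the embedder, and the manifest would be saved after the vectors are stored so that a failed run is retried on the next audit.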

Project layout

Local-Code-Guardian/
  backend/
    app/
      main.py
      api/
        routes_audit.py
        routes_health.py
      rag/
        embedder.py
        chroma_store.py
        retriever.py
      indexing/
        incremental_indexer.py
        manifest.py
        file_hashing.py
      analysis/
        pca.py
  frontend/
    streamlit_app.py
  data/
    chroma/
    manifests/
      manifest.json
  requirements.txt

Prerequisites

  • Conda environment (example name: local-code-guardian)
  • Python 3.10
  • CUDA 12.1
  • PyTorch 2.5.1 (CUDA build) already installed in the environment
  • Ollama installed (Windows supported)

Install Python dependencies

In your conda env:

python -m pip install -r requirements.txt

Note: torch is intentionally NOT pinned in requirements.txt to avoid overwriting your CUDA-enabled PyTorch install.

Download the LLM model (Ollama)

ollama pull llama3:8b-instruct-q4_K_M

You can override the model name with the OLLAMA_MODEL environment variable.

Run

Start backend

python -m uvicorn backend.app.main:app --host 127.0.0.1 --port 8000

Start frontend

streamlit run frontend/streamlit_app.py

Open the Streamlit URL shown in the terminal.

Usage

  • In the Streamlit sidebar:
    • set Backend URL (default http://localhost:8000)
    • optionally set Git repo path to enable incremental indexing + retrieval
    • paste code and click Audit
  • The UI will show:
    • audit report
    • timings (index/retrieve/llm/total + HTTP RTT)
    • PCA scatter plot of embeddings (new/updated vectors are highlighted)
    • live CPU/RAM/GPU metrics (auto-refresh)

Backend endpoints

  • POST /audit
    • body: { "code": "...", "prompt": "...", "repo_path": "...", "top_k": 5 }
    • returns: { report, retrieved, points, timings }
  • GET /health
    • returns: { status, cpu, ram, process, gpu }
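A client call to POST /audit might look like the sketch below, using only the standard library. The helper names are hypothetical, and two details are assumptions rather than documented behavior: that `repo_path` may be omitted when no repository context is wanted, and that the response body is JSON.

```python
import json
import urllib.request


def build_audit_request(base_url, code, prompt="", repo_path=None, top_k=5):
    """Build (but do not send) a POST /audit request with a JSON body."""
    payload = {"code": code, "prompt": prompt, "top_k": top_k}
    if repo_path:
        # Assumption: omitting repo_path skips indexing and retrieval.
        payload["repo_path"] = repo_path
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_audit(request, timeout_s=600.0):
    """Send the request and decode the JSON response.

    Per the README, the response carries report, retrieved, points, timings.
    """
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_audit_request("http://localhost:8000", "print('hello')")
    result = send_audit(req)  # requires the backend to be running
    print(result["report"])
```

The long default timeout mirrors OLLAMA_TIMEOUT_S, since an audit waits on local LLM generation.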

Configuration

Environment variables:

  • OLLAMA_BASE_URL (default http://localhost:11434/api)
  • OLLAMA_MODEL (default llama3:8b-instruct-q4_K_M)
  • OLLAMA_TIMEOUT_S (default 600)
  • OLLAMA_NUM_GPU (optional, passed to Ollama options when set)
  • CHROMA_PERSIST_DIR (default data/chroma)
  • CHROMA_COLLECTION (default code)
  • MANIFEST_PATH (default data/manifests/manifest.json)
  • EMBEDDING_MODEL_NAME (default sentence-transformers/all-MiniLM-L6-v2)
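One way to read these variables with their defaults is sketched below. The `Settings` class and `load_settings` function are hypothetical names, and parsing OLLAMA_TIMEOUT_S as a float and OLLAMA_NUM_GPU as an integer is an assumption about how the backend interprets them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ollama_base_url: str
    ollama_model: str
    ollama_timeout_s: float
    ollama_num_gpu: Optional[int]  # None means "do not pass the option"
    chroma_persist_dir: str
    chroma_collection: str
    manifest_path: str
    embedding_model_name: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read configuration from a mapping, falling back to the README defaults."""
    num_gpu = env.get("OLLAMA_NUM_GPU")
    return Settings(
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434/api"),
        ollama_model=env.get("OLLAMA_MODEL", "llama3:8b-instruct-q4_K_M"),
        ollama_timeout_s=float(env.get("OLLAMA_TIMEOUT_S", "600")),
        ollama_num_gpu=int(num_gpu) if num_gpu else None,
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "data/chroma"),
        chroma_collection=env.get("CHROMA_COLLECTION", "code"),
        manifest_path=env.get("MANIFEST_PATH", "data/manifests/manifest.json"),
        embedding_model_name=env.get(
            "EMBEDDING_MODEL_NAME", "sentence-transformers/all-MiniLM-L6-v2"
        ),
    )
```

Taking the environment as a parameter keeps the function easy to exercise with a plain dict instead of mutating `os.environ`.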

Known limitations (current)

  • Chroma is used as persistent storage; retrieval is performed from an in-memory cache loaded from Chroma.
  • Indexing currently embeds each file as a single vector (no chunking yet).
  • Ollama GPU layer control is version-dependent; OLLAMA_NUM_GPU is kept as an optional knob.
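To illustrate the first limitation, retrieval from an in-memory cache with one vector per file can be as simple as a brute-force cosine-similarity scan. The class below is a hypothetical stand-in written in pure Python for clarity; it is not the project's VectorCache, and cosine similarity is an assumed metric.

```python
import math


def _unit(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class InMemoryVectorCache:
    """Toy in-memory store: one normalized vector per document id."""

    def __init__(self):
        self._vectors = {}

    def upsert(self, doc_id, vector):
        # Re-embedding a modified file simply overwrites its old vector.
        self._vectors[doc_id] = _unit(vector)

    def delete(self, doc_id):
        self._vectors.pop(doc_id, None)

    def query(self, vector, top_k=5):
        """Return up to top_k (doc_id, cosine_similarity) pairs, best first."""
        q = _unit(vector)
        scored = [
            (doc_id, sum(a * b for a, b in zip(q, v)))
            for doc_id, v in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]
```

A linear scan like this stays cheap while the index holds one vector per file; a chunked index would multiply the number of vectors and make the scan cost more noticeable.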
