pharmanet-org/chatbot

PharmaNet Support Chatbot

A fully local RAG (Retrieval-Augmented Generation) chatbot that answers questions from PharmaNet's documentation. It requires no OpenAI account, no LangChain, and no API keys: embeddings come from ChromaDB's built-in ONNX model (all-MiniLM-L6-v2, 384-dim).

Tech Stack

| Component | Version / Details |
|---|---|
| Python | 3.11-slim (Docker) |
| FastAPI | Latest (from requirements.txt) |
| Uvicorn | Latest, with standard extras |
| Vector Database | ChromaDB (built-in ONNX embeddings) |
| Embedding Model | all-MiniLM-L6-v2 (384-dim, local, no API key) |
| Framework | None (no LangChain, no LlamaIndex; bare ChromaDB) |
| Testing | pytest, pytest-asyncio, pytest-cov, httpx |

Architecture

┌──────────────┐     ┌──────────────────┐     ┌──────────────────────┐
│  FastAPI app  │────▶│  RAGEngine       │────▶│  ChromaDB            │
│  src/main.py  │◀────│  src/rag.py      │◀────│  (persistent, local) │
└──────────────┘     └──────────────────┘     └──────────────────────┘
                            │                           │
                            ▼                           ▼
                   knowledge_base.json         all-MiniLM-L6-v2
                   (107 chunks, 16 docs)       (ONNX, 384-dim)
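
The flow in the diagram can be sketched roughly as follows. This is an illustration only, not the contents of src/rag.py: the collection name `pharmanet_docs`, the metadata keys `title` and `source`, and both function names are assumptions. The ChromaDB calls (`PersistentClient`, `get_or_create_collection`, `query`) are part of the library's public API, and the answer layout imitates the sample /query response shown under API Endpoints.

```python
from typing import Any


def retrieve(question: str, persist_dir: str = "./chroma_db", k: int = 3) -> dict[str, Any]:
    """Embed the question and fetch the k nearest chunks from ChromaDB."""
    import chromadb  # imported lazily so format_answer works without it installed

    client = chromadb.PersistentClient(path=persist_dir)
    # With no embedding function passed, ChromaDB falls back to its
    # built-in ONNX all-MiniLM-L6-v2 model.
    collection = client.get_or_create_collection(name="pharmanet_docs")  # hypothetical name
    return collection.query(query_texts=[question], n_results=k)


def format_answer(results: dict[str, Any]) -> dict[str, Any]:
    """Join retrieved chunks into one answer and list each source once."""
    documents = results["documents"][0]  # query results are nested: one list per query
    metadatas = results["metadatas"][0]
    sections, sources = [], []
    for text, meta in zip(documents, metadatas):
        title = meta.get("title", "Untitled")  # assumed metadata key
        sections.append(f"**{title}**\n\n{text}")
        source = meta.get("source")  # assumed metadata key
        if source and source not in sources:
            sources.append(source)
    return {"answer": "\n\n---\n\n".join(sections), "sources": sources}
```

Calling `format_answer(retrieve("How do sellers register?"))` would return a dictionary with the same `answer` and `sources` keys as the API response, provided the collection has already been populated.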

Project Structure

chatbot/
├── src/
│   ├── main.py           # FastAPI app — 6 endpoints
│   ├── rag.py            # RAGEngine — query, load, and retrieval logic
│   └── embeddings.py     # ChromaDB persistent client setup
│
├── scripts/
│   ├── ingest_docs.py    # Parse .mdx files → JSON knowledge base
│   ├── ingest_docs.sh    # Shell wrapper for ingestion
│   ├── test_chatbot.py   # API smoke test script
│   └── test_chatbot.sh   # Shell wrapper for smoke tests
│
├── tests/
│   ├── __init__.py
│   └── test_rag.py       # pytest unit tests for RAGEngine
│
├── knowledge_base.json   # Pre-built: 107 chunks from 16 MDX docs
├── requirements.txt
├── Dockerfile
├── docker-compose.yml
└── .env.example

Features

  • RAG Question Answering — Ask questions about PharmaNet, get answers with source citations
  • Fully Local — No external API calls, no internet required, everything runs on-device
  • Persistent Vector Store — ChromaDB persists to disk; knowledge survives restarts
  • Document Management — Add individual documents or reload the entire knowledge base at runtime
  • Health Monitoring — Endpoint returns status, engine type, and document count
  • Knowledge Base Ingestion — Script to parse Mintlify MDX docs into the knowledge base
  • Configurable Chunking — Adjustable chunk size and overlap for document splitting

API Endpoints

| Method | Path | Description |
|---|---|---|
| GET | / | Root info |
| GET | /health | Health check + document count |
| POST | /query | Ask a question (returns answer + sources) |
| POST | /documents | Add a single document to the KB |
| POST | /reload | Reload knowledge base from JSON |
| GET | /stats | Stats + engine information |

/query

// Request
{ "question": "How do sellers register?", "max_tokens": 500 }

// Response
{
  "answer": "**Seller Registration**\n\nSteps to register...\n\n---\n\n**Documents Required**...",
  "sources": ["pharmacy/registration.mdx", "authentication.mdx"]
}
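
For callers that prefer Python over curl, the request above could be sent with the standard library alone. This is a minimal sketch: the function names are hypothetical, and the base URL assumes the server is running locally on port 8000 as described under How to Run.

```python
import json
import urllib.request


def build_query_request(question: str, max_tokens: int = 500,
                        base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /query request without sending it."""
    body = json.dumps({"question": question, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, **kwargs) -> dict:
    """Send the question to a running server and return the parsed JSON reply."""
    request = build_query_request(question, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server up, `ask("How do sellers register?")` should return a dictionary whose `answer` and `sources` keys match the response shown above.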

Allowed Users

| User Type | Access |
|---|---|
| Anyone | All endpoints are public (no auth) |
| Internal apps | Integrated via the Flutter mobile app's chatbot_api.dart |

Hardcoded Credentials (Test Purposes Only)

No hardcoded secrets. The chatbot uses purely local embeddings and requires no API keys. All configuration is via environment variables (see .env.example).

Prerequisites

  • Python 3.11+
  • pip
  • Docker with Docker Compose (only for the "With Docker" option)

How to Run

Without Docker

# 1. Navigate to the project
cd chatbot

# 2. Create environment file
cp .env.example .env

# 3. Install dependencies
pip install -r requirements.txt

# 4. Start the server
uvicorn src.main:app --reload

The API will be available at http://localhost:8000.

With Docker

cd chatbot
docker compose up --build

Available Commands

# Run tests
pytest tests/ -v

# Run with coverage
pytest --cov=src tests/ -v

# Health check
curl http://localhost:8000/health

# Ask a question
curl -X POST http://localhost:8000/query \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I create an account?"}'

# Rebuild knowledge base from guide docs
bash scripts/ingest_docs.sh --docs-dir /path/to/pharmanet-guide-docs --verbose

# API smoke test
bash scripts/test_chatbot.sh
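
When these commands are scripted, it can help to wait for the server before sending queries. The helper below is a hypothetical sketch, not part of the repository: it treats any HTTP 200 from /health as healthy and makes no assumption about the fields in the response body.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def health_ok(url: str = "http://localhost:8000/health") -> bool:
    """Return True if GET /health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_healthy(check: Callable[[], bool] = health_ok,
                       attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll the health check until it passes or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # pause between attempts, but not after the last one
    return False
```

The `check` parameter is injectable so the retry loop can be exercised without a running server.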

Configuration

Environment Variables (.env)

| Variable | Default | Description |
|---|---|---|
| CHUNK_SIZE | 500 | Character chunk size for document splitting |
| CHUNK_OVERLAP | 50 | Chunk overlap in characters |
| KNOWLEDGE_BASE_PATH | ./knowledge_base.json | Path to the KB JSON file |
| CHROMA_PERSIST_DIR | ./chroma_db | ChromaDB persistent storage directory |
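
To show how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal sliding-window splitter. It is an assumption about the general technique, not the project's actual splitter, which may also respect sentence or heading boundaries; the function names are hypothetical.

```python
import os


def load_chunk_settings() -> tuple[int, int]:
    """Read CHUNK_SIZE and CHUNK_OVERLAP, falling back to the documented defaults."""
    size = int(os.environ.get("CHUNK_SIZE", "500"))
    overlap = int(os.environ.get("CHUNK_OVERLAP", "50"))
    if not 0 <= overlap < size:
        raise ValueError("CHUNK_OVERLAP must be non-negative and smaller than CHUNK_SIZE")
    return size, overlap


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each window starts this many characters after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each chunk starts 450 characters after the previous one, so the last 50 characters of one chunk reappear at the start of the next. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to store.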

Knowledge Base

The pre-built knowledge_base.json contains 107 chunks extracted from the 16 MDX documents in pharmanet-guide-docs/. To rebuild it:

bash scripts/ingest_docs.sh --docs-dir /path/to/pharmanet-guide-docs --verbose
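
As a rough picture of what such an ingestion step involves, the sketch below reads .mdx files, separates the YAML front matter from the body, and writes the result as JSON. It is not the code in scripts/ingest_docs.py: the record keys (`title`, `source`, `content`) and function names are assumptions, it emits one record per file rather than per chunk, and it does not strip MDX components.

```python
import json
import re
from pathlib import Path

# Front matter is a block delimited by `---` lines at the very top of the file.
FRONT_MATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*\n", re.DOTALL)


def parse_mdx(raw: str) -> tuple[str, str]:
    """Return (title, body) from MDX text with optional YAML front matter."""
    title = "Untitled"
    match = FRONT_MATTER.match(raw)
    if match:
        # Look for a `title:` line inside the front matter block.
        title_line = re.search(r"^title:\s*(.+)$", match.group(1), re.MULTILINE)
        if title_line:
            title = title_line.group(1).strip().strip("\"'")
        raw = raw[match.end():]
    return title, raw.strip()


def build_records(docs_dir: str) -> list[dict]:
    """Walk docs_dir for .mdx files and build one record per file."""
    root = Path(docs_dir)
    records = []
    for path in sorted(root.rglob("*.mdx")):
        title, body = parse_mdx(path.read_text(encoding="utf-8"))
        records.append({
            "title": title,
            "source": path.relative_to(root).as_posix(),
            "content": body,
        })
    return records


def write_knowledge_base(docs_dir: str, out_path: str = "./knowledge_base.json") -> int:
    """Write all records as JSON and return how many were written."""
    records = build_records(docs_dir)
    Path(out_path).write_text(
        json.dumps(records, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return len(records)
```

Storing the path relative to the docs directory yields source names in the same form as those in the sample /query response, such as pharmacy/registration.mdx.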

License

Proprietary — PharmaNet, Alyah Software © 2026
