A virtual math professor that provides step-by-step solutions to mathematical problems using an Agentic RAG (Retrieval-Augmented Generation) architecture, with human feedback for continuous improvement.
Deployed Agent at https://bcwti9cmjtjgovljrdxxer.streamlit.app/
Documentation at https://sarthakb11.github.io/math-homework-helper/
- AI Gateway with Guardrails: Ensures all interactions are focused on mathematics, maintaining safety and privacy.
- Knowledge Base Integration: Uses a vector database (Qdrant) to store and retrieve mathematical knowledge.
- Web Search Capability: Falls back to web search when the knowledge base lacks information.
- Step-by-Step Solution Generation: Provides clear, easy-to-understand solution steps.
- Human-in-the-Loop Feedback: Learns from user feedback to improve future responses.
- Interactive Web Interface: Simple, user-friendly interface using Streamlit.
- Backend: Python (LangGraph, LangChain)
- Frontend: Streamlit
- Vector Database: Qdrant
- LLM: Google AI (Gemini-Pro)
- Search API: Tavily/Serper
- Embedding: Sentence Transformers
- HITL Framework: DSPy-ai
- Other: FastAPI, MongoDB (via pymongo), dotenv
The system uses an Agentic RAG architecture with the following components:
- AI Gateway: Entry/exit point, enforcing guardrails.
- Routing Agent: Directs queries to either the knowledge base or web search.
- Knowledge Base: Vector database (Qdrant) storing math knowledge.
- Web Search Agent: Performs targeted web searches and extracts content.
- Generation Agent: Synthesizes information into step-by-step solutions.
- Human Feedback Loop: Collects and integrates user feedback.
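As a rough illustration of the guardrail role played by the AI Gateway, the sketch below rejects empty input, input containing an email address, and questions that do not look mathematical. The keyword list, the regular expressions, and the `validate_input` name are all hypothetical; the project's actual gateway in `app/gateway/ai_gateway.py` may work quite differently (for example, by asking the LLM to classify the query).

```python
import re
from typing import Tuple

# Hypothetical vocabulary: an assumption for this sketch, not the project's list.
MATH_KEYWORDS = {
    "solve", "equation", "integral", "derivative", "matrix",
    "probability", "factor", "simplify", "proof", "theorem",
}
# Math-looking symbols, or simple arithmetic such as "2 + 2".
MATH_SYMBOLS = re.compile(r"[=^√∫∑]|\d\s*[-+*/]\s*\d")
# Very rough email pattern, used as a stand-in for a privacy check.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def validate_input(question: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a user question."""
    text = question.strip()
    if not text:
        return False, "empty question"
    if EMAIL.search(text):
        return False, "contains personal data"
    words = set(re.findall(r"[a-z]+", text.lower()))
    if words & MATH_KEYWORDS or MATH_SYMBOLS.search(text):
        return True, "ok"
    return False, "not a math question"
```

A keyword heuristic like this is cheap but easy to fool, so it is best read as a placeholder for whatever check the gateway really performs.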
math-homework-helper/
├── app.py  # Main Streamlit app entry point
├── requirements.txt  # Python dependencies
├── env.sample  # Example environment variables
├── .env  # (Not committed) Your actual environment variables
├── app/  # Core application modules (agents, kb, feedback, etc.)
├── scripts/
│   ├── init_db.py  # Script to initialize the vector DB
│   └── load_knowledge_base.py  # Script to load sample data
├── .github/workflows/
│   └── deploy.yml  # GitHub Actions workflow for deployment
└── ...
- Python 3.8+
- pip
- Docker (for Qdrant)
- Git
- Clone the repository:
    git clone https://github.com/your-username/math-homework-helper.git
    cd math-homework-helper
- Set up a Python virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install --upgrade pip
    pip install -r requirements.txt
- Configure environment variables:
    cp env.sample .env
  Edit .env with your API keys and configuration.
- Start Qdrant (Vector Database) using Docker:
    docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
- Initialize the database:
    python scripts/init_db.py
- Load sample data into the knowledge base:
    python scripts/load_knowledge_base.py
- Run the Streamlit app:
    streamlit run app.py
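Before running the app, it can help to confirm that the Qdrant container is reachable and that the initialization script created a collection. The sketch below uses only the standard library and Qdrant's REST endpoint `GET /collections` on port 6333 (the first port mapped in the Docker command). The function name `list_collections` and the `opener` parameter are hypothetical helpers for this example, and the collection name you should expect depends on your `VECTOR_DB_COLLECTION` setting.

```python
import json
import urllib.request


def list_collections(base_url="http://localhost:6333", opener=urllib.request.urlopen):
    """Return the collection names reported by Qdrant's REST API.

    `opener` is injectable so the function can be exercised without a
    running server; by default it performs a real HTTP request.
    """
    url = f"{base_url.rstrip('/')}/collections"
    with opener(url, timeout=5) as resp:
        payload = json.load(resp)
    # Qdrant wraps the answer as {"result": {"collections": [{"name": ...}]}}.
    return [c["name"] for c in payload["result"]["collections"]]
```

Calling `list_collections()` with Qdrant running should return a list that includes your collection once `scripts/init_db.py` has completed; a connection error usually means the container is not up or the port mapping differs.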
A ready-to-use GitHub Actions workflow is provided at .github/workflows/deploy.yml. On push to main, it will:
- Set up Python and dependencies
- Start Qdrant in Docker
- Run your Streamlit app
To use Qdrant Cloud, set VECTOR_DB_URL and VECTOR_DB_API_KEY in your .env or as GitHub Actions secrets.
Set these secrets in your GitHub repository (Settings → Secrets and variables → Actions):
- LLM_API_KEY
- SEARCH_API_KEY
- VECTOR_DB_URL
- VECTOR_DB_PORT
- VECTOR_DB_COLLECTION
- DB_CONNECTION_STRING
- DEBUG
- LOG_LEVEL
(Refer to env.sample for any additional secrets your app may require.)
See env.sample for all required environment variables:
- LLM_API_KEY
- VECTOR_DB_URL
- VECTOR_DB_PORT
- VECTOR_DB_COLLECTION
- SEARCH_API_KEY
- DB_CONNECTION_STRING
- DEBUG
- LOG_LEVEL
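A missing variable tends to surface as a confusing error deep inside an agent, so a fail-fast check at startup can be useful. The sketch below is one possible way to do that and is not taken from the project: `load_settings` is a hypothetical name, and treating `DEBUG` and `LOG_LEVEL` as optional with defaults is an assumption. It expects the variables to already be in the process environment, for example after the .env file has been loaded.

```python
import os
from typing import Dict, Mapping

# Variable names come from env.sample; which ones are strictly required
# is an assumption made for this sketch.
REQUIRED_VARS = [
    "LLM_API_KEY",
    "VECTOR_DB_URL",
    "VECTOR_DB_PORT",
    "VECTOR_DB_COLLECTION",
    "SEARCH_API_KEY",
    "DB_CONNECTION_STRING",
]
OPTIONAL_DEFAULTS = {"DEBUG": "false", "LOG_LEVEL": "INFO"}


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect settings from `env`, raising if a required one is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name, default in OPTIONAL_DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.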
To run tests:
    pytest
math-homework-helper/
├── app/  # Core application code
│   ├── agents/  # Agent definitions and logic
│   │   ├── generation_agent.py  # LLM-based solution generation
│   │   └── routing_agent.py  # KB/Web routing logic
│   ├── gateway/  # AI Gateway implementation
│   │   └── ai_gateway.py  # Input/output validation
│   ├── kb/  # Knowledge Base integration
│   │   └── vector_db.py  # Vector database connector
│   ├── web_search/  # Web Search and extraction logic
│   │   └── search_agent.py  # Web search and content extraction
│   ├── feedback/  # Human-in-the-Loop feedback mechanism
│   │   └── feedback_loop.py  # Feedback collection and processing
│   └── models/  # Data models and schemas
│       └── database.py  # Database models
├── scripts/  # Utility scripts
│   ├── init_db.py  # Initialize database tables
│   └── load_knowledge_base.py  # Load sample data into KB
├── data/  # Knowledge Base data and schemas
│   ├── kb_data/  # Custom KB data files
│   └── feedback_logs/  # Feedback logs
├── Instructions/  # Project documentation
├── .env  # Environment variables (not versioned)
├── env.sample  # Sample environment variables
├── requirements.txt  # Python dependencies
└── app.py  # Main application entry point
- User submits a math question through the UI
- AI Gateway validates the input
- Routing Agent checks the Knowledge Base for relevant information
- If KB has good matches, the solution is generated from KB content
- If KB lacks information, Web Search is performed to find solutions
- Generation Agent creates a step-by-step solution
- User sees the solution and can provide feedback (helpful/needs improvement)
- Feedback is logged and used to improve future responses
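The routing and generation steps above can be sketched as follows. This is a simplified illustration, not the project's LangGraph implementation: the `Hit` type, the `route` and `answer` functions, and the 0.75 similarity threshold are all assumptions, and the knowledge base search, web search, and generator are passed in as plain callables so the control flow is visible on its own. Gateway validation and feedback logging are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Hit:
    """One knowledge-base match with its similarity score."""
    text: str
    score: float


# Hypothetical cut-off for what counts as a "good match".
KB_SCORE_THRESHOLD = 0.75


def route(
    question: str,
    kb_search: Callable[[str], List[Hit]],
    web_search: Callable[[str], List[str]],
    threshold: float = KB_SCORE_THRESHOLD,
) -> Tuple[str, List[str]]:
    """Prefer knowledge-base passages; fall back to web search otherwise."""
    good_hits = [h for h in kb_search(question) if h.score >= threshold]
    if good_hits:
        return "knowledge_base", [h.text for h in good_hits]
    return "web_search", web_search(question)


def answer(
    question: str,
    kb_search: Callable[[str], List[Hit]],
    web_search: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> Dict[str, str]:
    """Route the question, then generate a solution from the chosen context."""
    source, context = route(question, kb_search, web_search)
    return {
        "question": question,
        "source": source,
        "solution": generate(question, context),
    }
```

Recording the `source` alongside the solution makes it possible to tell later whether poorly rated answers came from the knowledge base or from the web.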
We welcome contributions! Please see the development workflow in the documentation folder.
This project is licensed under the MIT License.
- Built using LangChain and LangGraph frameworks
- Uses Qdrant for vector storage
- Powered by Google AI's Gemini models