A recommendation system with built-in A/B testing infrastructure, statistical analysis, and KPI tracking — built from scratch in Python.
┌─────────────────────────────────────────────────────────────┐
│ FastAPI (app.py) │
└────────────┬──────────────┬───────────────┬────────────────-┘
│ │ │
┌────────▼──────┐ ┌────▼──────┐ ┌────▼──────────┐
│ recommenders │ │ ab_testing│ │ tracking │
│ ├── cf.py │ │ ├─manager │ │ interaction │
│ ├── content │ │ └─logger │ │ _tracker │
│ └── simil.. │ └───────────┘ └───────────────┘
└───────┬───────┘
│
┌───────▼───────┐ ┌──────────────┐ ┌───────────────┐
│ features/ │ │ analysis.py │ │ metrics.py │
│ item_feat.. │ │ (KPIs) │ │ (pure math) │
└───────────────┘ └──────┬───────┘ └───────────────┘
│
┌──────────▼──────────┐
│ significance.py │
│ interpretation.py │
└─────────────────────┘
Idea: users who agreed in the past will agree in the future.
- Build a weighted user-item interaction matrix (view=1, click=3, purchase=5)
- Compute pairwise cosine similarity between users
- Find the top-K nearest neighbours for the target user
- Score unseen items by
Σ similarity × neighbour_weightand return top-N - Cold-start fill: items with zero interactions from anyone are scored via content similarity to the user's liked items (discounted ×0.05) and appended after the top-N CF results — they can never be crowded out by the count limit
Strength: discovers unexpected items through shared taste communities
Weakness: cold start for new users with no interaction history at all
Idea: recommend items similar to what this user has already liked.
- Build item feature vectors: TF-IDF on description + weighted category indicator (
__cat_prefix, weight 3.0) - Pre-compute pairwise item-item cosine similarity; cache invalidates immediately when any item is added
- For the target user's liked items, look up similar items and score by
Σ item_similarity × user_interaction_weight - Cold-start fallback: users with no history receive items ranked by average similarity to all other items (most representative items in the catalog)
Strength: new items appear in results as soon as they are added; no cross-user data needed
Weakness: filter bubble — recommends more of the same
1. POST /ab_tests → create a test (control=v1, treatment=v2)
2. GET /ab_tests/{id}/assignments?user_id=X
→ deterministic hash assigns user to A or B
3. GET /recommendations/v1 or /v2
→ serve recommendations for the assigned variant
4. POST /interactions → log user actions (view / click / purchase)
5. GET /ab_tests/{id}/results
→ CTR, conversion rate, engagement time per variant
6. GET /ab_tests/{id}/statistical_analysis
→ p-values, confidence intervals, effect sizes, verdict
Assignment uses MD5("{test_id}:{user_id}") so the same user always lands in the same variant, across all servers and restarts.
| Method | Path | Description |
|---|---|---|
| POST | /interactions |
Log a user interaction |
| GET | /users/{id}/interactions |
User's interaction history |
| GET | /items/{id}/stats |
Item popularity counts |
| GET | /recommendations/v1 |
CF recommendations |
| GET | /recommendations/v2 |
Content-based recommendations |
| POST | /ab_tests |
Create a new A/B test |
| GET | /ab_tests |
List active tests |
| GET | /ab_tests/{id}/assignments |
Get or create user variant |
| GET | /ab_tests/{id}/results |
KPI metrics per variant |
| GET | /ab_tests/{id}/analysis |
Statistical analysis |
| GET | /ab_tests/{id}/metrics_over_time |
Metrics bucketed by day/hour |
| GET | /ab_tests/{id}/statistical_analysis |
Full analysis with conclusions |
| POST | /seed |
Reset DB and load all sample data |
| POST | /seed/clear |
Remove all data from the database |
A React + Vite dashboard that shows the system live — built for recruiters who want to see it working without reading API docs.
| Tab | What it shows |
|---|---|
| Overview | Architecture diagram, tech stack, A/B workflow steps. Load sample data / Clear data buttons reset or wipe the database instantly — all other pages refresh automatically. |
| Recommendations | Pick any user → see CF (v1) and Content-Based (v2) results side by side, log interactions, add items |
| A/B Tests | Live CTR/conversion metrics, bar chart comparison, 7-day CTR trend, statistical significance verdict |
The app starts with an empty database. Use Load sample data in the Overview tab to populate 50 users, 20 items, and a 7-day A/B test with realistic event history for testing.
docker compose up --buildStarts API + Redis + frontend. The database starts empty — use the Load sample data button in the Overview tab to populate demo data.
| Service | URL |
|---|---|
| Frontend | http://localhost:5173 |
| API | http://localhost:8000 |
| API docs | http://localhost:8000/docs |
docker compose down # stop (data persists in ./smart_suggest.db)
docker compose down -v # stop + wipe volumes# 1. Backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn app:app --reload
# → http://localhost:8000
# 2. Frontend (separate terminal)
cd frontend
npm install
npm run dev
# → http://localhost:5173Then open the app and click Load sample data in the Overview tab.
pytest # all tests
pytest tests/test_integration.py # integration only
pytest -v --tb=short # verbose output
pytest --cov=. --cov-report=term-missing # with coveragesmart-suggest/
├── app.py ← FastAPI entry point + REST routes
├── config.py ← categories, strategy configs, DB URL (reads DATABASE_URL env)
├── models.py ← SQLAlchemy ORM models
├── seed.py ← sample data loader (called by POST /seed)
├── entrypoint.sh ← Docker startup script
├── cache.py ← in-memory TTL cache (Redis-ready interface)
├── metrics.py ← pure KPI functions (CTR, conversion, diversity)
├── analysis.py ← DB-backed metric aggregation
├── significance.py ← chi-square, Welch's t-test, Wilson CI, power analysis
├── interpretation.py ← Cohen's h, practical significance, conclusions
│
├── recommenders/
│ ├── cf.py ← collaborative filtering + cold-item fallback
│ ├── content.py ← content-based filtering + cold-user fallback
│ └── similarity.py ← cosine similarity, Pearson, top-K neighbours
│
├── features/
│ └── item_features.py ← TF-IDF, item-item similarity matrix, TTL cache
│
├── tracking/
│ └── interaction_tracker.py ← log/query user interactions
│
├── ab_testing/
│ ├── manager.py ← create tests, deterministic variant assignment
│ └── logger.py ← impression / click / purchase / engagement events
│
├── frontend/ ← React + Vite demo dashboard
│ ├── src/
│ │ ├── pages/
│ │ │ ├── OverviewPage.jsx ← architecture diagram, tech stack, seed controls
│ │ │ ├── RecommendationsPage.jsx ← CF vs content-based side by side
│ │ │ └── ABTestsPage.jsx ← live metrics, charts, significance
│ │ └── api.js ← typed API client (proxies to FastAPI)
│ └── vite.config.js ← dev proxy → API_URL env var
│
└── tests/
├── test_tracking.py
├── test_cf.py
├── test_content.py
├── test_ab_assignment.py
├── test_metrics.py
├── test_significance.py
└── test_integration.py
| Layer | Technology |
|---|---|
| API | FastAPI + Uvicorn |
| ORM | SQLAlchemy 2.0 |
| Database | SQLite (dev) / PostgreSQL (prod) |
| Cache | In-memory TTL cache (Redis-ready) |
| Stats | Pure Python (no scipy/numpy required) |
| Deploy | Docker + AWS ECS Fargate |