A reusable, configuration-driven framework for building, deploying, monitoring, and maintaining machine learning projects in production.
This repository is an internal framework template for ML engineering teams. It encodes the hard-won decisions that every production ML project faces — how to structure code, how to version models, how to detect drift, how to serve predictions — so teams can focus on the actual machine learning problem rather than re-building the same scaffolding each time.
It is deliberately minimal and readable. Every file has a clear job. No magic. No unnecessary abstractions.
```
Request / Config
       │
       ▼
┌─────────────────────────────────────────────────────┐
│ interfaces/      CLI · FastAPI                      │  Entry points
├─────────────────────────────────────────────────────┤
│ pipelines/       TrainingPipeline                   │  Orchestration
├─────────────────────────────────────────────────────┤
│ application/     FeatureEngineer · InferenceService │  Use cases
├─────────────────────────────────────────────────────┤
│ infrastructure/  Models · Registry · Tracking       │  Concrete implementations
├─────────────────────────────────────────────────────┤
│ domain/          BaseModel · DataValidator · Drift  │  Core abstractions
├─────────────────────────────────────────────────────┤
│ shared/          Config · Logging                   │  Cross-cutting concerns
└─────────────────────────────────────────────────────┘
```
Layer responsibilities:

| Layer | Responsibility |
|---|---|
| `domain/` | Pure abstractions with no framework dependencies: `BaseModel`, validation rules, drift metrics. |
| `application/` | Use cases: `FeatureEngineer`, `InferenceService`. Orchestrates domain objects. |
| `infrastructure/` | Concrete implementations: XGBoost, LocalRegistry, MLflow tracker. |
| `interfaces/` | Entry points: FastAPI routes, Click CLI. Converts external input to domain calls. |
| `pipelines/` | End-to-end workflow coordination. Calls the right components in the right order. |
| `shared/` | Config loader and structured logger. Used by every other layer. |
- Configuration over code. Switching models, data sources, or evaluation metrics should require a config change, not a code change.
- Documentation first. Every module starts with a docstring explaining why it exists, not just what it does.
- Small, complete files. Each file does one thing and does it fully. No stubs, no TODOs, no "extend as needed".
- Testable by default. Every component accepts its dependencies as constructor arguments, making unit tests trivial.
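
To make the "testable by default" principle concrete, here is a minimal sketch of constructor injection. The `Registry` protocol, its `load()` method, and the `InferenceService` signature are assumptions for illustration; the framework's real classes may differ.

```python
from __future__ import annotations

from typing import Protocol, Sequence


class Model(Protocol):
    """Anything with predict(); stands in for the framework's BaseModel."""

    def predict(self, rows: Sequence[dict]) -> list[float]: ...


class Registry(Protocol):
    """Hypothetical registry interface: returns a model by name."""

    def load(self, name: str) -> Model: ...


class InferenceService:
    """Use case that receives its registry instead of constructing one."""

    def __init__(self, registry: Registry) -> None:
        self._registry = registry

    def predict(self, model_name: str, rows: Sequence[dict]) -> list[float]:
        model = self._registry.load(model_name)
        return model.predict(rows)


# Test doubles: no disk access, no tracking server, no trained model.
class ConstantModel:
    def predict(self, rows: Sequence[dict]) -> list[float]:
        return [0.5 for _ in rows]


class FakeRegistry:
    def __init__(self) -> None:
        self.requested: list[str] = []

    def load(self, name: str) -> ConstantModel:
        self.requested.append(name)
        return ConstantModel()


def test_inference_service_uses_injected_registry() -> None:
    registry = FakeRegistry()
    service = InferenceService(registry)
    rows = [{"tenure": 5}, {"tenure": 9}]
    assert service.predict("customer_churn", rows) == [0.5, 0.5]
    assert registry.requested == ["customer_churn"]
```

Because the service never builds its own registry, the test swaps in a fake through the constructor and needs no patching.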
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Generate example data
python examples/classification/generate_data.py

# 3. Train
python main.py train --config config/project.yaml

# 4. Inspect registered versions
python main.py report --config config/project.yaml

# 5. Batch inference
python main.py predict --config config/project.yaml \
    --input examples/classification/data/churn.csv

# 6. Start the API
uvicorn src.interfaces.api.app:app --reload
```

```
mlops-framework/
├── main.py                       # CLI entry point
├── requirements.txt
├── Makefile                      # make train / predict / api / test
├── Dockerfile
├── docker-compose.yml
├── pyproject.toml
├── .env.example
│
├── config/
│   ├── project.yaml              # Classification / regression config
│   └── forecasting.yaml          # Time-series config
│
├── src/
│   ├── domain/                   # Core abstractions (no external deps)
│   │   ├── data/                 # DataLoader, DataValidator
│   │   ├── model/                # BaseModel ABC
│   │   └── monitoring/           # DriftDetector, DriftReport
│   │
│   ├── application/              # Use cases
│   │   ├── training/             # FeatureEngineer
│   │   └── inference/            # InferenceService, PredictionResult
│   │
│   ├── infrastructure/           # Concrete implementations
│   │   ├── models/               # XGBoostModel, LightGBMModel, RandomForestModel
│   │   ├── registry/             # LocalModelRegistry
│   │   └── tracking/             # ExperimentTracker (MLflow / local)
│   │
│   ├── interfaces/
│   │   ├── api/                  # FastAPI app
│   │   └── cli/                  # Click CLI
│   │
│   ├── pipelines/
│   │   └── training_pipeline.py  # End-to-end training orchestrator
│   │
│   └── shared/
│       ├── config/               # YAML loader → ProjectConfig
│       └── logging/              # structlog setup
│
├── examples/
│   ├── classification/           # Customer churn walkthrough
│   ├── forecasting/              # Sales forecast with Prophet
│   └── anomaly_detection/        # Isolation Forest example
│
├── tests/
│   ├── test_config.py
│   ├── test_data.py
│   ├── test_models.py
│   ├── test_drift.py
│   └── test_inference.py
│
└── docs/
    ├── architecture.md
    ├── configuration.md
    ├── training_pipeline.md
    ├── inference_pipeline.md
    ├── monitoring.md
    └── deployment.md
```
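
The `shared/config/` entry in the tree is described as a YAML loader that produces a `ProjectConfig`. The sketch below shows one way such an object could be built from an already-parsed mapping. Apart from `type`, which the config uses to select a model, the field names (`name`, `model`, `params`) and the `from_mapping` helper are assumptions, not the framework's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Mapping


@dataclass(frozen=True)
class ModelConfig:
    type: str  # e.g. "xgboost", "lightgbm", "random_forest"
    params: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ProjectConfig:
    name: str
    model: ModelConfig

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> ProjectConfig:
        """Validate required keys early so errors surface at load time."""
        try:
            model_raw = raw["model"]
            return cls(
                name=raw["name"],
                model=ModelConfig(
                    type=model_raw["type"],
                    params=dict(model_raw.get("params", {})),
                ),
            )
        except KeyError as exc:
            raise ValueError(f"missing required config key: {exc.args[0]}") from exc


# In a real loader this mapping would come from parsing config/project.yaml.
raw = {
    "name": "customer_churn",
    "model": {"type": "xgboost", "params": {"max_depth": 4}},
}
config = ProjectConfig.from_mapping(raw)
print(config.model.type)  # xgboost
```

Freezing the dataclasses keeps the configuration read-only once loaded, which suits a setup where behaviour is meant to change through the YAML file rather than at runtime.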
```bash
make install   # pip install -r requirements.txt
make train     # Run training pipeline
make api       # Start FastAPI server
make test      # Run pytest suite
```

```bash
python main.py train   --config config/project.yaml
python main.py predict --config config/project.yaml --input data.csv
python main.py monitor --config config/project.yaml --reference ref.csv --current cur.csv
python main.py report  --config config/project.yaml
```

Start the server with `make api`, then:

| Endpoint | Method | Description |
|---|---|---|
| `/health` | GET | Liveness probe |
| `/metrics` | GET | Runtime metrics |
| `/predict` | POST | Real-time inference |
| `/batch_predict` | POST | Batch inference |
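
For callers written in Python, the `/predict` call can be sketched with the standard library alone. The body shape (`model_name` plus a `data` list of feature rows) mirrors the curl example in this section. The helper name `build_predict_request` is hypothetical, and the response schema is not documented here, so the commented-out send step simply prints whatever JSON comes back.

```python
from __future__ import annotations

import json
import urllib.request


def build_predict_request(
    base_url: str, model_name: str, rows: list[dict]
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /predict endpoint."""
    payload = json.dumps({"model_name": model_name, "data": rows}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(
    "http://localhost:8000",
    "customer_churn",
    [{"tenure": 5, "monthly_charges": 80, "total_charges": 400,
      "num_products": 2, "support_calls": 3}],
)

# Sending requires the API to be running (make api):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```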
Example request:

```bash
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"model_name": "customer_churn", "data": [{"tenure": 5, "monthly_charges": 80, "total_charges": 400, "num_products": 2, "support_calls": 3}]}'
```

| Config value | Class | Use case |
|---|---|---|
| `xgboost` | `XGBoostModel` | Classification, Regression |
| `lightgbm` | `LightGBMModel` | Classification, Regression |
| `random_forest` | `RandomForestModel` | Classification, Regression |

| Model | Example script | Use case |
|---|---|---|
| Prophet | `run_forecast.py` | Time-series forecasting |
| IsolationForest | `run_anomaly_detection.py` | Anomaly detection |
```bash
pip install pytest
pytest tests/ -v
```

The five test modules cover config loading, data validation, model training, drift detection, and inference.
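
Drift detection is the least self-explanatory of these areas, so here is a small illustration of what comparing a reference sample with a current sample can look like. This README does not say which metric `DriftDetector` uses. The Population Stability Index (PSI) below is a stand-in chosen for illustration, and the function name and binning scheme are assumptions.

```python
from __future__ import annotations

import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the reference range."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant column

    def proportions(values: Sequence[float]) -> list[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            # Values outside the reference range are clamped into the edge bins.
            counts[min(max(index, 0), bins - 1)] += 1
        eps = 1e-6  # avoids log(0) for empty bins
        return [max(count / len(values), eps) for count in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [i / 100 for i in range(100)]
shifted = [x + 0.3 for x in reference]

print(population_stability_index(reference, reference))  # identical samples give 0.0
print(population_stability_index(reference, shifted))    # a shifted sample scores higher
```

A common rule of thumb treats PSI values above roughly 0.1 to 0.25 as worth investigating, but the right threshold depends on the feature and should be set per project.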
- Create `src/infrastructure/models/my_model.py` inheriting from `BaseModel`.
- Implement `train()`, `predict()`, `get_params()`.
- Register it in `src/pipelines/training_pipeline.py` `MODEL_MAP`.
- Add `type: my_model` to your config YAML.

That's it. The pipeline, registry, and API all work automatically.
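
The steps above can be sketched as follows. The `BaseModel` shown is a stand-in so the example runs on its own: the real abstract class lives in `src/domain/model/` and its method signatures may differ. `MODEL_MAP` is assumed to be a plain dictionary from the config `type` value to a model class, and `MyModel` is a toy baseline.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any, Sequence


class BaseModel(ABC):
    """Stand-in for the framework's BaseModel; real signatures may differ."""

    @abstractmethod
    def train(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> None: ...

    @abstractmethod
    def predict(self, X: Sequence[Sequence[float]]) -> list[float]: ...

    @abstractmethod
    def get_params(self) -> dict[str, Any]: ...


class MyModel(BaseModel):
    """Toy baseline: always predicts the mean of the training target."""

    def __init__(self, **params: Any) -> None:
        self._params = params
        self._mean: float | None = None

    def train(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> None:
        if not y:
            raise ValueError("cannot train on an empty target")
        self._mean = sum(y) / len(y)

    def predict(self, X: Sequence[Sequence[float]]) -> list[float]:
        if self._mean is None:
            raise RuntimeError("call train() before predict()")
        return [self._mean for _ in X]

    def get_params(self) -> dict[str, Any]:
        return dict(self._params)


# Assumed shape of the registration step: config `type` value -> model class.
MODEL_MAP: dict[str, type[BaseModel]] = {"my_model": MyModel}

model = MODEL_MAP["my_model"](alpha=1)
model.train([[1.0], [2.0], [3.0]], [1.0, 2.0, 3.0])
print(model.predict([[9.0], [9.0]]))  # [2.0, 2.0]
```

Looking the class up by its config key is what allows a model swap to stay a YAML change: the pipeline only needs the `type` string and the shared `BaseModel` interface.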