I don't just deploy ML models. I diagnose why they break at 2am.
MLOps engineer with a production multi-cloud platform deployed from scratch (GKE + EKS, 6 K8s services, 395+ tests). 14 years running 5 businesses — managing teams, P&L, and vendor operations — now applied to building reliable ML systems.
Three production incidents diagnosed from first principles:
| Incident | Root cause | Fix & result |
|---|---|---|
| 81% error rate under load | `uvicorn --workers` is an anti-pattern under K8s: a shared CPU budget means thrashing, not parallelism | asyncio + ThreadPoolExecutor after GIL analysis · 81% → 0% errors · 2000m → 1000m CPU |
| SHAP returning all zeros | TreeExplainer is incompatible with StackingClassifier | KernelExplainer in the original feature space · 4 alternatives evaluated first |
| HPA never scales down | Memory-based HPA plus a fixed ML memory footprint makes scale-down mathematically impossible | CPU-only HPA · 3 → 1 pods in 8 minutes |
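The first fix reduces to a well-known pattern. A minimal sketch, assuming a FastAPI service wrapping a blocking scikit-learn/LightGBM-style model; the handler and names are illustrative, not the portfolio's actual code:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

from fastapi import FastAPI

app = FastAPI()
# One uvicorn process per pod: parallelism comes from the HPA adding pods,
# not from --workers processes thrashing inside a single shared CPU limit.
executor = ThreadPoolExecutor(max_workers=4)

def predict_blocking(payload: dict) -> dict:
    # Stand-in for the real model.predict(); numpy/LightGBM release the GIL
    # during the heavy math, so worker threads overlap usefully.
    return {"score": 0.5}

@app.post("/predict")
async def predict(payload: dict) -> dict:
    # Off-load blocking inference so the event loop keeps accepting requests.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, predict_blocking, payload)
```

The third incident's "mathematically impossible" is one line of arithmetic: the HPA formula is desired = ceil(current × usage / target). The GiB and millicore figures below are illustrative, not measured values from the portfolio:

```python
import math

def desired_replicas(current: int, usage: float, target: float) -> int:
    # Kubernetes HPA core formula: ceil(current * currentMetric / targetMetric)
    return math.ceil(current * usage / target)

# Memory-based HPA with a model pinned in RAM: per-pod usage stays ~1.4 GiB
# regardless of traffic, so against a 1.5 GiB target the ratio (~0.93) sits
# inside the HPA's default 10% tolerance and scale-down never triggers.
print(desired_replicas(3, 1.4, 1.5))   # 3, forever
# CPU-based HPA: idle pods drop to ~50m against a 500m target and do scale down.
print(desired_replicas(3, 0.05, 0.5))  # 1
```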
Flagship Open-Source — ML-MLOps-Production-Template
The patterns my portfolio cost $200/mo and 22 ADRs to learn — packaged so other teams don't have to.
Most templates give you files.
This one gives you a behavioral protocol.
AUTO / CONSULT / STOP — 20 operations mapped to agent modes.
STOP on production deploys cannot be bypassed by human insistence.
If env=production and audit.passed=False → DeploymentRequest refuses to construct.
The invariants aren't in the README. They're in the code.
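What "refuses to construct" can look like in practice, as a Pydantic v2 sketch; the field names and shapes are my assumption, not necessarily the template's actual model:

```python
from pydantic import BaseModel, model_validator

class AuditResult(BaseModel):
    passed: bool

class DeploymentRequest(BaseModel):
    env: str
    audit: AuditResult

    @model_validator(mode="after")
    def _forbid_unaudited_production(self) -> "DeploymentRequest":
        # The invariant lives in the type: an unaudited production deploy
        # cannot exist as an object, no matter who insists.
        if self.env == "production" and not self.audit.passed:
            raise ValueError("production deploy requires audit.passed=True")
        return self

# DeploymentRequest(env="production", audit=AuditResult(passed=False))
# raises pydantic.ValidationError at construction time.
```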
| Layer | What's encoded |
|---|---|
| 32 anti-patterns (D-01→D-32) | Runtime · Training · Infrastructure · EDA · Security · Closed-loop monitoring |
| SLSA L2 supply chain | Gitleaks → Trivy → Syft SBOM → Cosign keyless (OIDC) → Kyverno admission |
| Closed-loop monitoring | Ground truth ingestion · Sliced performance · Champion/Challenger (McNemar + bootstrap ΔAUC; sketch below) |
| Quad-IDE native | Windsurf · Claude Code · Cursor · Codex — same invariants, native config for each |
| 24 ADRs | Each decision documented with alternatives rejected and revisit triggers |
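The Champion/Challenger sketch promised above, self-contained under the assumption of binary labels, hard predictions for McNemar, and probability scores for ΔAUC; function names are mine, not the template's API:

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.metrics import roc_auc_score

def mcnemar_pvalue(y_true, champ_pred, chall_pred) -> float:
    # Count discordant pairs: exactly one model classifies the example correctly.
    champ_ok = champ_pred == y_true
    chall_ok = chall_pred == y_true
    b = int(np.sum(champ_ok & ~chall_ok))  # champion right, challenger wrong
    c = int(np.sum(~champ_ok & chall_ok))  # challenger right, champion wrong
    if b + c == 0:
        return 1.0  # models agree everywhere: no evidence of a difference
    # Exact McNemar: under H0 the discordant wins follow Binomial(b + c, 0.5).
    return binomtest(b, b + c, 0.5).pvalue

def bootstrap_delta_auc(y_true, champ_score, chall_score, n_boot=1000, seed=0):
    # 95% bootstrap CI for AUC(challenger) - AUC(champion).
    rng = np.random.default_rng(seed)
    n, deltas = len(y_true), []
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:
            continue  # a resample needs both classes for AUC to be defined
        deltas.append(roc_auc_score(y_true[idx], chall_score[idx])
                      - roc_auc_score(y_true[idx], champ_score[idx]))
    return np.percentile(deltas, [2.5, 97.5])
```

Promote only when both gates pass: McNemar rejects equal error behavior and the ΔAUC interval lies above zero.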
```bash
# Zero to working fraud detection service in one command
git clone https://github.com/DuqueOM/ML-MLOps-Production-Template.git
cd ML-MLOps-Production-Template && make bootstrap
```
→ Template repo | QUICK_START.md | 24 ADRs
Production Portfolio — ML-MLOps-Portfolio
3 ML services on GKE + EKS · 18 ADRs · 395+ tests · Multi-cloud Terraform
Most ML portfolios show models that score well. This one shows what happens after you deploy — the incidents, the wrong decisions corrected, and 18 Architectural Decision Records documenting every trade-off with measured data.
| Project | Metrics | Key Engineering Decision |
|---|---|---|
| BankChurn Predictor | AUC 0.87 · 90% cov | Async inference via ThreadPoolExecutor · threshold 0.35 (30:1 cost ratio, quantified; sketch below) |
| NLPInsight Analyzer | Acc 80.6% · 98% cov | Upgraded from curated dataset to 11.9K real noisy tweets — honest over impressive |
| ChicagoTaxi Pipeline | R² 0.96 · 6.3M rows | Found & fixed data leakage · temporal split · R² 0.905→0.965 |
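The threshold sketch referenced in the BankChurn row: with a 30:1 FN:FP cost ratio, the generic move is a cost-weighted sweep over candidate thresholds on validation data. The ratio comes from the table; everything else here is illustrative, not the project's code:

```python
import numpy as np

def best_threshold(y_true: np.ndarray, scores: np.ndarray,
                   fn_cost: float = 30.0, fp_cost: float = 1.0) -> float:
    # A missed churner (FN) costs 30x a wasted retention offer (FP),
    # which pushes the optimal cutoff well below the default 0.5.
    thresholds = np.linspace(0.01, 0.99, 99)
    def cost(t: float) -> float:
        pred = scores >= t
        fn = np.sum((y_true == 1) & ~pred)
        fp = np.sum((y_true == 0) & pred)
        return float(fn) * fn_cost + float(fp) * fp_cost
    return float(thresholds[np.argmin([cost(t) for t in thresholds])])
```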
Selected "Don't Build" decisions (often harder than building):
- Removed CarVision: MAPE 32.9% is not defensible — ADR-009
- Deferred Feature Store: full Feast architecture designed for when it's needed — ADR-007
- Rejected Airflow: CronJob + GitHub Actions is sufficient for 3 models — ADR-006
- Documented the $24/mo GCP vs $145/mo AWS gap: both meet SLA, so cost decided — ADR-016
📐 18 ADRs → | 📋 Engineering Highlights → | 📺 3min Demo →
The portfolio includes a production-grade agentic setup (AGENTS.md + .windsurf/) that encodes 18 ADRs and 3 production incidents into the development environment itself.
```text
.windsurf/
├── rules/       7 context-aware rules (glob-triggered per file type)
├── skills/      6 operational procedures (debug, deploy-gke, deploy-aws,
│                  drift-detection, model-retrain, release-checklist)
└── workflows/   6 structured prompt workflows (/incident, /retrain,
                   /release, /load-test, /new-adr, /drift-check)
```
The agent knows: never use uvicorn --workers N under K8s (ADR-014), always use KernelExplainer for SHAP with StackingClassifier (ADR-010), CPU targets are 50%/60%/60% — not 70% (ADR-001). Operational knowledge encoded, not just referenced.
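The SHAP rule in concrete form, as a runnable sketch on synthetic data; this assumes scikit-learn and shap, and is not the portfolio's actual pipeline:

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0))],
    final_estimator=LogisticRegression(),
).fit(X, y)

# TreeExplainer cannot see through the stacked meta-model, so explain the
# full predict_proba in the ORIGINAL feature space with KernelExplainer.
background = shap.sample(X, 50)  # small background set keeps runtime tractable
explainer = shap.KernelExplainer(lambda d: stack.predict_proba(d)[:, 1], background)
shap_values = explainer.shap_values(X[:5], nsamples=200)
print(np.round(shap_values, 3))  # non-zero attributions per feature
```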
→ AGENTS.md | .windsurf/
Kubernetes (GKE · EKS) Terraform GitHub Actions FastAPI MLflow Prometheus Grafana
Argo Rollouts Docker PySpark LightGBM XGBoost SHAP Evidently DVC Pandera
GCP AWS SageMaker Vertex AI Cosign Kyverno OpenTelemetry Python 3.11+
AWS Certified Machine Learning Engineer – Associate (MLA-C01) · TripleTen Data Science · 14 years ops → MLOps
These projects use AI-assisted tools (Windsurf Cascade, Claude Code) for code generation and boilerplate. All architectural decisions, system design, trade-off analysis, and incident resolution are the author's. AI tools accelerate throughput — they don't replace engineering judgment.
The .windsurf/ and .claude/ configurations are themselves a demonstration of this philosophy:
the agent is governed by documented decisions, not given free rein.
That governance design is original and independently authored.
Open to MLOps · ML Platform · ML Infrastructure roles — Remote preferred — Mexico City (CST)

