I don't just deploy ML models. I diagnose why they break at 2am.
MLOps engineer with a production multi-cloud platform deployed from scratch (GKE + EKS, 6 K8s services, 395+ tests). 14 years running 5 businesses — managing teams, P&L, and vendor operations — now applied to building reliable ML systems.
Three production incidents diagnosed from first principles:
| Incident | Root cause | Fix & result |
|---|---|---|
| 81% error rate under load | `uvicorn --workers` is an anti-pattern under K8s: a shared CPU budget means thrashing, not parallelism | asyncio + ThreadPoolExecutor after GIL analysis · 81% → 0% errors · 2000m → 1000m CPU |
| SHAP returning all zeros | TreeExplainer is incompatible with StackingClassifier | KernelExplainer in the original feature space · 4 alternatives evaluated first |
| HPA never scales down | Memory-based HPA plus a fixed ML memory footprint makes scale-down mathematically impossible | CPU-only HPA · 3 → 1 pods in 8 minutes |
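The first fix reduces to a well-known pattern. A minimal sketch, assuming a FastAPI service wrapping a blocking scikit-learn/LightGBM-style model; the handler and names are illustrative, not the portfolio's actual code:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

from fastapi import FastAPI

app = FastAPI()
# One uvicorn process per pod: parallelism comes from the HPA adding pods,
# not from --workers processes thrashing inside a single shared CPU limit.
executor = ThreadPoolExecutor(max_workers=4)

def predict_blocking(payload: dict) -> dict:
    # Stand-in for the real model.predict(); numpy/LightGBM release the GIL
    # during the heavy math, so worker threads overlap usefully.
    return {"score": 0.5}

@app.post("/predict")
async def predict(payload: dict) -> dict:
    # Off-load blocking inference so the event loop keeps accepting requests.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, predict_blocking, payload)
```

The third incident's "mathematically impossible" is one line of arithmetic: the HPA formula is desired = ceil(current × usage / target). The GiB and millicore figures below are illustrative, not measured values from the portfolio:

```python
import math

def desired_replicas(current: int, usage: float, target: float) -> int:
    # Kubernetes HPA core formula: ceil(current * currentMetric / targetMetric)
    return math.ceil(current * usage / target)

# Memory-based HPA with a model pinned in RAM: per-pod usage stays ~1.4 GiB
# regardless of traffic, so against a 1.5 GiB target the ratio (~0.93) sits
# inside the HPA's default 10% tolerance and scale-down never triggers.
print(desired_replicas(3, 1.4, 1.5))   # 3, forever
# CPU-based HPA: idle pods drop to ~50m against a 500m target and do scale down.
print(desired_replicas(3, 0.05, 0.5))  # 1
```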
Flagship Open-Source — ML-MLOps-Production-Template
The patterns my portfolio cost $200/mo and 22 ADRs to learn — packaged so other teams don't have to.
Most templates give you files.
This one gives you a behavioral protocol.
AUTO / CONSULT / STOP — 20 operations mapped to agent modes.
STOP on production deploys cannot be bypassed by human insistence.
If env=production and audit.passed=False → DeploymentRequest refuses to construct.
The invariants aren't in the README. They're in the code.
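What "refuses to construct" can look like in practice, as a Pydantic v2 sketch; the field names and shapes are my assumption, not necessarily the template's actual model:

```python
from pydantic import BaseModel, model_validator

class AuditResult(BaseModel):
    passed: bool

class DeploymentRequest(BaseModel):
    env: str
    audit: AuditResult

    @model_validator(mode="after")
    def _forbid_unaudited_production(self) -> "DeploymentRequest":
        # The invariant lives in the type: an unaudited production deploy
        # cannot exist as an object, no matter who insists.
        if self.env == "production" and not self.audit.passed:
            raise ValueError("production deploy requires audit.passed=True")
        return self

# DeploymentRequest(env="production", audit=AuditResult(passed=False))
# raises pydantic.ValidationError at construction time.
```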
| Layer | What's encoded |
|---|---|
| 32 anti-patterns (D-01→D-32) | Runtime · Training · Infrastructure · EDA · Security · Closed-loop monitoring |
| SLSA L2 supply chain | Gitleaks → Trivy → Syft SBOM → Cosign keyless (OIDC) → Kyverno admission |
| Closed-loop monitoring | Ground truth ingestion · Sliced performance · Champion/Challenger (McNemar + bootstrap ΔAUC; sketch below) |
| Quad-IDE native | Windsurf · Claude Code · Cursor · Codex — same invariants, native config for each |
| 24 ADRs | Each decision documented with alternatives rejected and revisit triggers |
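The Champion/Challenger sketch promised above, self-contained under the assumption of binary labels, hard predictions for McNemar, and probability scores for ΔAUC; function names are mine, not the template's API:

```python
import numpy as np
from scipy.stats import binomtest
from sklearn.metrics import roc_auc_score

def mcnemar_pvalue(y_true, champ_pred, chall_pred) -> float:
    # Count discordant pairs: exactly one model classifies the example correctly.
    champ_ok = champ_pred == y_true
    chall_ok = chall_pred == y_true
    b = int(np.sum(champ_ok & ~chall_ok))  # champion right, challenger wrong
    c = int(np.sum(~champ_ok & chall_ok))  # challenger right, champion wrong
    if b + c == 0:
        return 1.0  # models agree everywhere: no evidence of a difference
    # Exact McNemar: under H0 the discordant wins follow Binomial(b + c, 0.5).
    return binomtest(b, b + c, 0.5).pvalue

def bootstrap_delta_auc(y_true, champ_score, chall_score, n_boot=1000, seed=0):
    # 95% bootstrap CI for AUC(challenger) - AUC(champion).
    rng = np.random.default_rng(seed)
    n, deltas = len(y_true), []
    while len(deltas) < n_boot:
        idx = rng.integers(0, n, n)
        if len(np.unique(y_true[idx])) < 2:
            continue  # a resample needs both classes for AUC to be defined
        deltas.append(roc_auc_score(y_true[idx], chall_score[idx])
                      - roc_auc_score(y_true[idx], champ_score[idx]))
    return np.percentile(deltas, [2.5, 97.5])
```

Promote only when both gates pass: McNemar rejects equal error behavior and the ΔAUC interval lies above zero.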
```bash
# Zero to working fraud detection service in one command
git clone https://github.com/DuqueOM/ML-MLOps-Production-Template.git
cd ML-MLOps-Production-Template && make bootstrap
```
→ Template repo | QUICK_START.md | 24 ADRs
Production Portfolio — ML-MLOps-Portfolio
3 ML services on GKE + EKS · 18 ADRs · 395+ tests · Multi-cloud Terraform
Most ML portfolios show models that score well. This one shows what happens after you deploy — the incidents, the wrong decisions corrected, and 18 Architectural Decision Records documenting every trade-off with measured data.
| Project | Metrics | Key Engineering Decision |
|---|---|---|
| BankChurn Predictor | AUC 0.87 · 90% cov | Async inference via ThreadPoolExecutor · threshold 0.35 (30:1 cost ratio, quantified; sketch below) |
| NLPInsight Analyzer | Acc 80.6% · 98% cov | Upgraded from curated dataset to 11.9K real noisy tweets — honest over impressive |
| ChicagoTaxi Pipeline | R² 0.96 · 6.3M rows | Found & fixed data leakage · temporal split · R² 0.905→0.965 |
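The threshold sketch referenced in the BankChurn row: with a 30:1 FN:FP cost ratio, the generic move is a cost-weighted sweep over candidate thresholds on validation data. The ratio comes from the table; everything else here is illustrative, not the project's code:

```python
import numpy as np

def best_threshold(y_true: np.ndarray, scores: np.ndarray,
                   fn_cost: float = 30.0, fp_cost: float = 1.0) -> float:
    # A missed churner (FN) costs 30x a wasted retention offer (FP),
    # which pushes the optimal cutoff well below the default 0.5.
    thresholds = np.linspace(0.01, 0.99, 99)
    def cost(t: float) -> float:
        pred = scores >= t
        fn = np.sum((y_true == 1) & ~pred)
        fp = np.sum((y_true == 0) & pred)
        return float(fn) * fn_cost + float(fp) * fp_cost
    return float(thresholds[np.argmin([cost(t) for t in thresholds])])
```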
Selected "Don't Build" decisions (often harder than building):
- Removed CarVision: MAPE 32.9% is not defensible — ADR-009
- Deferred Feature Store: full Feast architecture designed for when it's needed — ADR-007
- Rejected Airflow: CronJob + GitHub Actions is sufficient for 3 models — ADR-006
- Documented the $24/mo GCP vs $145/mo AWS gap: both meet SLA, so cost decided — ADR-016
📐 18 ADRs → | 📋 Engineering Highlights → | 📺 3min Demo →
The portfolio includes a production-grade agentic setup (AGENTS.md + .windsurf/) that encodes 18 ADRs and 3 production incidents into the development environment itself.
```text
.windsurf/
├── rules/       7 context-aware rules (glob-triggered per file type)
├── skills/      6 operational procedures (debug, deploy-gke, deploy-aws,
│                  drift-detection, model-retrain, release-checklist)
└── workflows/   6 structured prompt workflows (/incident, /retrain,
                   /release, /load-test, /new-adr, /drift-check)
```
The agent knows: never use uvicorn --workers N under K8s (ADR-014), always use KernelExplainer for SHAP with StackingClassifier (ADR-010), CPU targets are 50%/60%/60% — not 70% (ADR-001). Operational knowledge encoded, not just referenced.
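The SHAP rule in concrete form, as a runnable sketch on synthetic data; this assumes scikit-learn and shap, and is not the portfolio's actual pipeline:

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=50, random_state=0))],
    final_estimator=LogisticRegression(),
).fit(X, y)

# TreeExplainer cannot see through the stacked meta-model, so explain the
# full predict_proba in the ORIGINAL feature space with KernelExplainer.
background = shap.sample(X, 50)  # small background set keeps runtime tractable
explainer = shap.KernelExplainer(lambda d: stack.predict_proba(d)[:, 1], background)
shap_values = explainer.shap_values(X[:5], nsamples=200)
print(np.round(shap_values, 3))  # non-zero attributions per feature
```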
→ AGENTS.md | .windsurf/
Kubernetes (GKE · EKS) Terraform GitHub Actions FastAPI MLflow Prometheus Grafana
Argo Rollouts Docker PySpark LightGBM XGBoost SHAP Evidently DVC Pandera
GCP AWS SageMaker Vertex AI Cosign Kyverno OpenTelemetry Python 3.11+
AWS Certified Machine Learning Engineer – Associate (MLA-C01) · TripleTen Data Science · 14 years ops → MLOps
These projects use AI-assisted tools (Windsurf Cascade, Claude Code) for code generation and boilerplate. All architectural decisions, system design, trade-off analysis, and incident resolution are the author's. AI tools accelerate throughput — they don't replace engineering judgment.
The .windsurf/ and .claude/ configurations are themselves a demonstration of this philosophy:
the agent is governed by documented decisions, not given free rein.
That governance design is original and independently authored.
Open to MLOps · ML Platform · ML Infrastructure roles — Remote preferred — Mexico City (CST)

