-[Infrastructure as Code](#infrastructure-as-code)
@@ -77,7 +77,7 @@ The core pillars of MLOps on Azure are:
> [!TIP]
> Start by assessing your current maturity level honestly. Most organizations land at Level 0 or 1. Focus on progressing one level at a time rather than trying to implement everything at once.
-## Phase 1 — Problem Definition & Business Understanding
+## Phase 1: Problem Definition & Business Understanding
> Before writing a single line of code, align on what success looks like. This phase is often underestimated but is the single biggest determinant of whether an ML project delivers value.
@@ -98,7 +98,7 @@ The core pillars of MLOps on Azure are:
|**Responsible AI documentation**| Establish a **Model Card** or **AI Use Case Description** early — this feeds directly into Responsible AI documentation later. |
|**Service Level Agreements**| Define **SLAs** for model latency, availability, and retraining frequency before any architecture decisions are made. |
-## Phase 2 — Data Management & Preparation
+## Phase 2: Data Management & Preparation
> Data is the foundation of every ML system. Azure provides a rich ecosystem for storing, versioning, and transforming data at scale.
@@ -128,7 +128,7 @@ The core pillars of MLOps on Azure are:
|**Secure storage**| Store sensitive data in **Azure Data Lake Storage Gen2** with hierarchical namespace enabled, access controlled via Azure RBAC and ACLs — never embed credentials in code. |
|**Data protection**| Enable **soft delete** and **versioning** on Azure Blob/ADLS to protect against accidental deletion. |
-## Phase 3 — Model Development & Experimentation
+## Phase 3: Model Development & Experimentation
> This phase is the most iterative. The goal is to rapidly explore hypotheses, track experiments, and identify the best modeling approach.
@@ -158,7 +158,7 @@ The core pillars of MLOps on Azure are:
|**Code separation**| Separate **research code** (notebooks) from **production code** (Python modules). Notebooks are great for exploration but not for pipelines. |
|**Environment management**| Register **compute environments** as Azure ML Environments so the same Docker image is used in local dev, CI, and production training. |
-## Phase 4 — Model Training at Scale
+## Phase 4: Model Training at Scale
> Once you have a promising approach, move from interactive notebooks to **automated, parameterized training pipelines** that can run reliably on cloud compute.
@@ -189,8 +189,7 @@ The core pillars of MLOps on Azure are:
|**Named outputs**| Store all training outputs (model artifacts, metrics, logs) as named outputs in the pipeline for automatic tracking and lineage. |
|**Pinned environments**| Pin the **Azure ML Environment** (Docker image + conda/pip dependencies) so training is fully reproducible months later. |
-
-## Phase 5 — Model Evaluation & Validation
+## Phase 5: Model Evaluation & Validation
> A model that performs well on a held-out test set is not automatically ready for production. Validation must go beyond aggregate metrics.
@@ -222,8 +221,7 @@ The core pillars of MLOps on Azure are:
|**Champion comparison**| Always compare the new model against the **currently deployed champion** on the same test dataset, not just an absolute threshold. |
|**Model tagging & lineage**| Tag every registered model with training data version, pipeline run ID, Git commit SHA, and key metrics to ensure full traceability. |
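
The champion-comparison and tagging rows above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's implementation: `Candidate`, `build_model_tags`, `beats_champion`, the `metric.` tag prefix, and the `auc` metric name are all hypothetical, and no Azure ML SDK calls are made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical container for the lineage facts listed in the table row."""
    data_version: str
    pipeline_run_id: str
    git_commit_sha: str
    metrics: dict[str, float]


def build_model_tags(candidate: Candidate) -> dict[str, str]:
    """Flatten lineage facts into string tags (string values are an assumption)."""
    tags = {
        "data_version": candidate.data_version,
        "pipeline_run_id": candidate.pipeline_run_id,
        "git_commit_sha": candidate.git_commit_sha,
    }
    # Key metrics are stored with a fixed precision so tags stay comparable.
    for name, value in candidate.metrics.items():
        tags[f"metric.{name}"] = f"{value:.4f}"
    return tags


def beats_champion(candidate_metrics: dict[str, float],
                   champion_metrics: dict[str, float],
                   metric: str = "auc",
                   min_gain: float = 0.0) -> bool:
    """Return True only if the candidate beats the champion by more than min_gain.

    Both metric dictionaries are assumed to come from the same test dataset.
    """
    return candidate_metrics[metric] - champion_metrics[metric] > min_gain
```

The resulting dictionary could be supplied as tags when the model is registered. Setting `min_gain` above zero is one way to avoid promoting a model on a noise-level improvement.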
-
-## Phase 6 — Model Deployment & Serving
+## Phase 6: Model Deployment & Serving
> Getting a validated model to production in a reliable, secure, and observable way.
@@ -253,8 +251,7 @@ The core pillars of MLOps on Azure are:
|**Smoke & integration tests**| Run automated tests against the staging deployment in the CD pipeline before promoting to production. |
|**Registry-based deployments**| Reference model artifacts from the **Azure ML Model Registry** by name and version — never copy files manually. |
-
-## Phase 7 — Monitoring & Observability
+## Phase 7: Monitoring & Observability
> A deployed model is not "done." Its performance degrades over time as the real world changes. Monitoring is what keeps production models healthy.
@@ -277,8 +274,7 @@ The core pillars of MLOps on Azure are:
|**Baseline dataset**| Store a baseline dataset (training data or representative sample) at deployment time — Azure ML uses this as the reference distribution for drift calculations. |
|**Ground truth collection**| Collect and store ground truth labels wherever possible to compute actual model performance metrics in production. |
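
To illustrate what a drift calculation against a stored baseline can look like, the sketch below computes the Population Stability Index (PSI) for a single numeric feature. PSI is a common technique swapped in here for illustration; the text above does not say which metric Azure ML applies. The function name, the bin count, and the `eps` floor are assumptions.

```python
import math


def population_stability_index(baseline, current, bins=10, eps=1e-6):
    """PSI between a baseline sample and a current sample of one numeric feature.

    Bin edges are derived from the baseline only, so the baseline acts as the
    reference distribution. Both samples must be non-empty.
    """
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 when every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range go into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        total = len(values)
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / total, eps) for c in counts]

    p = proportions(baseline)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

Identical samples give a PSI of zero, and the value grows as the current distribution moves away from the baseline. Any alerting threshold would need to be chosen per feature and use case.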
-
-## Phase 8 — Retraining & Continuous Improvement
+## Phase 8: Retraining & Continuous Improvement
> Models must evolve with the data. The goal of this phase is a closed-loop system where monitoring signals feed back into the training pipeline automatically or with minimal human intervention.
@@ -350,7 +346,6 @@ PR / Push to main
> [!NOTE]
> Use **GitHub Actions** (preferred for open source / GitHub-native projects) or **Azure DevOps Pipelines** for CI/CD. Both have first-class Azure ML integration via the `azure/ml` GitHub Actions or `AzureML` DevOps tasks.
-
## Infrastructure as Code
> All Azure ML infrastructure in this repository is provisioned via **Terraform**. See the [terraform-infrastructure/README.md](terraform-infrastructure/README.md) for full deployment instructions.