 > Ensure your pipelines are well-organized and easy to understand.

@@ -78,7 +70,7 @@ Last updated: 2025-04-21
 |**Organized Layout**| Arrange activities in a logical sequence and avoid overlapping lines. | - Place activities in a left-to-right or top-to-bottom flow to visually represent the data flow. <br/> - Group related activities together and use containers for better organization. |
 |**Error Handling and Logging**| Include error handling and logging activities to capture and manage errors. | - Add a Web Activity to log errors to a monitoring system. <br/> - Use Try-Catch blocks to handle errors gracefully and ensure the pipeline continues running. |
 > Use parameters to make your pipelines more flexible and easier to manage.

 |**Best Practice**|**Description**|**Example**|
@@ -146,8 +138,8 @@ graph TD
 |**Global Parameters**| Use global parameters for values that are used across multiple pipelines. | - Define a global parameter for a storage account name used in various pipelines. <br/> - Create a global parameter for a common API key used across multiple pipelines. <br/> - Use a global parameter for a base URL that is referenced in multiple activities. |
 |**Parameterize Datasets**| Parameterize datasets to handle different data sources or destinations. | - Create a dataset with a parameterized file path to handle different file names dynamically. <br/> - Use parameters in datasets to switch between different databases or tables. <br/> - Define parameters for connection strings to dynamically connect to different data sources. |

-###Incremental Loading
->
+## Incremental Loading
+

 > Implement incremental data loading to improve efficiency.

 |**Best Practice**|**Description**|**Example**|
@@ -157,7 +149,7 @@ graph TD
 |**Delta Loads**| Perform delta loads to update only the changed data instead of full loads. | - Use a query to fetch only the rows that have changed since the last load. <br/> - Implement a mechanism to track changes, such as a version number or a change flag. |
 |**Partitioning**| Partition large datasets to improve performance and manageability. | - Partition data by date or another logical key to facilitate incremental loading. <br/> - Use partitioned tables in your data warehouse to improve query performance and manageability. |

-####Use Timestamps
+### Use Timestamps

 > Implement incremental loading using timestamps to load only new or changed data.

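The timestamp approach can be sketched outside ADF to make the control flow concrete. The example below is only an illustration using an in-memory SQLite database: the `watermark` table and its `LastLoadedTimestamp` column mirror the names used in this guide, while `source_orders`, `target_orders`, and `modified_at` are hypothetical names chosen for the demo. Each numbered comment maps loosely onto the ADF activity that would do the same job.

```python
import sqlite3


def incremental_load(conn: sqlite3.Connection) -> int:
    """Copy rows newer than the stored watermark, then advance the watermark."""
    cur = conn.cursor()
    # 1. Read the last loaded timestamp (the role of a Lookup activity).
    (last_loaded,) = cur.execute(
        "SELECT LastLoadedTimestamp FROM watermark WHERE TableName = 'orders'"
    ).fetchone()
    # 2. Select only rows modified after the watermark (the Copy Data source query).
    rows = cur.execute(
        "SELECT id, amount, modified_at FROM source_orders WHERE modified_at > ?",
        (last_loaded,),
    ).fetchall()
    if not rows:
        return 0
    # 3. Upsert the new or changed rows into the destination.
    cur.executemany(
        "INSERT OR REPLACE INTO target_orders (id, amount, modified_at) VALUES (?, ?, ?)",
        rows,
    )
    # 4. Advance the watermark (the role of the Stored Procedure activity).
    new_watermark = max(row[2] for row in rows)
    cur.execute(
        "UPDATE watermark SET LastLoadedTimestamp = ? WHERE TableName = 'orders'",
        (new_watermark,),
    )
    conn.commit()
    return len(rows)


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
        CREATE TABLE target_orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
        CREATE TABLE watermark (TableName TEXT PRIMARY KEY, LastLoadedTimestamp TEXT);
        INSERT INTO watermark VALUES ('orders', '2025-01-01T00:00:00');
        INSERT INTO source_orders VALUES
            (1, 10.0, '2024-12-31T23:00:00'),
            (2, 20.0, '2025-01-02T08:00:00'),
            (3, 30.0, '2025-01-03T09:30:00');
        """
    )
    first = incremental_load(conn)   # picks up rows 2 and 3
    second = incremental_load(conn)  # nothing new since the watermark moved
    (watermark,) = conn.execute("SELECT LastLoadedTimestamp FROM watermark").fetchone()
    return first, second, watermark
```

Timestamps are stored as ISO 8601 strings here so that plain string comparison orders them correctly; a real source system would normally use a native datetime column.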
@@ -175,8 +167,8 @@ graph TD
 - After loading the data, update the watermark table with the latest timestamp.
 - Use a Stored Procedure activity to update the `LastLoadedTimestamp` in the watermark table.

-####Change Data Capture (CDC)
->
+### Change Data Capture (CDC)
+

 > Utilize CDC to capture and load only the changes made to the source data.

 1. **Enable CDC on Source Table**:
@@ -189,8 +181,8 @@ graph TD
 - Use a ForEach activity to process each change.
 - Inside the ForEach activity, use Copy Data activities to apply the changes to the destination.

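As a rough illustration of the "process each change" step, the sketch below replays a list of change records against a destination table. The record shape `(operation, id, name)` and the `customers` table are assumptions made for the demo, not the layout of any real CDC change table, and the `for` loop merely stands in for the ForEach activity.

```python
import sqlite3

# Hypothetical change records: 'I' = insert, 'U' = update, 'D' = delete.
CHANGES = [
    ("I", 3, "Carol"),
    ("U", 1, "Alice Smith"),
    ("D", 2, None),
]


def apply_changes(conn: sqlite3.Connection, changes) -> None:
    """Apply each captured change to the destination, in order."""
    for op, row_id, name in changes:  # stands in for the ForEach activity
        if op == "I":
            conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (row_id, name))
        elif op == "U":
            conn.execute("UPDATE customers SET name = ? WHERE id = ?", (name, row_id))
        elif op == "D":
            conn.execute("DELETE FROM customers WHERE id = ?", (row_id,))
        else:
            raise ValueError(f"unknown CDC operation: {op!r}")
    conn.commit()


def demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
    apply_changes(conn, CHANGES)
    return conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

Applying changes in capture order matters: an update that arrives before its insert, or after its delete, would otherwise leave the destination out of step with the source.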
-####Delta Loads
->
+### Delta Loads
+

 > Perform delta loads to update only the changed data instead of full loads.

 1. **Track Changes**:
@@ -203,8 +195,8 @@ graph TD
 - Use a Copy Data activity to load only the changed data.
 - After loading, reset the `ChangeFlag` to 0.

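The flag-based delta load can be sketched in a few lines. This is a minimal illustration assuming a `ChangeFlag` column as described in this guide; the `source_items` and `target_items` tables and the `qty` column are hypothetical. Note that the sketch resets the flag only for the rows it actually loaded, so a row flagged while the load is running is not silently cleared.

```python
import sqlite3


def delta_load(conn: sqlite3.Connection) -> int:
    """Load only flagged rows, then clear the flag on exactly those rows."""
    cur = conn.cursor()
    # Fetch only rows marked as changed (the Copy Data source query).
    changed = cur.execute(
        "SELECT id, qty FROM source_items WHERE ChangeFlag = 1"
    ).fetchall()
    # Upsert them into the destination.
    cur.executemany(
        "INSERT OR REPLACE INTO target_items (id, qty) VALUES (?, ?)", changed
    )
    # Reset the flag for the rows that were just loaded.
    cur.executemany(
        "UPDATE source_items SET ChangeFlag = 0 WHERE id = ?",
        [(row[0],) for row in changed],
    )
    conn.commit()
    return len(changed)


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_items (id INTEGER PRIMARY KEY, qty INTEGER, ChangeFlag INTEGER);
        CREATE TABLE target_items (id INTEGER PRIMARY KEY, qty INTEGER);
        INSERT INTO source_items VALUES (1, 5, 1), (2, 7, 0), (3, 9, 1);
        INSERT INTO target_items VALUES (1, 4), (2, 7);
        """
    )
    first = delta_load(conn)   # rows 1 and 3 are flagged
    second = delta_load(conn)  # flags were reset, so nothing to load
    target = conn.execute("SELECT id, qty FROM target_items ORDER BY id").fetchall()
    return first, second, target
```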
-####Partitioning
->
+### Partitioning
+

 > Partition large datasets to improve performance and manageability.

 1. **Partition Your Data**:
@@ -217,8 +209,8 @@ graph TD
 - Use a ForEach activity to process each partition.
 - Inside the ForEach activity, use a Copy Data activity to load data for each partition.

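A partition-by-partition load follows the same shape: list the partitions, then copy each one separately. The sketch below assumes monthly partitions derived from a hypothetical `event_date` column; the `source_events` and `target_events` tables are likewise invented for the example, and the loop plays the part of the ForEach activity.

```python
import sqlite3


def list_partitions(conn: sqlite3.Connection) -> list:
    """Return the distinct month keys (YYYY-MM) present in the source."""
    rows = conn.execute(
        "SELECT DISTINCT substr(event_date, 1, 7) AS month FROM source_events ORDER BY month"
    ).fetchall()
    return [row[0] for row in rows]


def load_partition(conn: sqlite3.Connection, month: str) -> int:
    """Copy one month's rows to the destination and report how many were copied."""
    rows = conn.execute(
        "SELECT id, event_date FROM source_events WHERE substr(event_date, 1, 7) = ?",
        (month,),
    ).fetchall()
    conn.executemany("INSERT INTO target_events (id, event_date) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def demo() -> dict:
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_events (id INTEGER PRIMARY KEY, event_date TEXT);
        CREATE TABLE target_events (id INTEGER PRIMARY KEY, event_date TEXT);
        INSERT INTO source_events VALUES
            (1, '2025-01-05'), (2, '2025-01-20'), (3, '2025-02-11');
        """
    )
    # One load per partition, as a ForEach activity would do.
    return {month: load_partition(conn, month) for month in list_partitions(conn)}
```

Because each partition is loaded independently, a failed month can be retried on its own without reloading the others.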
-###Error Handling and Monitoring
->
+## Error Handling and Monitoring
+

 > Set up robust error handling and monitoring to quickly identify and resolve issues.

 |**Best Practice**|**Description**|**Example**|
@@ -228,7 +220,7 @@ graph TD
 |**Alerts and Notifications**| Set up alerts and notifications to monitor pipeline runs and failures. | - Use Azure Monitor to create alerts for failed pipeline runs and send email notifications. <br/> - Configure alerts to trigger SMS notifications for critical pipeline failures. <br/> - Set up a Logic App to send Slack notifications when a pipeline fails. |
 |**Custom Logging**| Implement custom logging to capture detailed error information. | - Use a Web Activity to log errors to an external logging service or database. <br/> - Implement an Azure Function to log detailed error information and call it from the pipeline. <br/> - Use a Set Variable activity to capture error details and write them to a log file in Azure Blob Storage. |

-####a. **Use If Condition Activity**
+### a. **Use If Condition Activity**

 1. **Create a Pipeline**:
 - Open Microsoft Fabric and navigate to Azure Data Factory.
 > Apply security best practices to protect your data.

 |**Best Practice**|**Description**|**Example**|
@@ -298,8 +290,8 @@ graph TD
 |**Network Security**| Use virtual networks and private endpoints to secure data access. | - Configure ADF to use a private endpoint for accessing data in a storage account. <br/> - Set up a virtual network (VNet) to isolate and secure ADF resources. <br/> - Use Network Security Groups (NSGs) to control inbound and outbound traffic to ADF. |
 |**Audit Logs**| Enable auditing to track access and changes to ADF resources. | - Use Azure Monitor to collect and analyze audit logs for ADF activities. <br/> - Enable diagnostic settings to send logs to Azure Log Analytics, Event Hubs, or a storage account. <br/> - Regularly review audit logs to detect and respond to unauthorized access or changes. |

-###Use Azure Key Vault
->
+## Use Azure Key Vault
+

 > Store sensitive information such as connection strings, passwords, and API keys in Azure Key Vault to enhance security and manage secrets efficiently.

 |**Best Practice**|**Description**|**Example**|
@@ -309,7 +301,7 @@ graph TD
 |**Secure Access**| Use managed identities to securely access Key Vault secrets. | - Configure ADF to use its managed identity to retrieve secrets from Key Vault. <br/> - Enable managed identity for ADF and grant it access to Key Vault secrets. <br/> - Use managed identities to avoid storing credentials in code or configuration files. |
 |**Rotate Secrets**| Regularly rotate secrets to enhance security. | - Update secrets in Key Vault periodically and update references in ADF. <br/> - Implement a process to rotate secrets automatically using Azure Automation or Logic Apps. <br/> - Notify relevant teams when secrets are rotated to ensure they update their configurations. |

-####Store Secrets
+### Store Secrets

 > Store sensitive information such as connection strings, passwords, and API keys in Key Vault.
 > Configure access policies to control who can access secrets.

 1. **Set Up Access Policies in Key Vault**:
@@ -346,7 +338,7 @@ graph TD
 - Define access policies to allow only specific users or applications to retrieve secrets.
 - Example: Grant access to specific roles such as `DataFactoryContributor` for managing secrets.

-####Secure Access
+### Secure Access

 > Use managed identities to securely access Key Vault secrets.

@@ -355,8 +347,8 @@ graph TD
 - In the Key Vault, add an access policy to grant the Data Factory managed identity access to the required secrets.
 - Example: Grant `Get` and `List` permissions to the managed identity.

-####Rotate Secrets
->
+### Rotate Secrets
+

 > Regularly rotate secrets to enhance security.

 1. **Update Secrets in Key Vault**:
@@ -369,10 +361,9 @@ graph TD
 - Ensure that relevant teams are notified when secrets are rotated.
 - Example: Use Logic Apps to send email notifications when secrets are updated.

-###Source Control
+## Source Control

 > Benefits of Git Integration: <br/>
->
 > - **Version Control**: Track and audit changes, and revert to previous versions if needed. <br/>
 > - **Collaboration**: Multiple team members can work on the same project simultaneously. <br/>
 > - **Incremental Saves**: Save partial changes without publishing them live. <br/>
@@ -406,8 +397,8 @@ graph TD
 - Use pull requests to review and merge changes from feature branches to the collaboration branch.
 - Collaborate with team members through code reviews and comments.

-###Resource Management
->
+## Resource Management
+

 > Optimize resource usage to improve performance and reduce costs.

 |**Best Practice**|**Description**|**Example**|
@@ -417,8 +408,8 @@ graph TD
 |**Cost Management**| Implement cost management practices to control expenses. | - Use Azure Cost Management to monitor and manage ADF costs. <br/> - Set budgets and alerts to avoid unexpected expenses. <br/> - Review and optimize the use of Data Integration Units (DIUs) to balance cost and performance. |
 |**Resource Tagging**| Tag resources for better organization and cost tracking. | - Apply tags to ADF resources to categorize and track costs by project or department. <br/> - Use tags to identify and manage resources associated with specific business units. <br/> - Implement tagging policies to ensure consistent resource tagging across the organization. |

-###Testing and Validation
->
+## Testing and Validation
+

 > Regularly test and validate your pipelines to ensure they work as expected.

 |**Best Practice**|**Description**|**Example**|
@@ -428,8 +419,8 @@ graph TD
 |**Validation Activities**| Use validation activities to check data quality and integrity. | - Add a validation activity to verify the row count or data format after a Copy Data activity. <br/> - Implement data quality checks to ensure data accuracy and completeness. <br/> - Use custom scripts or functions to validate complex data transformations. |
 |**Automated Testing**| Automate testing processes to ensure consistency and reliability. | - Use Azure DevOps pipelines to automate the testing of ADF pipelines. <br/> - Schedule automated tests to run after each deployment or code change. <br/> - Integrate automated testing with CI/CD pipelines to ensure continuous validation. |

-###Documentation
->
+## Documentation
+

 > Maintain comprehensive documentation for your pipelines.

 |**Best Practice**|**Description**|**Example**|
@@ -439,8 +430,8 @@ graph TD
 |**Annotations**| Use annotations within ADF to provide context and explanations. | - Add annotations to activities to describe their function and any important details. <br/> - Use comments to explain complex logic or business rules within the pipeline. <br/> - Highlight key parameters and settings with annotations for easy reference. |
 |**Knowledge Sharing**| Share documentation with the team to ensure everyone is informed. | - Use a shared platform like SharePoint or Confluence to store and share documentation. <br/> - Conduct regular training sessions to keep the team updated on best practices. <br/> - Encourage team members to contribute to and update the documentation. |

-###Regular Updates
->
+## Regular Updates
+

 > Keep your pipelines and ADF environment up to date.

 |**Best Practice**|**Description**|**Example**|
@@ -450,8 +441,8 @@ graph TD
 |**Dependency Management**| Keep dependencies up to date to avoid compatibility issues. | - Update linked services and datasets to use the latest versions of data sources. <br/> - Regularly review and update external dependencies like libraries and APIs. <br/> - Ensure compatibility between ADF and other integrated services. |
 |**Security Patches**| Apply security patches promptly to protect against vulnerabilities. | - Monitor security advisories and apply patches to ADF and related services. <br/> - Implement a patch management process to ensure timely updates. <br/> - Conduct regular security assessments to identify and address vulnerabilities. |

-###Performance Tuning
->
+## Performance Tuning
+

 > Continuously monitor and tune performance.

 |**Best Practice**|**Description**|**Example**|
@@ -471,6 +462,14 @@ graph TD
 - [A categorized list of Azure Data Factory tutorials by scenarios](https://learn.microsoft.com/en-us/azure/data-factory/data-factory-tutorials)
 - [Full list of Data Factory trainings](https://learn.microsoft.com/en-us/training/browse/?expanded=azure&products=azure-data-factory)