Part 4 – Unlock the Power of Azure Data Factory: A Guide to Boosting Your Data Ingestion Process
This is the next post in the series Unlock the Power of Azure Data Factory: A Guide to Boosting Your Data Ingestion Process. It also overlaps with, and is included in, the series on YAML Pipelines. All code snippets and final templates can be found on my GitHub repository TheYAMLPipelineOne. For the actual data factory, we will leverage my adf_pipelines_yaml_ci_cd repository.
After reading parts 1-3 of Unlock the Power of Azure Data Factory, you may be wondering how to take what was provided and scale it to an enterprise deployment. Terminology and expectations are key, so let’s outline what we would like to see from an enterprise-scale deployment:
- Write once, reuse across projects.
- Individual components can be reused.
- Limited manual intervention.
- Easily updated.
- Centralized definition.
Depending on your organization's pipeline and DevOps maturity, this may sound daunting. Have no fear, as we will walk you through how to achieve this with YAML templates for Azure Data Factory. By the end of this piece, you should be well equipped to create a working pipeline for Azure Data Factory in a matter of minutes.
To support the goals outlined above for enterprise-scale deployments, I recommend keeping your YAML templates in a separate repository that resides outside of your Data Factory. This checks the boxes for a centralized definition, write once and reuse across projects, easy updates, and individually reusable components. For more context on this, check out my post on Azure DevOps Pipelines: Practices for Scaling Templates.
Each Data Factory will have a dedicated CI/CD pipeline that references the separate repository holding the YAML templates. Azure DevOps supports this natively. Furthermore, it is not unheard of for larger-scale organizations to have a “DevOps Team” or a team responsible for pipeline deployments. If this is the case in your organization, you can think of this other team as “owning” the centralized repository.
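As a rough illustration of the setup described above, a Data Factory repository's pipeline can declare the template repository as a repository resource and then pull a template from it. This is only a minimal sketch: the project name, repository name, template path, and parameter below are hypothetical placeholders, not the actual names used in TheYAMLPipelineOne or adf_pipelines_yaml_ci_cd.

```yaml
# azure-pipelines.yml inside a Data Factory repository.
# All names below are hypothetical and only illustrate the pattern.
trigger:
  branches:
    include:
      - main

resources:
  repositories:
    - repository: templates            # local alias, referenced after the @ below
      type: git                        # an Azure Repos Git repository
      name: MyProject/yaml-templates   # hypothetical <project>/<repository>
      ref: refs/heads/main             # branch of the template repository to use

stages:
  # Pull a stage template from the centralized repository.
  - template: stages/adf_deploy_stage.yml@templates   # hypothetical template path
    parameters:
      dataFactoryName: adf-example-dev                # hypothetical parameter
```

Because every Data Factory pipeline points at the same template repository, a change made there flows to each consuming pipeline on its next run, which is what makes the definition centralized and easy to update.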
Check out the full post and the rest of the series in the Healthcare and Life Sciences Tech Community.