Part 3 – Unlock the Power of Azure Data Factory: A Guide to Boosting Your Data Ingestion Process
To see a complete introduction to this blog series, including links to all the other parts, please follow the link below:
Part 1 – Unlock the Power of Azure Data Factory: A Guide to Boosting Your Data Ingestion Process
Part 3 of the blog series will focus on:
- The YAML Pipeline Structure
- The Publish Process
- ARM Template Parameterization
- ADF ARM Template Deployment
All of the files referenced in this series are available in GitHub.
The YAML Pipeline Structure
In this section we will use Azure DevOps Pipelines to create the YAML pipeline that publishes the data factory artifacts and then deploys those artifacts to a specific environment (dev, staging, production). A later part of this blog series will describe how to accomplish the same publish and deployment using GitHub workflows and actions.
The YAML pipeline structure consists of user-defined variables and two kinds of stages: one to publish the artifacts and one to deploy them. To fully understand the publishing concept, refer to Part 2 of this blog series, under the section called “Publishing Concept for Azure Data Factory”. The main takeaway is that an instance of Azure Data Factory runs in “live mode” (also called “data factory mode”) and can at the same time have Git configured, so that branches of the ADF JSON files can be used for development. The publishing process creates an ARM template that can then be used to deploy the ADF. In our example there is a separate deployment stage for each environment (dev, staging, production).
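The stage layout described above can be sketched as follows. This is a minimal outline only: the stage names, variable name, and echo placeholders are illustrative assumptions, not the exact contents of the repository's pipeline file.

```yaml
# Illustrative skeleton of the pipeline: one publish stage followed by
# one deploy stage per environment. Names here are assumptions.
trigger:
  branches:
    include:
      - main

variables:
  - name: dataFactoryName   # assumed user-defined variable
    value: adf-demo

stages:
  - stage: Publish_Artifacts
    jobs:
      - job: Publish
        steps:
          - script: echo "Generate the ARM template from the ADF JSON files"

  - stage: Deploy_Dev
    dependsOn: Publish_Artifacts
    jobs:
      - job: Deploy
        steps:
          - script: echo "Deploy the ARM template to the dev environment"

  - stage: Deploy_Staging
    dependsOn: Deploy_Dev
    jobs:
      - job: Deploy
        steps:
          - script: echo "Deploy the ARM template to the staging environment"

  - stage: Deploy_Production
    dependsOn: Deploy_Staging
    jobs:
      - job: Deploy
        steps:
          - script: echo "Deploy the ARM template to the production environment"
```

Chaining the stages with `dependsOn` ensures each environment is deployed only after the previous one succeeds.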
Defining variables provides a convenient way to reuse values in multiple parts of the pipeline. As a reminder, do not put secret variables in your YAML file. Instead, set them in the pipeline settings UI, define them in variable groups, or use the Azure Key Vault task to retrieve secrets at runtime.
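As a sketch, the snippet below shows a plain variable alongside a variable group, plus the Azure Key Vault task for pulling secrets at runtime. The group name, service connection name, and vault name are assumptions for illustration only.

```yaml
variables:
  - name: environmentName            # plain user-defined variable
    value: dev
  - group: adf-deployment-secrets    # assumed variable group; can be linked to Key Vault

steps:
  # Fetches secrets from an assumed Key Vault and exposes them as
  # secret pipeline variables for later steps in this job.
  - task: AzureKeyVault@2
    inputs:
      azureSubscription: 'my-service-connection'  # assumed service connection name
      KeyVaultName: 'my-key-vault'                # assumed vault name
      SecretsFilter: '*'
```

Secrets retrieved this way are masked in logs and are never stored in the YAML file itself.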
Check out the full post and the remaining series in the Healthcare and Life Sciences Tech Community here.