Sometimes we need the processing power of an HDInsight cluster only for infrequent processes or events. In such cases, keeping the cluster constantly available is wasteful, and the scale-down options do not allow scaling down to zero. To optimize costs, you can reduce the availability of the HDInsight cluster, as long as the initial ramp-up time of 15 to 30 minutes is an acceptable price.
The deployment for this solution uses an HDInsight cluster with WASB storage (HDFS-compatible Azure Blob storage). Separating compute (the HDInsight cluster) from storage (WASB) has two important benefits:
- Storage is a cheaper resource than compute
- This approach enables us to destroy and then create the compute resource while maintaining the relevant data.
We encountered such a case with a startup we collaborated with: once a week, it received a large batch of files that required processing. For the rest of the week, the HDInsight cluster remained unused.
While investigating, we identified two solutions:
- One based on Azure Automation, which is relatively straightforward.
- One using App Services and exposing an API.
Azure Automation Based Solution
The first solution we came up with leverages Azure Automation together with webhooks to create and destroy the HDInsight cluster. You can use the following technical resources to create an Azure Automation Service that starts and stops an HDInsight cluster:
- Automate Provisioning of HDInsight Clusters with PowerShell and Azure Automation
- Importing Runbooks into Azure Automation Service
- PowerShell for controlling HDInsight cluster
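From the caller's side, triggering a runbook through a webhook amounts to a single HTTP POST. The following is a minimal sketch of preparing such a request in Node.js. The URL, the payload fields and the function name buildWebhookRequest are hypothetical placeholders, not values taken from the resources above.

```javascript
// Build the options and body for an HTTPS POST to an Azure Automation webhook.
// The webhook URL and the payload fields are hypothetical placeholders.
function buildWebhookRequest(webhookUrl, payload) {
  var url = new URL(webhookUrl);
  var body = JSON.stringify(payload);
  return {
    options: {
      method: 'POST',
      hostname: url.hostname,
      // Keep the query string; in this placeholder URL it carries the token.
      path: url.pathname + url.search,
      headers: {
        'Content-Type': 'application/json',
        'Content-Length': Buffer.byteLength(body)
      }
    },
    body: body
  };
}

// Example: ask a (hypothetical) runbook to create the cluster.
var request = buildWebhookRequest(
  'https://example.invalid/webhooks?token=PLACEHOLDER',
  { action: 'create', clusterName: 'my-cluster' }
);
// To send it:
//   var req = require('https').request(request.options);
//   req.end(request.body);
```

The request only starts the runbook. Nothing in this exchange reports whether the cluster was eventually created or destroyed.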
Pros
- Easily achievable out of the box and simple to ramp up
- No coding is required if using an existing, predefined runbook
- Azure Automation Service creation automatically creates an associated Service Principal with admin permissions
Cons
Although this solution is simple and easy to ramp up, it is difficult to monitor the execution of the automation jobs. A webhook is a one-way triggering mechanism that doesn’t provide a method for asserting the success of an execution. It is possible to infer the status of an ongoing job by monitoring the data it produces (logs, tables, blobs, etc.), but it is simpler and more straightforward to follow the next solution.
This solution does not enable full automation of the process. If you wish to monitor the creation and deletion processes and alert on success or failure, the next, more holistic approach is a better fit.
App Services Based Solution
This approach uses a web service that calls Azure Resource Manager (ARM) to perform actions on resources in the account. It can be applied in one of two ways:
- Manageable Solution – All management actions are exposed and consumable by an external REST API
- Encapsulated Solution – The overall orchestration of the resources is encapsulated and hidden from the user
Architecture
Manageable Solution
Encapsulated Solution
What’s Common Between Both Solutions?
Both solutions rely on the following parts:
Orchestrating Service App
Whether the service app is a Web App that exposes an API for external management or a Function App that automates the entire orchestration process, this solution relies upon a deployment of a Service App.
Service Principal
The implementation of the Service App uses a privileged Service Principal (SP) to manage the resources in play. The SP is an Azure Active Directory account that can be assigned elevated permissions and used by services to access the subscription.
// Authenticate as the Service Principal, then manage resources through ARM.
var msRestAzure = require('ms-rest-azure');
var resourceManagement = require('azure-arm-resource');
var ResourceManagementClient = resourceManagement.ResourceManagementClient;
// clientId, secret, domain and subscriptionId are expected to come from configuration.
msRestAzure.loginWithServicePrincipalSecret(clientId, secret, domain, function (err, credentials) {
  if (err) { return console.error(err); }
  var resourceClient = new ResourceManagementClient(credentials, subscriptionId);
  resourceClient.resources.createOrUpdate('...');
});
Queue
A queue that collects the jobs to run on the HDInsight cluster.
Proxy Function App
A function app that’s activated (as opposed to created or destroyed) only when the HDInsight cluster is running, and supplies all the jobs in the queue to the cluster for processing. This role could have been integrated into the orchestration service app, but keeping it separate removes the need to handle queue monitoring, triggering, querying and faults.
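As a rough illustration, the proxy can be pictured as a queue-triggered handler in the classic Azure Functions Node.js style, where the handler receives a context and the queue item. This is a sketch under assumptions, not the repository's code: createProxyHandler and submitJob are hypothetical names, and submitJob stands in for whatever call actually hands the job to the cluster.

```javascript
// Build a queue-triggered handler in the classic Azure Functions Node.js style
// (context, queueItem). submitJob is a hypothetical callback that would send
// the job to the HDInsight cluster; it is injected to keep the sketch testable.
function createProxyHandler(submitJob) {
  return function (context, queueItem) {
    try {
      submitJob(queueItem);
      context.log('Forwarded job to cluster:', queueItem);
      context.done();
    } catch (err) {
      // Passing the error to done() marks this invocation as failed.
      context.done(err);
    }
  };
}

// In a Function App the handler would be exported, for example:
//   module.exports = createProxyHandler(realSubmit);
// Local dry run with a fake context and a recording submitJob:
var forwarded = [];
var doneCalls = [];
var fakeContext = {
  log: function () {},
  done: function (err) { doneCalls.push(err || null); }
};
var handler = createProxyHandler(function (job) {
  if (!job.id) { throw new Error('job has no id'); }
  forwarded.push(job.id);
});
handler(fakeContext, { id: 'batch-1' }); // forwarded successfully
handler(fakeContext, {});                // rejected: reported through done(err)
```

Because the Functions runtime owns the queue trigger, the handler only deals with one job at a time, which is what keeps queue monitoring out of the orchestration service app.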
HDInsight Cluster
This cluster, which is not part of the deployment, is created on demand by the orchestration service app under one of two conditions:
- Manageable Solution – An API call was made to trigger the creation of the cluster
- Encapsulated Solution – The queue of jobs is not empty
Manageable Solution Differentiator
This approach is the more flexible of the two, but it also requires considerable orchestration to be built around it. The Management API performs all the actions, such as inserting items into the queue, starting or stopping the proxy app and creating or destroying the HDInsight cluster, but the API calls need to be orchestrated externally.
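To make "orchestrated externally" concrete, here is one possible call sequence a client of the Management API might follow. It is only a sketch: the api object, its method names and waitUntilIdle are hypothetical, each would wrap a REST call in practice, and real calls would be asynchronous rather than synchronous as shown.

```javascript
// One possible external orchestration of the management actions.
// The api object and its method names are hypothetical stand-ins for REST calls.
function processBatch(api, items) {
  items.forEach(function (item) { api.enqueue(item); });
  api.createCluster();
  api.startProxy();
  try {
    api.waitUntilIdle(); // hypothetical: block until the cluster has no running jobs
  } finally {
    // Always release the expensive resources, even if waiting failed.
    api.stopProxy();
    api.destroyCluster();
  }
}

// Dry run against a stub that records the order of the calls.
var calls = [];
var stubApi = {
  enqueue: function (item) { calls.push('enqueue:' + item); },
  createCluster: function () { calls.push('createCluster'); },
  startProxy: function () { calls.push('startProxy'); },
  waitUntilIdle: function () { calls.push('waitUntilIdle'); },
  stopProxy: function () { calls.push('stopProxy'); },
  destroyCluster: function () { calls.push('destroyCluster'); }
};
processBatch(stubApi, ['a', 'b']);
```

All of this sequencing, along with retries and failure handling, is the caller's responsibility in the manageable solution.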
Encapsulated Solution Differentiator
This solution uses a Function App with a scheduled trigger that runs once every X minutes. Each iteration checks the state of the resources in the system and performs the tasks necessary for items to be processed properly by the cluster. Another web app was added to this solution to provide a single point of access for inserting new items into the queue.
The orchestration app checks the following items:
- Number of items in the queue
- Running state of HDInsight cluster
- Provisioning state of HDInsight cluster
- Running state of Proxy function app
- Number of running jobs in HDInsight cluster
The orchestration app uses this information to decide which task to perform. In addition, actions such as stopping the proxy app and destroying the cluster are kept pending for about 15 minutes (configurable) to ensure they are still relevant before they are initiated.
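The decision step can be sketched as a pure function over the checked items. This is an illustration under assumptions rather than the repository's logic: the state field names, the action names and idleSinceMs (the time at which the system last became idle, which the orchestrator would need to track) are all hypothetical, and the cluster's running and provisioning states are simplified to booleans.

```javascript
var DEFAULT_GRACE_MS = 15 * 60 * 1000; // the roughly 15-minute delay, configurable

// Decide the next orchestration action from a snapshot of the system state.
function decideAction(state, nowMs, graceMs) {
  if (graceMs === undefined) { graceMs = DEFAULT_GRACE_MS; }

  // A creation or deletion is still in progress: do nothing this tick.
  if (state.clusterProvisioning) { return 'wait'; }

  if (state.queueLength > 0) {
    if (!state.clusterRunning) { return 'createCluster'; }
    if (!state.proxyRunning) { return 'startProxy'; }
    return 'none'; // everything is up; the proxy is feeding the cluster
  }

  // The queue is empty from here on.
  if (state.runningJobs > 0) { return 'none'; } // let in-flight jobs finish
  if (!state.clusterRunning && !state.proxyRunning) { return 'none'; }

  // Tear-down actions stay pending until the idle time reaches the grace period.
  if (nowMs - state.idleSinceMs < graceMs) { return 'wait'; }
  return state.proxyRunning ? 'stopProxy' : 'destroyCluster';
}

var MINUTE = 60 * 1000;
var now = 100 * MINUTE;
var busy = { queueLength: 3, clusterRunning: false, clusterProvisioning: false,
             proxyRunning: false, runningJobs: 0, idleSinceMs: now };
var idle = { queueLength: 0, clusterRunning: true, clusterProvisioning: false,
             proxyRunning: true, runningJobs: 0, idleSinceMs: now - 20 * MINUTE };

console.log(decideAction(busy, now)); // createCluster
console.log(decideAction(idle, now)); // stopProxy
```

On each scheduled run, the Function App would gather these values, call a function like this one and carry out the returned action. With the proxy stopped, a later run would return destroyCluster for the same idle state.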
About the Repository
The implementation in the repositories combines both approaches:
- The Management API exposes all the relevant APIs for maintaining the resources in the deployment
- The orchestrator automatically manages those resources as well
You can choose the approach most relevant for you by omitting the excess parts.
Opportunities for Reuse
This solution uses the ARM REST API to orchestrate the creation, deletion, activation and deactivation of an HDInsight cluster and Function Apps, but the method is applicable to any App Service or Node.js application that needs to orchestrate Azure subscription resources.