Microservice Orchestration with Fleet on Azure
Micro-service architecture, the concept of providing a large scale service as a collection of small, highly available services is becoming increasingly popular in today’s modern cloud service deployments. However, with the rise of Docker and other containerization and orchestration technologies, it can be difficult to get started.
In this post, we’ll explore how we collaborated with two Italian tech companies for a week to show how Fleet, a distributed init system and CoreOS, an operating system designed for clusters can be deployed on Microsoft Azure as a collection of high-availability ‘Dockerized’ microservices.
Docebo.com and TOK.tv are two Italian tech companies whose infrastructure currently runs on Amazon Web Services. Docebo.com is an E-Learning platform and hosts instances of its application for various organizations. TOK.tv is the voice social platform designed to help soccer teams offer their fans a unique engagement experience and powers the Real Madrid and Juventus apps, integrating its Social Button and allowing fans of the clubs to talk with voice, send stadium sounds and take social selfies.
Docebo.com already uses Docker containers for its customer deployments and has customized orchestration infrastructure for their instances. However they eventually want to run their infrastructure on CoreOS and will be looking at Fleet for controlling its deployments on Azure.
Tok.tv at the time did not have any containerized services and wanted an orchestration solution that was transparent and flexible enough to run their app services. In particular Tok.tv is primarily composed of a PHP web application API and a real-time voice service.
Overview of the Solution
We worked face to face with the partners to create a prototype solution of deploying Dockerized microservices across a cluster of CoreOS machines using fleet. Fleet is a distributed init system which can deploy arbitrary processes across a cluster of machines. Often it’s used to deploy Docker containers. It contains settings to ensure that a service runs on separate nodes within the cluster to ensure high availability. To provision CoreOS we used an Azure CoreOS resource manager template which allows easy creation of a CoreOS cluster with an external load balancer.
We used a private Docker registry to deploy the Docker images for service deployment. Private registries are useful when you don’t want to expose the internals of your service infrastructure as open source images on the Docker Hub. Although Docker Hub hosts private registries for a monthly fee, it is also easy to host your own private registry on premesis or on other cloud platforms such as Microsoft Azure. These registries can hold images of all your deployment services, from support services like HTTP servers to application-level images for things like websites and background workers.
The private registry server is deployed from the Docker Hub and its image store is backed by Azure blob storage. This makes our Docker registry ‘portable’ allowing us to deploy services using Fleet by pushing images to our private repository from anywhere with access to Azure storage, including your local machine:
Figure 1. Service Deployment via Azure Storage-backed Private Docker Registry
In the image above, we use our no-cost single Docker Hub repository to store the Docker image of our registry server. The server image is built with our Azure storage credentials and can be run from anywhere with an Internet connection.
This enables the architecture in Figure 1, which allows us to deploy our registry as a Fleet global service, which basically means one registry server runs on every server. This is because Docker requires images in private repositories be tagged with the name of their host. By hosting a light-weight registry server on each CoreOS instance we greatly simplify deploying images.
Once our registry service is deployed across the cluster with Fleet, we can deploy any image within that registry as a service on the cluster. The graphic above depicts the scenario of a developer, running a registry on their local machine and pushing an image to Azure storage via the server. The service is then restarted and the new image is deployed to the appropriate nodes.
Using an external image store allows us to run a private Docker image repository without having to worry about SSL and authenticating users to our repository. We also get the advantage of the redundancy and backup features of Azure Storage.
CoreOS uses etcd, a demon service that is essentially a distributed key-value store that allows services to announce their presence by publishing keys with a Time-to-live (TTL). Each deployed application instance is deployed with an accompanying ‘announcer’ service which is simply a bash script that periodically writes its host ip address and port number for it’s corresponding web server.
Figure 2. Service Discovery with ‘Sidekick’ Services & Etcd
Fleet allows us to specifically define announcer or ‘sidekick’ services, which are guaranteed to run on the same machine as the service it monitors. When the accompanying application service goes down so does the sidekick service. Because etcd has time-to-live values for each key etcd keeps up-to-date in the case of application failures, updates and restarts since the sidekick will stop writing to etcd.
Figure 2 shows each router instance subscribes to etcd and it uses published keys in etcd to build its routing and load balancing rules via confd templating. For example, App1 exists on the first two VMs and between the two instances their corresponding ‘sidekick’ services update the service directory key App1 with the key-value pairs @1: 192.168.1.1:3000 and @2 192.168.1.2:3000. Routing and load-balancing within the cluster is done via Nginx, a popular high performance web server. Because the nginx router is subscribed to etcd it automatically rebuilds its routing template. Requests going for app1 will route to either of these two machines in a round-robin load-balancing manner. The same goes for App2 – this design allows for any number of applications to be deployed backed by any number of instances.
Here’s what the App1 service file looks like:
# example-app@.service # the '@' in the file name denotes that this unit file is a model Description=Example High-Availabilty Web App After=router.service [Service] EnvironmentFile=/etc/environment ExecStartPre=/usr/bin/docker pull localhost:5000/example-app:latest ExecStart=/usr/bin/docker run --name example -p 3000:3000 localhost:5000/example-app:latest ExecStop=/usr/bin/docker stop example ExecStopPost=/usr/bin/docker kill example ExecStopPost=/usr/bin/docker rm example [X-Fleet] Conflicts=example-app@*.service
This Unit File refers to the docker image example-app in our private azure docker registry. To start this unit we simply do:
# upload the service model to fleet fleetctl submit example-app@.service # start an instance of the service fleetctl start firstname.lastname@example.org
The X-Fleet Conflicts tag in the unit file instructs fleet that we don’t want more than one of this unit, an instance of the service, running on the same machine in order to have high availability.
Routing & Load Balancing
The entire cluster sits behind a load balancer and has one public virtual IP address. This public IP address points to an Azure load balancer, which serves requests to any of our nodes.
One advantage of using an external load balancer for the cluster is that you only need one public IP address, rather than a unique one for each instance. Cloud service providers such as Microsoft Azure have limits to public IP allocation per subscription. Also, as machines go up and down, the public IP address doesn’t change and simplifies DNS zone file settings.
Figure 3. Azure Load Balancer spreads loads to each internal NginX instance
When a user request comes in, because we have the router running and listening on port 80 and port 443 on each node, we can handle the request no matter what node it comes to. Further, because of etcd service discovery each router has knowledge about where all the services are located and can route the request the appropriate container.
Figure 4. Internal NginX instance routes request and load balances between app containers
This means that the actual container, which provides the service, doesn’t necessarily need to be on the machine that the Azure load balancer selects, as show in Figure 5.
Figure 5. The Machines Selected by Azure Load Balancer doesn’t need to actually run requested App
Furthermore, the Azure load balancer load balances requests amongst our router, but each router service unit load balances the containers for each service.
This prototype solution we built during the hackfest will be the basis of Tok.tv and Docebo’s microservice architecture.
How to Deploy High Availability Apps to Azure using CoreOS with Fleet & etcd – A getting started guide & code sample for Docker Orchestration w/ Fleet
Azure Storage driver for Docker Registry Server
Opportunities for Reuse
The Fleet bootstrapper guide can be reused for anyone who wants to use portable, private Docker registries along with Fleet to deploy high-availability apps across a CoreOS cluster.