September 9th, 2024

Askia, an Ipsos company, achieved faster, reproducible builds with vcpkg

Augustin Popa
Senior Product Manager

vcpkg is a free and open-source C/C++ package manager maintained by Microsoft and the C++ community that runs on Windows, macOS, and Linux. Over the years we have heard from companies using vcpkg to manage dependencies at enterprise-scale. In this interview, I spoke to Dimitri Rochette, a lead developer at Askia, an Ipsos company, about vcpkg’s impact on their team.


After switching to vcpkg and CMake, Askia was able to reduce build times for their dependencies by using the vcpkg binary caching feature and reduce their code size by about 300,000 lines by eliminating project files, scripts, and copied dependencies. Switching to vcpkg also allowed them to ensure consistent and reproducible builds across multiple platforms, access a vast array of libraries, streamline their onboarding process as their development team grew, and standardize the use of microservices in a multi-repository environment.

Q: Can you tell us about your company and team?

Askia, a software company with two decades of experience, specializes in the development of tools for survey creation, response collection, and data analysis. In 2020, Ipsos, a global leader in market research, acquired Askia with the aim of collaboratively developing a platform that aligns with their ambitious goals. At the time of acquisition, Askia was a team of 30. However, with substantial investment from Ipsos, the team expanded rapidly to 90 members within just two years.

Q: Can you tell us about your C++ development environment?

All our teams utilize the latest Visual Studio 2022. For C++ development, we employ two compilers: MSVC and GCC 13. Our projects are all built with CMake and vcpkg. We’ve also implemented additional setup scripts in PowerShell 7 (pwsh) to ensure portability between Windows and Linux. These scripts handle the setup of developer and CI machines, GitHub tokens, NuGet caching, the initial cloning of vcpkg, and database creation for clusters or unit tests. We’re quite satisfied with CMake, despite the initial setup being challenging. The integration with Visual Studio continues to improve, and maintaining a single project for two platforms has been a significant advantage.

Within our project scope, the majority of our code is written in C++. We initially started with C++17 and have now transitioned to C++23. We strive to stay updated with the latest stable compilers for MSVC and GCC. Occasionally, we face challenges waiting for implementations to be available on both compilers. C++ code is on our critical path, and we aim to minimize latency as much as possible. We also have numerous services in C# on .NET 8, which is the most commonly used language company-wide. We have a significant amount of T-SQL code as we heavily rely on SQL Server for our C++ services. For the frontend, we took a chance on Svelte before its 1.0 release, and we are extremely satisfied with the outcome today.

Our development environment is based on Windows, while our production target is Ubuntu 22.04, housed within Docker containers. Despite this, all our code is compatible with Windows and is compiled using MSVC. This allows us to exchange services between both platforms.

Our Docker container is structured in four layers, all built on Ubuntu 22.04:

  • Base Image: This includes additional apt repositories, security measures, and an ODBC driver.
    • Dev: This layer is equipped with all the necessary tools for developers. It’s used as a workspace by developers and by the continuous integration (CI) system to execute unit tests.
      • Static Analysis: This layer is used by the CI system for static analysis and includes SonarCloud and its prerequisites.
    • Production: This is the image that runs the release version. It’s used by the CI system to install binaries of the Dev build on a clean image and is then uploaded for Kubernetes consumption.

We utilize the same images for both developers and CI, leveraging vcpkg binary caching (with GitHub serving as a NuGet server). This means that each build by a developer or CI either downloads or builds and uploads the result to the shared cache. We have 274 packages, with the main ones being downloaded 300 times per month. The ability to save even a single compilation of packages like OpenSSL or Python is greatly appreciated by the team and reduces our CI build times and costs.
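As a rough illustration of the setup described above: vcpkg reads its binary cache configuration from the `VCPKG_BINARY_SOURCES` environment variable, where a NuGet feed is declared as `nuget,<uri>,<mode>`. The sketch below builds that value in Python for a GitHub Packages feed. The organization name `example-org` is a placeholder, and the helper name `binary_sources` is made up for this example; Askia's own scripts are written in PowerShell and are not shown in the interview.

```python
def binary_sources(owner: str, mode: str = "readwrite") -> str:
    """Build a VCPKG_BINARY_SOURCES value for a GitHub Packages NuGet feed."""
    if mode not in ("read", "write", "readwrite"):
        raise ValueError(f"unsupported mode: {mode}")
    # GitHub Packages exposes one NuGet feed per user or organization.
    feed = f"https://nuget.pkg.github.com/{owner}/index.json"
    # "clear" drops the default sources so only the shared feed is used.
    return f"clear;nuget,{feed},{mode}"


if __name__ == "__main__":
    # "example-org" is a placeholder, not Askia's real organization.
    print(binary_sources("example-org"))
```

With `readwrite`, a machine that finds a package in the feed downloads it, and a machine that has to build it uploads the result for everyone else, which matches the "download, or build and upload" behavior described above. The feed still needs NuGet credentials configured separately.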

Our current codebase is relatively compact, comprising approximately 40,000 lines of C++ code, which are divided across various repositories for services and components. Our legacy codebase is significantly larger, containing more than 2 million lines of C++ code. The introduction of CMake and vcpkg to our main repository resulted in a reduction of about 10% (or 300,000 lines), which included project files, scripts, and copied dependencies.

We utilize GitHub as our continuous integration system. Given that we have scripts to configure developer tools and machines, as well as standard Docker images, we leverage these in our CI environment. The same image used by developers for their work is also used in GitHub CI. Additionally, on GitHub, we employ the other layers of our Docker container for building our release version and executing unit tests.

Q: What kinds of C++ libraries do you consume?

For first-party private/internal libraries, we have established a vcpkg registry with our shared libraries. Those packages are allowed to use any other library provided by vcpkg. This includes a wrapper for asynchronous messaging (librabbitmq), a wrapper for logging (spdlog), a configuration file with an override system (nlohmann-json), our HTTP server (boost-beast), plugins for our HTTP server (boost-asio, curl, fmt), a JWT token checker (jwt-cpp, cppcodec, OpenSSL), and Python scripts (boost-python, python3). The most crucial package for us is our CMake scripts. They define all output folders and all parameters per platform (compiler, C++ version, debug/release, Windows SDK, UNICODE, _UNICODE, NOMINMAX, etc.), and they provide our CMake function for tagging versions with the git hash, version files, and branch name, scripts to create a zip or NuGet package as build output, and scripts to copy dependency DLLs.
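To make the registry idea concrete, here is a minimal sketch of how a project might point vcpkg at a private git registry alongside the public one. It generates the contents of a `vcpkg-configuration.json` file in Python. The repository URL, package names, and baseline placeholders are all hypothetical and are not taken from Askia's setup; in a real file each baseline is a commit SHA.

```python
import json


def registry_config(private_repo, private_baseline, default_baseline, packages):
    """Return a vcpkg-configuration.json document as a Python dict."""
    return {
        # Public packages (boost, fmt, spdlog, ...) resolve from here.
        "default-registry": {
            "kind": "git",
            "repository": "https://github.com/microsoft/vcpkg",
            "baseline": default_baseline,
        },
        # Only the listed package names resolve from the private registry.
        "registries": [
            {
                "kind": "git",
                "repository": private_repo,
                "baseline": private_baseline,
                "packages": sorted(packages),
            }
        ],
    }


if __name__ == "__main__":
    config = registry_config(
        private_repo="https://github.com/example-org/vcpkg-registry",  # hypothetical
        private_baseline="<private-registry-commit-sha>",  # placeholder
        default_baseline="<vcpkg-commit-sha>",  # placeholder
        packages=["example-logging", "example-cmake-scripts"],  # hypothetical
    )
    print(json.dumps(config, indent=2))
```

Pinning both baselines is part of what makes builds reproducible: every developer and CI run resolves the same port versions until a baseline is deliberately moved.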

As for open-source dependencies, we directly utilize boost, fmt, gtest, magic-enum, nanodbc, nlohmann-json, spdlog, pybind11, librabbitmq, openssl, and libcurl.

We also have a significant amount of MFC and some third-party controls in our legacy codebase. In our new codebase, LaunchDarkly is used by C# projects, so it may be used in C++ in the future.

Q: How were you managing C++ dependencies before vcpkg?

In our legacy stack, which is based on the Windows GUI in MFC, we had a variety of methods for managing dependencies. These included:

  • Including them as source
  • Including them as source with modifications
  • Using public NuGet packages with MSBuild projects
  • Including them as pre-compiled binaries

The system was functional, but it was challenging to keep track of what was used and in which version. The MSVC ABI breaks between Visual Studio releases prior to 2015 served as useful reminders to consider upgrading our NuGet packages. Our initial inventory of repository candidates for the CMake/vcpkg migration took some time, but all the standard ones were relatively easy to identify.

Q: When did your team move to vcpkg and why did you ultimately choose it?

I had previously used vcpkg at another company to develop a Windows/Linux Qt tool. The use of CMake/vcpkg was a lifesaver, eliminating the need to manage both Visual Studio and Xcode projects. When I joined the company in 2019, I found the build process to be quite cumbersome. We were heavily reliant on Jenkins, with numerous scripts and file copies pre/post-build. Many projects required relative builds of other projects. My first proof of concept was converting some projects to CMake, which involved converting NuGet dependencies to vcpkg. The integration of cmakeconfig.json into Visual Studio 2019 facilitated user acceptance. We maintained both sln + CMake in the repository for about a year in case we encountered any issues.

The acquisition by Ipsos brought significant changes. The need for scalability necessitated major modifications. We had to develop new microservices, and as the team became more familiar with the advantages of CMake/vcpkg, we made vcpkg mandatory for all new projects. The primary goal at this stage was to expedite the onboarding of new developers. We aimed to minimize the steps required to launch and debug a unit test as much as possible. We have iteratively improved this process, and today, the steps take less than half a day.

The steps include:

  1. Installing Visual Studio with C++
  2. Installing the ODBC driver
  3. Cloning vcpkg
  4. Creating a GitHub token for binary caching

With these steps, all our C++ projects can be cloned, compiled, and debugged in unit tests. If you install Docker and create the reference image, you can target both Windows and Linux.
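A setup script can check part of this list automatically. The sketch below is an illustration only, written in Python rather than the PowerShell the team actually uses, and it covers just the last two steps, since the Visual Studio and ODBC installers are platform-specific. The environment variable name `VCPKG_CACHE_TOKEN` and the function name `missing_steps` are assumptions made for the example; `VCPKG_ROOT` and the `bootstrap-vcpkg` scripts are real parts of a vcpkg checkout.

```python
import os
from pathlib import Path

# Hypothetical variable name; the interview does not say where the token is stored.
TOKEN_VAR = "VCPKG_CACHE_TOKEN"


def missing_steps(vcpkg_root, env):
    """Return the onboarding steps that still appear to be outstanding."""
    missing = []
    # A vcpkg checkout ships these bootstrap scripts at its root.
    scripts = ("bootstrap-vcpkg.bat", "bootstrap-vcpkg.sh")
    if not any((Path(vcpkg_root) / name).is_file() for name in scripts):
        missing.append("clone vcpkg")
    if not env.get(TOKEN_VAR):
        missing.append("create a GitHub token for binary caching")
    return missing


if __name__ == "__main__":
    # Fall back to a local "vcpkg" folder if VCPKG_ROOT is not set.
    root = os.environ.get("VCPKG_ROOT", "vcpkg")
    for step in missing_steps(root, os.environ):
        print("TODO:", step)
```

Passing the environment in as a plain mapping keeps the check easy to exercise in a unit test without touching the real machine configuration.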

So, yes, vcpkg has been instrumental in our onboarding process. We have tripled the team size in a few years. Standardizing the tools helps developers working on one repository to easily transition to another.

Q: What is your overall impression of vcpkg?

vcpkg has been a game-changer in my 25 years of working with C++. It has revolutionized the way we manage dependencies by ensuring consistent and reproducible builds across multiple platforms. With a vast array of libraries at our disposal, we have the flexibility to create or extend libraries as needed. The caching feature significantly speeds up our build process. Moreover, vcpkg has streamlined our onboarding process and standardized the use of microservices in a multi-repository environment.

Learn More About vcpkg

If you want to learn more about vcpkg, check out our website at vcpkg.io and read the vcpkg overview in our documentation.

If you have a story you would like to share with us about your experiences with vcpkg, feel free to contact us at vcpkg@microsoft.com. You can submit bug reports in our GitHub issue tracker or make feature requests in our discussion forum.


Author

Augustin Popa
Senior Product Manager

Product manager on the Microsoft C++ team, currently working on vcpkg.
