Moving a project to C++ named Modules

Cameron DaCamara

Modules generate a lot of hype, and perhaps some hesitation, when it comes to using them in real projects. The general blocker tends to be build support, but even with good build support there is a distinct lack of useful resources on practices for moving projects to named modules (not just header units). In this blog we will take a small project I created, analyze its components, draft a plan for modularizing it, and execute that plan.

Overview

Tools used

For the purposes of this project, we will be using the following tools:

  • CMake – Version: 3.20.21032501-MSVC_2. Note: this is the installed version of CMake which comes with Visual Studio 2019.
  • Visual Studio 2019 – Version: 16.11.

Project description

I remember when I was younger, I used to love doing kid things like eating terrible fast food, but going to these restaurants had an additional perk: the play places! One of my favorite things to do was go to the ball pit, dive in, and make a giant splash of color.

[Image: ball pit with playground slide]

I shudder to think of going into one nowadays, but I have not forgotten how much fun they were. I have also recently become very inspired by OneLoneCoder on YouTube and his series on programming simple physics engines. I decided I would try to take this simple physics engine and make something a little bit fun and a lot more colorful, introducing “Ball Pit!”:

[Image: “Ball Pit!”]

“Ball Pit!” is a quite simple program built using the following discrete components:

  • OneLoneCoder PixelGameEngine (PGE) – Drives graphics.
  • A simple physics engine for managing all the objects on screen.
  • A data structure related to handling collisions between objects, a quad-tree.
  • A world object to contain our beautiful orbs.
  • Utilities such as common types and functions on those types.
  • The main game object which is responsible for the primary game loop and polling user input.

Ball Pit! in C++ without modules

Since we established a basic design layout in the previous section, let us see what we can produce using C++20 without any modules whatsoever. Without further ado, here is the code in all its #include glory: Ball Pit! Without modules. The easiest way to build this project is to use Visual Studio’s open folder support.

Alternatively you can do the following (in a VS2019 developer command prompt):

$ mkdir build & cd build & cmake -G"Visual Studio 16 2019" -Ax64 ..\

Once CMake has generated the solution for you, you can open it in Visual Studio 2019, use the familiar F5 loop, and off you go!

Traditional C++ Structure

Let us talk briefly about the traditional project structure of this code. We have the following, familiar, breakdown:

ball_pit/
├─ include/
├─ src/

As you might expect the include/ directory is almost a mirror of some files under src/. You also end up with a sizeable set of includes in our primary ball-pit.cpp to pull all the pieces together:

#include "bridges/pge-bridge.h"

#include "physics/physics-ball.h"
#include "physics/physics-engine.h"
#include "physics/quad-tree.h"
#include "util/basic-types.h"
#include "util/enum-utils.h"
#include "util/random-generator.h"
#include "world/world.h"

You might notice that these includes directly reflect the design we set out to have:

  • PGE for graphics: "bridges/pge-bridge.h"
  • Physics engine: "physics/physics-engine.h"
  • Quad-tree: "physics/quad-tree.h"
  • World object: "world/world.h"
  • Utilities: "util/*"
  • Main game: (the current source file: ball-pit.cpp)

Since we made the decision to use header files you will notice that we get some declarations like this:

inline RandomNumberGenerator& random_generator()

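There is a strong desire not to implement this simple function in its own .cpp file, but if you forget the critical inline keyword or, even worse, mark the function as static, you will not get the behavior you expect. Without inline, including the header from more than one translation unit typically ends in multiple-definition errors at link time. With static, each translation unit silently gets its own private copy of the function.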
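To make the inline point concrete, here is a minimal header-style sketch. The body of RandomNumberGenerator is not shown in this post, so the members below (a std::mt19937 engine with a fixed seed, next(), and draws()) and the check_shared_generator() helper are assumptions for illustration only, not the project's actual implementation.

```cpp
// Sketch of a header-style accessor. The class members are hypothetical.
#include <cassert>
#include <random>

class RandomNumberGenerator
{
public:
    // Hypothetical member: draws the next raw value from the engine.
    unsigned next() { ++draws_; return static_cast<unsigned>(engine_()); }
    int draws() const { return draws_; }

private:
    std::mt19937 engine_{ 42u }; // fixed seed keeps the sketch deterministic
    int draws_ = 0;
};

// `inline` lets every translation unit that includes this header share one
// definition, and therefore one function-local `generator` object.
// With `static` instead, each translation unit would get its own copy of
// the function and its own independent generator.
inline RandomNumberGenerator& random_generator()
{
    static RandomNumberGenerator generator;
    return generator;
}

// Hypothetical check: two calls hand back the very same object.
void check_shared_generator()
{
    RandomNumberGenerator& a = random_generator();
    RandomNumberGenerator& b = random_generator();
    a.next();
    assert(&a == &b);
    assert(b.draws() >= 1);
}
```

Calling check_shared_generator() from any function in the program should pass silently, because both references name the single function-local static.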

Another thing which I like to do on my projects is separate 3rd party headers from the rest of the project using these “bridge” header files. The reason is so that I can easily control warning suppression/isolated requirements for that header. The PGE header is isolated into its own bridge called pge-bridge.h.

Finally, for projects which utilize #include as a code sharing mechanism, I like to employ the idea that each header file should stand completely on its own. If a header uses something like std::vector, it cannot rely on that container being introduced through some other header; it must include it itself. This is good practice: it keeps header maintenance minimal as you move headers around and use them in more places.

Ungluing from #include

At the beginning it was mentioned that we are using CMake as our configuration system but, as of publishing, CMake’s support for modules is still experimental. What we can do is generate build system output for a build system which does support modules: MSBuild’s! All we need to do is tell MSBuild that there are module interfaces in this project and “Presto!” we have a modules-compatible project! By default, MSBuild will key off any source files with a .ixx extension to automatically support named modules—exactly what we want! Now, how do we get there?

If we examine the include/ tree we get a surprisingly promising idea of what module interfaces we need:

ball_pit/
├─ include/
│  ├─ bridges/
│  │  ├─ pge-bridge.h
│  ├─ physics/
│  │  ├─ physics-ball.h
│  │  ├─ physics-engine.h
│  │  ├─ physics-utils.h
│  │  ├─ quad-tree.h
│  ├─ util/
│  │  ├─ basic-types.h
│  │  ├─ enum-utils.h
│  │  ├─ random-generator.h
│  │  ├─ stopwatch.h
│  ├─ world/
│  │  ├─ world.h

It is common for mature projects to have a similar structure and breakdown of components and it makes sense for maintainability reasons. As a goal for modularizing this project let us aim to remove the entire directory tree of include/ and take advantage of modules as much as possible. Let us do exactly that by introducing some new files into the directory tree which reflects our header file layout (making them empty for now):

ball_pit/
├─ modules/
│  ├─ bridges/
│  │  ├─ pge-bridge.ixx
│  ├─ physics/
│  │  ├─ physics-ball.ixx
│  │  ├─ physics-engine.ixx
│  │  ├─ physics-utils.ixx
│  │  ├─ quad-tree.ixx
│  ├─ util/
│  │  ├─ basic-types.ixx
│  │  ├─ enum-utils.ixx
│  │  ├─ random-generator.ixx
│  │  ├─ stopwatch.ixx
│  ├─ world/
│  │  ├─ world.ixx

Now the process of moving everything over to using modules begins!

Starting small…

When tackling a project of any size you want to start as small as you possibly can. In the case of “Ball Pit!” I started with include/util/enum-utils.h because it did not depend on anything besides an STL header. The first thing you need to do is add the content to your module interface:

module;
#include <type_traits>
export module Util.EnumUtils;

template <typename T>
concept Enum = std::is_enum_v<T>;

template <Enum E>
using PrimitiveType = std::underlying_type_t<E>;

template <Enum E>
constexpr auto rep(E e) { return PrimitiveType<E>(e); }

This is almost a 1-to-1 copy-paste of the header but with the following exceptions:

  • Our STL headers are injected into the global module fragment (the region between module; and export module ...).
  • We have given a proper name to our module: Util.EnumUtils. Note: the . separated names do not indicate any filesystem structure.
  • We no longer need header include guards.

There is one last thing missing: we did not actually export anything! Since all these names are used around the project, we need to export everything, and the easiest way to export lots of declarations at once is to use the export { ... } syntax. Take a look:

module;
#include <type_traits>
export module Util.EnumUtils;

export
{

template <typename T>
concept Enum = std::is_enum_v<T>;

template <Enum E>
using PrimitiveType = std::underlying_type_t<E>;

template <Enum E>
constexpr auto rep(E e) { return PrimitiveType<E>(e); }

} // export

The next logical step for us is to replace any instance of #include "util/enum-utils.h" with import Util.EnumUtils;. This part is largely mechanical and, following the guidance around mixing import and #include, I made sure to place every import after all #includes. Finally, we add this new interface to the CMakeLists.txt, configure, build, and run again. Things should run the same as before except that we are one step closer to modularizing the project!

Choosing visibility

Named modules are all about defining the surface area of your API. Now that we have a tool which allows us to hide implementation details that would otherwise be unnecessary for consumers, we can start to think about what the accessible parts of the API should be. Let us look at modularizing include/util/random-generator.h. In this file we have the following declarations:

enum class RandomSeed : decltype(std::random_device{}()) { };

template <std::integral I>
using IntDistribution = std::uniform_int_distribution<I>;

template <std::floating_point I>
using RealDistribution = std::uniform_real_distribution<I>;

class RandomNumberGenerator
{
   ...
};

inline RandomNumberGenerator& random_generator()
{
   ...
}

Of these declarations the ones we use outside of the header are IntDistribution, RealDistribution, and random_generator() (not even the class name directly). As such we can define the module like so:

module;
#include <concepts>
#include <random>
export module Util.RandomGenerator;

import Util.EnumUtils;

enum class RandomSeed : decltype(std::random_device{}()) { };

export
template <std::integral I>
using IntDistribution = std::uniform_int_distribution<I>;

export
template <std::floating_point I>
using RealDistribution = std::uniform_real_distribution<I>;

class RandomNumberGenerator
{
    ...
};

export
RandomNumberGenerator& random_generator()
{
    ...
}

Notice that we do not even need to export the declaration of the class RandomNumberGenerator. We do not need its name; we only need its functionality, and we can prevent users from creating extra instances of it by allowing its use through random_generator() only.

Furthermore, we no longer need random_generator() to be marked as inline because its definition now lives in exactly one translation unit, the module interface, rather than being copied into every translation unit that includes a header. Do not be afraid to put compiled code in an interface: it is its own translation unit and obeys the rules of compiled code.

3rd party pain

In C++ we deal with sharing code all the time and a lot of the time that code has a distinctive style, compiler requirements, default warning settings, etc. When we move code into a modules world, and in particular 3rd party code, we need to take some things into consideration: what part of the library do we want to expose? What runtime requirements are in the library if it is header only? Do we want to “seal” off bad parts of the library? With modules we start to have answers to these questions based on the requirements of our project. Integrating 3rd party library functionality into modularized projects is one of the most interesting parts of using modules because modules give us tools we never had before to deal with ODR (One Definition Rule) and name resolution. In this section we will focus on modularizing the include/bridges/pge-bridge.h.

The OneLoneCoder PixelGameEngine is a nice library if you are just starting out exploring games programming. It is easy to integrate into projects (because it is a single header file) and the interfaces are simple–which plays to our advantage in deciding what parts of the library we want to expose. In “Ball Pit!” we use the following functionality from PGE:

  • olc::PixelGameEngine — For the main program.
  • olc::Key — For user input.
  • olc::Pixel — For coloring pixels.
  • olc::vf2d/olc::vi2d — Standard vector classes (float and int respectively).
  • olc::BLACK, olc::WHITE, olc::BLUE, and olc::RED — Color constants.

We can, by default, export each of the above with a using-declaration:

module;
#pragma warning(push)
#pragma warning(disable: 4201) // nonstandard extension used: nameless struct/union
#pragma warning(disable: 4245) // 'argument': conversion from 'int' to 'uint8_t', signed/unsigned mismatch
#include "olcPixelGameEngine.h"
#pragma warning(pop)
export module Bridges.PGE;

export
namespace olc
{
    // For game.
    using olc::PixelGameEngine;
    using olc::Key;

    // For basic types.
    using olc::Pixel;
    using olc::vf2d;
    using olc::vi2d;

    // Allow using the multiply operator from olc::v2d_generic.
    using olc::operator*;
}

We use using-declarations because we do not want the module to own all these objects/functions. By injecting the names through a using-declaration their linkage remains tied to the global module, so we can separately compile them in src/3rd_party/olcPixelGameEngine.cpp as before.

You will immediately notice that the color constants are mysteriously missing. This is because these constants are declared static in the header file, which gives them internal linkage, so we cannot export them directly. The full reason is buried in standardese, but it is simpler to remember that you cannot export an internal linkage entity (i.e. one declared static). The way to get around this is to wrap them in a function which has module linkage:

export
namespace olc
{
    ...
    // Note: Because these color constants are defined to be static in the header they cannot be
    // directly exported.  Instead we export their values through module-owned functions.
    namespace ModuleColors
    {
        auto Black()
        {
            return olc::BLACK;
        }

        auto White()
        {
            return olc::WHITE;
        }

        auto Blue()
        {
            return olc::BLUE;
        }

        auto Red()
        {
            return olc::RED;
        }
    }
    ...
}

Once we have these functions, we need to replace any instance of olc::COLOR with its respective call to our exported color function.

And that is it! We have successfully exported exactly what we need from PGE for our “Ball Pit!” app! Just as before, you add this to the CMakeLists.txt and replace #include "bridges/pge-bridge.h" with import Bridges.PGE;.

Polishing with modules

Once you have gone through the exercise of modularizing more and more of the project you might find that your main program begins to reflect the header file version:

import Bridges.PGE;

import Physics.Ball;
import Physics.Engine;
import Physics.QuadTree;
import Util.BasicTypes;
import Util.EnumUtils;
import Util.RandomGenerator;
import World;

Dandy! Like header files, modules also let us group common sets of modules together into a “package”. To understand what I am talking about, let us look at a header file equivalent of grouping common functionality. Here is what a grouping of all the headers under include/physics/* might look like:

include/physics/physics.h

#ifndef PHYSICS_H
#define PHYSICS_H

#include "physics/physics-ball.h"
#include "physics/physics-engine.h"
#include "physics/physics-utils.h"
#include "physics/quad-tree.h"

#endif // PHYSICS_H

The problem, of course, is that while this is convenient and you do not need to think about which specific file to include for your current project, you end up paying the cost of every header file in the package whether you use it or not. It flies in the face of C++’s core concept: pay for what you use. With the introduction of C++20 modules we no longer have this problem because modules do next to zero work when you import them, so we can safely create the following interface without negatively impacting the compile time of consumers:

modules/physics/physics.ixx

export module Physics;

export import Physics.Ball;
export import Physics.Engine;
export import Physics.QuadTree;
export import Physics.Utils;

We can also do the same for anything under Util.*. This leads us to a rather, I think, respectable looking ball-pit.cpp:

import Bridges.PGE;

import Physics;
import Util;
import World;

All together now

It was a little bit of a journey getting here, and there were lessons along the way. I will not dillydally any further. Here is the complete, modularized version of “Ball Pit!”: ball_pit. You can check out the code, configure, and build it the same as we covered earlier using Visual Studio 2019 version 16.11.

There is one thing I want to mention, because I can all but guarantee it is on everybody’s mind: what is the build throughput? With modules there is an up-front cost in building our interfaces. With the old inclusion model, we did not have to build our include files explicitly (only implicitly). We end up building more up front, but the result is that we can REPL our main program and its components much, much faster. Here is a snapshot of the difference:

Compiling ball-pit.cpp:

Without modules    With modules
3.55275s           0.15413s

Note: these times were an average of 10 runs. You can see the results yourself by observing the c1xx.dll in the build log (left in for comparisons).

Yep, that is a real ~23x speedup (3.55275s / 0.15413s ≈ 23.05). If you are developing a game, that kind of compile time can make a dramatic difference when you want to quickly test changes or recover from mistakes, like I often do :).

Closing

The process of using named modules in complex projects can be time consuming, but this type of refactor pays off in both reducing development costs associated with recompiling and code hygiene. Named modules give us so much more than simply better compile times and in the above we have only scratched the surface of what is possible. Stay tuned for more modules educational content from us in the future!

We urge you to go out and try using Visual Studio 2019/2022 with Modules. Both Visual Studio 2019 and Visual Studio 2022 Preview are available through the Visual Studio downloads page!

As always, we welcome your feedback. Feel free to send any comments through e-mail at visualcpp@microsoft.com or through Twitter @visualc. Also, feel free to follow me on Twitter @starfreakclone.

If you encounter other problems with MSVC in VS 2019/2022 please let us know via the Report a Problem option, either from the installer or the Visual Studio IDE itself. For suggestions or bug reports, let us know through DevComm.