June 10th, 2014

The Great C Runtime (CRT) Refactoring

(This is the first of two articles on changes to the C Runtime (CRT) in the Visual Studio “14” CTP. This article discusses the major architectural changes to the libraries; the second article will enumerate the new features, bug fixes, and breaking changes.)

For the past seven releases of Visual Studio (2002, 2003, 2005, 2008, 2010, 2012, and 2013), the Visual C++ libraries have been versioned and each versioned set of libraries is independent of other versioned sets of libraries. For example, a C++ program built with Visual C++ 2010 using the DLL runtime libraries will depend on msvcr100.dll and msvcp100.dll, while a C++ program built with Visual C++ 2013 will depend on msvcr120.dll and msvcp120.dll.

On the one hand, this model of introducing differently named and completely independent sets of libraries each release makes it a bit easier for us to add new features and fix bugs. We can make breaking changes, e.g. to fix nonconforming or buggy behavior, at any time without worrying about breaking existing software components that depend on already-released versions of these libraries.

However, we have frequently heard from you, our customers, that this model is burdensome and in some cases makes it difficult to adopt new versions of Visual C++ due to dependencies on modules built with an older version of Visual C++ or the need to support plugins built with a particular version of Visual C++.

This problem has grown especially acute in recent years for two reasons. First, we have accelerated the release schedule of Visual Studio in order to make new features available more frequently. Second, it has become very important to support devices smaller than desktops or laptops, like phones, and accumulating multiple copies of very similar libraries on such devices is less than ideal.

Even for us, this model of introducing new versions of the libraries can be painful at times. It makes it very expensive for us to fix bugs in already-released versions of the libraries because we are no longer actively working in the codebases for those versions, so fixes must be individually backported and tested. The result is that we usually fix only serious security vulnerabilities in old versions of the libraries. Other bugs are generally fixed only for the next major version.

We can’t fix the past: the versions of these libraries that have already been released are not going away. But we are going to try to make improvements to this experience for the future. This is a major undertaking and will take some time, but we plan to make gradual process, starting with…

The Refactoring of the CRT

The CRT sits at the bottom of the Visual C++ libraries stack: the rest of the libraries depend on it and practically all native modules depend on it as well. It contains two kinds of stuff: [1] the C Standard Library and various extensions, and [2] runtime functionality required for things like process startup and exception handling. Because the CRT sits at the bottom of the stack, it is the logical place to start the process of stabilizing the libraries.

Starting with Visual Studio “14,” we will stop releasing new versions of the CRT with each release of Visual Studio. Whereas before we would have released msvcr140.dll in this forthcoming release, then msvcr150.dll in the next release, we will instead release one new CRT in Visual Studio “14” then update that version in-place in subsequent releases, maintaining backwards compatibility for existing programs.

We are also working to unify the CRTs used for different platforms. In Visual Studio 2013, we built separate “flavors” of the CRT for different platforms. For example, we had separate CRTs for desktop apps, Windows Store apps, and Windows Phone apps. We did so because of differences in which Windows API functions are available on different platforms.

In Windows Store and Windows Phone apps, only a subset of the Windows API is available for use, so we need to implement some functions differently and cannot implement other functions at all (for example, there’s no console in Windows Store and Windows Phone apps, so we don’t provide the console I/O functionality in the CRT). The CRT for desktop apps must run on all supported operating systems (in Visual Studio 2013 this included Windows XP) and must provide the full set of functionality, including legacy functionality.

In order to unify these different CRTs, we have split the CRT into three pieces:

  1. VCRuntime (vcruntime140.dll): This DLL contains all of the runtime functionality required for things like process startup and exception handling, and functionality that is coupled to the compiler for one reason or another. We may need to make breaking changes to this library in the future.

  2. AppCRT (appcrt140.dll): This DLL contains all of the functionality that is usable on all platforms. This includes the heap, the math library, the stdio and locale libraries, most of the string manipulation functions, the time library, and a handful of other functions. We will maintain backwards compatibility for this part of the CRT.

  3. DesktopCRT (desktopcrt140.dll): This DLL contains all of the functionality that is usable only by desktop apps. Notably, this includes the functions for working with multibyte strings, the exec and spawn process management functions, and the direct-to-console I/O functions. We will maintain backwards compatibility for this part of the CRT.

While I’ve named the release DLLs in the list, there are also equivalent debug DLLs and release and debug static CRT libraries for each of these. The usual lib files (msvcrt.lib, libcmt.lib, etc.) are built such that the newly refactored CRT is a drop-in replacement for the old CRT at build-time, so long as /nodefaultlib is not used.

While we have retained the version number in the DLL for this CTP, we plan to remove it from the AppCRT and DesktopCRT before the final release of Visual Studio “14,” since we will be updating those DLLs in-place. Finally, we are still working on the final packaging of functionality, so we may move things among the DLLs before the final release.

Windows Store and Windows Phone apps will be able to use the functionality from the VCRuntime and the AppCRT only; desktop apps will be able to use all of that functionality plus the functionality from the DesktopCRT. In this first Visual Studio “14” CTP, all apps depend on all three parts of the refactored CRT; this is merely a temporary state of affairs that will be fixed eventually.

The Problem of Maintainability

One of the biggest problems that we had to solve in order to consider stabilizing the libraries in this way was the problem of maintainability. The CRT is a very old codebase, with many source files dating back to the 1980s. In many parts of the code, optimization techniques that were valid and useful decades ago not only obfuscated the code and made it difficult to maintain, but also hampered the modern compiler’s ability to optimize the code. In other areas, years of bolted-on features and bug fixes had turned once-beautiful C code into a hideous maintenance nightmare. If we were going to consider stabilizing the libraries so that we could update them in-place, we’d need to improve the maintainability first, otherwise we’d incur great cost to fix bugs and make improvements later.

The “best” example of this maintainability problem could be found in the old implementation of the printf family of functions. The CRT provides 142 different variations of printf, but most of the behavior is the same for all of the functions, so there are a set of common implementation functions that do the bulk of the work. These common implementation functions were all defined in output.c in the CRT sources(1). This 2,696 line file had 223 conditionally compiled regions of code (#ifdef, #else, etc.), over half of which were in a single 1,400 line function. This file was compiled 12 different ways to generate all of the common implementation functions. Even with the large number of tests that we have for these functions, the code was exceedingly brittle and difficult to modify.

This isn’t merely a theoretical problem. In Visual Studio 2013, we added many of the C99 functions that were previously missing (see Pat’s blog post from last year). However, there were a number of things that we were unable to implement. Two of the most conspicuous missing features were [1] the snprintf function and [2] the format string enhancements like the z and t length modifiers for the size_t and ptrdiff_t types. It was late in the product cycle when we started looking at implementing these, and decided we simply could not implement them with confidence that we weren’t breaking anything.

So, as part of this great refactoring of the CRT, we have done an enormous amount of work to simplify and improve the quality of the code, so that it is easier to add features and fix bugs in the future. We have converted most of the CRT sources to compile as C++, enabling us to replace many ugly C idioms with simpler and more advanced C++ constructs. The publicly callable functions are still declared as C functions, of course (extern "C" in C++), so they can still be called from C. But internally we now take full advantage of the C++ language and its many useful features.

We have eliminated most of the manual resource management in the code through the introduction of several special-purpose smart pointer and handle types. Enormous functions have been split into smaller, maintainable pieces. We have eliminated 75%(2) of the conditional compilation preprocessor directives (#ifdef, #else, etc.) by converting internal implementation details to use C++ features like templates and overloading. We have converted most of the CRT source files to use a common coding style.

As part of this work, we’ve completely rewritten the core implementations of the printf and scanf functions (now with no #ifdefs!). This has enabled us to implement the remaining C99 features for the stdio library, to improve correctness checks in the library, and to fix many conformance bugs and quirks. Just as importantly, this work has enabled us to discover and fix substantial performance issues in the library.

Before this refactoring, the sprintf functions, which write formatted data to a character buffer, were implemented by wrapping the result buffer in a temporary FILE object and then deferring to the equivalent fprintf function. This worked and produced the correct result, but it was exceedingly inefficient. When writing characters to a FILE we need to be careful to handle many cases like buffer exhaustion, end-of-line conversions, and character conversions. When writing characters to a string, we should simply be able to write through and increment the result pointer. After the refactoring, we were easily able to identify this performance problem and, more importantly, fix it. The sprintf functions are now up to 8 times faster than they were in previous releases.

This is merely one example of where we’ve done major work and how that work has helped us to improve the quality of the library. In the next article we will enumerate all of the major features, bug fixes, and breaking changes to the CRT in the Visual Studio “14” CTP, similar to what Stephan wrote last week for the STL.

What Next?

We are nearing completion of the CRT refactoring. Undoubtedly there are bugs, and we encourage you to try out the Visual Studio “14” CTP and report any bugs that you find on Microsoft Connect. If you report bugs now, there’s a really good chance that we can fix them before the final release of Visual Studio “14.” We’ve already gotten a few bug reports; thank you to those of you who reported them!

We are investigating opportunities for similar stabilization efforts with other libraries. Given that the separately-compiled STL components (msvcp140.dll) are also very commonly used, we are considering our options for a similar stabilization of that functionality.

Note that near-term we are only considering stabilization of the separately-compiled code. We do not plan to make stability guarantees about any C++ Standard Library types or any inline code in the C++ headers. So, for example, if you pass a std::vector to a function, both the caller and callee will still need to be compiled with the same STL headers and options. There are very long-term efforts to try to come up with a solution to this more general problem; for example, see Herb Sutter’s recent C++ Standardization Committee proposal N4028: Defining a Portable C++ ABI.

James McNellis (james.mcnellis@microsoft.com)
Senior Software Development Engineer, Visual C++ Libraries


(1) We ship most of the sources for the CRT with Visual Studio; you can find them in the Visual Studio installation directory under VCcrtsrc.

(2) In Visual Studio 2013 there are 6,830 #if, #ifdef, #ifndef, #elif, and #else directives in the sources that we ship with the product; in the Visual Studio “14” CTP there are 1,656. These numbers do not include directives in headers and they do include the STL source files which are largely untouched by this refactoring effort, so this isn’t a perfect measurement, but it’s indicative of the amount of cleanup that’s been done.

Category
C++

0 comments

Discussion are closed.