June 18th, 2014

C Runtime (CRT) Features, Fixes, and Breaking Changes in Visual Studio 14 CTP1

(This is the second of two articles on changes to the C Runtime (CRT) in the Visual Studio “14” CTP. The first article, The Great C Runtime (CRT) Refactoring, covered the major architectural changes to the CRT; this second article enumerates the new features, bug fixes, and breaking changes.)

This list covers all of the major changes to the CRT that were made after the Visual Studio 2013 RTM and which are present in the Visual Studio “14” CTP. For a similar list covering changes to the C++ Standard Library, see Stephan’s article from June 6, C++14 STL Features, Fixes, And Breaking Changes In Visual Studio 2014. The changes are grouped by the main CRT header with which they are associated, with one large change to the printf and scanf functions covered first.

In the Visual Studio “14” CTP we have fully implemented the C99 Standard Library, with the exception of any library features that depend on compiler features not yet supported by the Visual C++ compiler (notably, <tgmath.h> is not implemented). There are undoubtedly some remaining conformance issues. We know of a few, including that _Exit is missing and wcstok has the wrong signature, and we are working to fix these. If you find a bug or a missing feature, please report it on Microsoft Connect. If you report bugs now, there is a very good chance that we’ll be able to fix them before RTM.

Please note that the documentation on MSDN has not yet been updated to include any of the changes covered in these blog posts.

Fixing the Wide String Format and Conversion Specifiers

Updated April 7, 2015:  This feature was reverted in Visual Studio 2015 CTP6; it will not be present in Visual Studio 2015.  We had many customers express concern about this change and we discovered several new issues when working with static libraries.

The biggest “breaking change” to the CRT in the Visual Studio “14” CTP is a change to how the wide string formatted I/O functions (e.g., wprintf and wscanf) handle the %c, %s, and %[] (scanset) format and conversion specifiers.

The wide string formatted I/O functions were first implemented in the Visual C++ CRT in the early 1990s. They were implemented such that the %c, %s, and %[] specifiers mapped to a wide character or string argument. For example, this was the behavior (and remained the behavior through Visual C++ 2013):

    printf("Hello, %s!\n", "World");    // Lowercase s: narrow string
    printf("Hello, %S!\n", L"World");   // Capital S: wide string
    wprintf(L"Hello, %s!\n", L"World"); // Lowercase s: wide string
    wprintf(L"Hello, %S!\n", "World");  // Capital S: narrow string

This design has the advantage that the %c, %s, and %[] specifiers always map to an argument of the “natural” width for the function call. If you’re calling a narrow string formatted I/O function, they map to a narrow character or string argument; if you’re calling a wide string formatted I/O function, they map to a wide character or string argument. Among other things, this design made it easier to move from use of narrow strings to use of wide strings, via the macros in <tchar.h>.

These functions were later standardized in C99, and unfortunately, the standardized behavior was different. In the C99 specification, the %c, %s, and %[] specifiers always map to a narrow character or string argument. The l (lowercase L) length modifier must be used to format a wide character or string argument. So, per the C99 specification, the following calls are correct:

    printf("Hello, %s!\n", "World");     // s: narrow string
    printf("Hello, %ls!\n", L"World");   // ls: wide string
    wprintf(L"Hello, %ls!\n", L"World"); // ls: wide string
    wprintf(L"Hello, %s!\n", "World");   // s: narrow string

This design has the advantage that the specifiers always have the same meaning, regardless of which function is called. It has the disadvantage that it doesn’t match what was previously implemented in the Visual C++ CRT and it doesn’t work with the macros in <tchar.h>.

In the Visual Studio “14” CTP, we have flipped the meaning of the %c, %s, and %[] specifiers for the wide formatted I/O functions so that their behavior matches what is required by the C Standard. The meanings of the capital letter specifier equivalents (%C and %S) have also been changed, for consistency. In order to facilitate continued use of the <tchar.h> header we have also added a new length modifier, T, that means the argument is of the “natural” width, in effect giving the legacy behavior. So, for example, all of the following calls are correct:

    printf("Hello, %s!\n", "World");     // narrow string
    printf("Hello, %S!\n", L"World");    // wide string
    printf("Hello, %ls!\n", L"World");   // wide string
    printf("Hello, %Ts!\n", "World");    // natural width (narrow)
    wprintf(L"Hello, %s!\n", "World");   // narrow string
    wprintf(L"Hello, %S!\n", L"World");  // wide string
    wprintf(L"Hello, %ls!\n", L"World"); // wide string
    wprintf(L"Hello, %Ts!\n", L"World"); // natural width (wide)

This fairly small change has a very large effect on existing code. There are many millions of lines of code that expect the old behavior, and we recognize that we can’t just unconditionally break all of that code. While we encourage you to migrate code to use the conforming format strings mode, we are also providing a compile-time switch to enable you to revert the behavior back to the legacy mode. There are, therefore, two modes:

  • C99 Conformance Mode: In this mode, calls to the wide string formatted I/O functions will get the correct behavior as is required by C99. This mode is enabled by default.

  • Legacy Mode: In this mode, calls to the wide string formatted I/O functions will get the legacy behavior for these three format specifiers, as they were implemented in Visual Studio 2013 and previous versions. To enable this mode, predefine the _CRT_STDIO_LEGACY_WIDE_SPECIFIERS macro when building your program.

This mode is configurable per executable module, so each DLL or EXE can independently specify which mode it requires. This mode is configurable only at compile-time and cannot be changed dynamically. Because the mode is per executable module, all source files that are linked into a single executable module must be compiled for the same mode (i.e., with or without _CRT_STDIO_LEGACY_WIDE_SPECIFIERS defined). If you try to mix-and-match objects at link-time where some objects require the legacy mode and some require the conformance mode, you’ll get a link-time mismatch error.

If you have static libraries and you would like to enable those static libraries to be linked into modules that use either C99 conformance mode or legacy mode, you can do so by doing the following:

  1. Ensure that the code in your static library does not use or otherwise handle (e.g. via pass through) format strings whose behavior is different between the two modes, and

  2. Predefine the _CRT_STDIO_ARBITRARY_WIDE_SPECIFIERS macro when compiling the source files for your static library. This is not another mode; it simply allows those files to be linked into a module using either conformance or legacy mode.

<assert.h>

<fenv.h> and <float.h>

  • _clear87 and _clearfp: In Visual Studio 2013, the _clear87 and _clearfp functions in the CRT for x86 would fail to return the original floating point unit status if the CPU supported SSE2. This has been fixed.

  • fegetenv and fesetenv: In Visual Studio 2013, these functions were incorrectly implemented in the CRT for x86. There were two bugs: [1] a call to fegetenv would cause any pending, unmasked x87 floating point exceptions to be raised, and [2] the fegetenv function would mask all x87 floating point exceptions before returning and would thus return incorrect state. Because the fesetenv function uses the same underlying logic, it was impacted by these issues as well. Both of these issues have been fixed.

  • feholdexcept: In Visual Studio 2013, the feholdexcept function failed to mask all floating point exceptions before returning. This has been fixed.

  • FLT_ROUNDS: In Visual Studio 2013, the FLT_ROUNDS macro expanded to a constant expression, which was incorrect because the rounding mode is configurable at runtime, e.g. via a call to fesetround. The FLT_ROUNDS macro is now dynamic and correctly reflects the current rounding mode (Connect #806669).

  • /fp:strict Support: In Visual Studio 2013, if <fenv.h> was included in a C source file and that source file was compiled with /fp:strict, the source file would fail to compile due to the presence of floating point arithmetic in a static initializer in an inline function in <fenv.h>. This has been fixed (Connect #806624).

  • The following macros have been added to <float.h>: FLT_DECIMAL_DIG, FLT_HAS_SUBNORM, FLT_TRUE_MIN, DBL_DECIMAL_DIG, DBL_HAS_SUBNORM, DBL_TRUE_MIN, LDBL_DECIMAL_DIG, LDBL_HAS_SUBNORM, and LDBL_TRUE_MIN.
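
    As a hedged illustration of what FLT_DECIMAL_DIG is for: it is the number of significant decimal digits needed so that a float survives a trip through text and back. The helper name float_round_trips below is hypothetical, and the fallback definition assumes an IEEE 754 single-precision float for headers that predate C11.

```c
#include <float.h>
#include <stdio.h>
#include <stdlib.h>

#ifndef FLT_DECIMAL_DIG
#define FLT_DECIMAL_DIG 9 /* assumption: IEEE 754 binary32 float */
#endif

/* Formats value with FLT_DECIMAL_DIG significant digits, parses the text
   back with strtof, and returns 1 if the parsed value equals the original.
   (%.*e prints one digit before the decimal point plus `precision` after.) */
int float_round_trips(float value)
{
    char text[64];
    snprintf(text, sizeof text, "%.*e", FLT_DECIMAL_DIG - 1, value);
    return strtof(text, NULL) == value;
}
```

    Calling float_round_trips(0.1f) should return 1 on a conforming implementation; printing with fewer digits (for example FLT_DIG) does not give the same guarantee.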

<inttypes.h>

  • Format and conversion specifier macros can now be used with wide format strings: In Visual Studio 2013, the format and conversion specifier macros in <inttypes.h> were defined in such a way that they were unusable in wide format strings. This has been fixed (StackOverflow #21788652).
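
    A minimal sketch of the usage this fix enables, assuming a C99 compiler (where adjacent narrow and wide string literals concatenate into one wide literal). The function name format_int64_wide is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <wchar.h>

/* Formats a 64-bit integer into a wide buffer. PRId64 expands to a narrow
   string literal; placed next to wide literals it becomes part of a single
   wide format string. Returns the number of wide characters written, or a
   negative value on failure. */
int format_int64_wide(wchar_t *buffer, size_t count, int64_t value)
{
    return swprintf(buffer, count, L"value = %" PRId64 L"\n", value);
}
```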

<math.h>

<stdio.h> and <conio.h>

  • Conforming Wide Format Specifiers: See the first section of this article for a lengthy description of the changes that have been made to the %c, %s, and %[] (scanset) format and conversion specifiers.

  • The printf and scanf functions are now defined inline: In order to support the two wide string format and conversion specifier modes, the definitions of all of the printf and scanf functions have been moved inline into <stdio.h>, <conio.h>, and other CRT headers. This is a breaking change for any programs that declared these functions locally without including the appropriate CRT headers. The “fix” is to include the appropriate CRT headers.

  • Format and Conversion Specifier Enhancements: The %F format/conversion specifier is now supported. It is functionally equivalent to the %f format specifier, except that infinities and NaNs are formatted using capital letters.

    The following length modifiers are now supported:

    • hh: signed char or unsigned char
    • j: intmax_t or uintmax_t
    • t: ptrdiff_t
    • z: size_t
    • L: long double

    In previous versions, the implementation used to parse F and N as length modifiers. This behavior dated back to the age of segmented address spaces: these length modifiers were used to indicate far and near pointers, respectively, as in %Fp or %Ns. This behavior has been removed. If %F is encountered, it is now treated as the %F format specifier; if %N is encountered, it is now treated as an invalid parameter.
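
    The length modifiers listed above can be sketched as follows. The function name and the sample values are made up for illustration; the format string uses only standard C99 modifiers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats one value per length modifier: hh, j, t, z, and L.
   Returns the value returned by snprintf. */
int format_with_length_modifiers(char *buffer, size_t count)
{
    signed char small = -5;               /* hh */
    intmax_t big = INTMAX_C(9000000000);  /* j  */
    ptrdiff_t difference = -3;            /* t  */
    size_t size = 42;                     /* z  */
    long double precise = 1.5L;           /* L  */

    return snprintf(buffer, count, "%hhd %jd %td %zu %.1Lf",
                    small, big, difference, size, precise);
}
```

    On a conforming implementation the buffer should contain -5 9000000000 -3 42 1.5.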

  • Infinity and NaN Formatting: In previous versions, infinities and NaNs would be formatted using a set of Visual C++-specific sentinel strings:

    • Infinity: 1.#INF
    • Quiet NaN: 1.#QNAN
    • Signaling NaN: 1.#SNAN
    • Indefinite NaN: 1.#IND

    Any of these may have been prefixed by a sign and may have been formatted slightly differently depending on field width and precision (sometimes with unusual effects, e.g. printf("%.2f\n", INFINITY) would print 1.#J because the #INF would be “rounded” to a precision of 2 digits). C99 introduced new requirements on how infinities and NaNs are to be formatted. We have changed our implementation to conform to these new requirements. The new strings are as follows:

    • Infinity: inf
    • Quiet NaN: nan
    • Signaling NaN: nan(snan)
    • Indefinite NaN: nan(ind)

    Any of these may be prefixed by a sign. If a capital format specifier is used (e.g. %F instead of %f) then the strings are printed in capital letters (e.g. INF instead of inf), as is required (Connect #806668).

    The scanf functions have been modified to parse these new strings, so these strings will round-trip through printf and scanf.
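
    A short sketch of the conforming behavior described above, using only standard C99 calls. The helper names format_double and special_round_trips are hypothetical, and strtod is used for the parsing half rather than scanf to keep the example small.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Formats value with %f, or with %F when uppercase is nonzero. */
int format_double(char *buffer, size_t count, int uppercase, double value)
{
    return snprintf(buffer, count, uppercase ? "%F" : "%f", value);
}

/* Prints value with %f, parses the text back, and returns 1 if the parsed
   result is the same infinity (or is also a NaN). */
int special_round_trips(double value)
{
    char text[32];
    snprintf(text, sizeof text, "%f", value);
    double parsed = strtod(text, NULL);
    if (isnan(value))
        return isnan(parsed) != 0;
    return parsed == value;
}
```

    With these helpers, formatting INFINITY should give inf in lowercase mode and INF in uppercase mode, and both infinities and NaN should survive the round trip.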

  • Exponent Formatting: The %e and %E format specifiers format a floating point number as a decimal mantissa and exponent. The %g and %G format specifiers also format numbers in this form in some cases. In previous versions, the CRT would always generate strings with three-digit exponents. For example, printf("%e\n", 1.0) would print 1.000000e+000. This was incorrect: C requires that if the exponent is representable using only one or two digits, then only two digits are to be printed.

    In Visual Studio 2005 a global conformance switch was added: _set_output_format. A program could call this function with the argument _TWO_DIGIT_EXPONENT, to enable conforming exponent printing. This conformance switch has been removed and the default behavior has been changed to the standards-conforming exponent printing mode.
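
    To make the rule concrete, here is a small sketch (format_exponent is a hypothetical name) that shows the exponent widths a conforming implementation produces: at least two digits, and more only when the exponent needs them.

```c
#include <stdio.h>
#include <string.h>

/* Formats value in %e notation into buffer. */
int format_exponent(char *buffer, size_t count, double value)
{
    return snprintf(buffer, count, "%e", value);
}

/* Conforming results:
     1.0     -> 1.000000e+00
     1.5e-7  -> 1.500000e-07
     1e100   -> 1.000000e+100 */
```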

  • %A and %a Zero Padding: The %a and %A format specifiers format a floating point number as a hexadecimal mantissa and binary exponent. In previous versions, the printf functions would incorrectly zero-pad strings. For example, printf("%07.0a\n", 1.0) would print 00x1p+0, where it should print 0x01p+0. This has been fixed.

  • Floating Point Formatting and Parsing Correctness: We have implemented new floating point formatting and parsing algorithms to improve correctness. This change affects the printf and scanf families of functions, as well as functions like strtod.

    The old formatting algorithms would generate only a limited number of digits, then would fill the remaining decimal places with zero. This is usually good enough to generate strings that will round-trip back to the original floating point value, but it’s not great if you want the exact value (or the closest decimal representation thereof). The new formatting algorithms generate as many digits as are required to represent the value (or to fill the specified precision). As an example of the improvement, consider the results when printing a large power of two:

        printf("%.0f\n", pow(2.0, 80))
        Old:  1208925819614629200000000
        New:  1208925819614629174706176

    The old parsing algorithms would consider only up to 17 significant digits from the input string and would discard the rest of the digits. This is sufficient to generate a very close approximation of the value represented by the string, and the result is usually very close to the correctly rounded result. The new implementation considers all present digits and produces the correctly rounded result for all inputs (up to 768 digits in length). In addition, these functions now respect the rounding mode (controllable via fesetround).
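
    The following sketch exercises both halves of this change on a correctly rounding implementation. The helper names are hypothetical. The parsing input is chosen, as an illustration, so that its first 17 significant digits describe an exact tie between two doubles; only the digits beyond the 17th show that the value lies above the tie.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Formats value with no fractional digits, printing every integer digit. */
int format_exact(char *buffer, size_t count, double value)
{
    return snprintf(buffer, count, "%.0f", value);
}

/* Parses a decimal string to the nearest double. */
double parse_decimal(const char *text)
{
    return strtod(text, NULL);
}

/* 2^53 + 1 is exactly halfway between the doubles 9007199254740992 and
   9007199254740994. The trailing ...0001 pushes the input just above the
   tie, so a parser that looks at every digit rounds up to ...994, while a
   parser that stops after 17 digits sees a tie and rounds to even (...992). */
static const char above_tie[] = "9007199254740993.00000000000000000001";
```

    Here format_exact applied to 0x1p80 (2 to the 80th power) should produce 1208925819614629174706176, and parse_decimal(above_tie) should return 9007199254740994.0.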

  • Hexadecimal and Infinity/NaN Floating Point Parsing: The floating point parsing algorithms will now parse hexadecimal floating point strings (such as those generated by the %a and %A printf format specifiers) and all infinity and NaN strings that are generated by the printf functions, as described above.
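
    As a small sketch of why hexadecimal floating point text is useful: it represents the binary value exactly, so printing with %a and parsing with strtod should return the original value bit for bit. The helper name hex_round_trips is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Prints value with %a, parses the text back with strtod, and returns 1 if
   the parsed value equals the original. */
int hex_round_trips(double value)
{
    char text[64];
    snprintf(text, sizeof text, "%a", value);
    return strtod(text, NULL) == value;
}
```

    Hand-written hexadecimal strings parse the same way; for example, strtod("0x1.8p+1", NULL) yields 3.0 (1.5 times 2 to the first power).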

  • snprintf and vsnprintf Are Now Implemented: The C99 snprintf and vsnprintf functions have been implemented.
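
    For readers used to the older _snprintf, a brief sketch of the C99 contract may help: snprintf always null-terminates (when the buffer size is nonzero) and returns the length the full output would have had, which makes truncation easy to detect. The function name format_greeting is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Writes "Hello, <name>!" into buffer.
   Returns 1 if the whole string fit, 0 if it was truncated or an error
   occurred. The buffer is null-terminated in the truncation case too. */
int format_greeting(char *buffer, size_t count, const char *name)
{
    int needed = snprintf(buffer, count, "Hello, %s!", name);
    return needed >= 0 && (size_t)needed < count;
}
```

    With an 8-byte buffer and the name World, the function returns 0 and the buffer holds the first seven characters followed by the terminator.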

  • Format String Validation: In previous versions, the printf and scanf functions would silently accept many invalid format strings, sometimes with unusual effects. For example, %hlhlhld would be treated as %d. All invalid format strings are now treated as invalid parameters.

  • fopen Mode String Validation: In previous versions, the fopen family of functions silently accepted some invalid mode strings (e.g. r+b+). Invalid mode strings are now detected and treated as invalid parameters (Connect #792703).

  • fseek Use with Large Files: In previous versions, the fseek function was unable to seek to positions more than INT_MAX bytes from the beginning of a file. This has been fixed, but note that if you are working with large files, you should use the 64-bit I/O functions like _fseeki64. The fseek function can still only seek up to INT_MAX bytes forward at a time, as its offset parameter is of type long, which is 32 bits wide on Windows (Connect #810715).

  • tmpnam Generates Usable File Names: In previous versions, the tmpnam and tmpnam_s functions generated file names in the root of the drive (e.g. \sd3c.). These functions now generate usable file name paths in a temporary directory.

  • FILE Encapsulation: In previous versions, the FILE type was completely defined in <stdio.h>, so it was possible for user code to reach into a FILE and muck with its internals. We have refactored the stdio library to improve encapsulation of the library implementation details. As part of this, FILE as defined in <stdio.h> is now an opaque type and its members are inaccessible from outside of the CRT itself.

  • WEOF: The WEOF macro was improperly parenthesized, so expressions involving WEOF (e.g. sizeof WEOF) would not compile. This has been fixed (Connect #806655).

  • Unusable Port I/O Functions are Removed: Six functions have been removed from the CRT: _inp, _inpw, _inpd, _outp, _outpw, and _outpd. These functions were used for reading from and writing to I/O ports on x86; because they used privileged instructions, they have never worked in user-mode code on Windows NT-based operating systems.

  • Standard File Descriptor and Stream Initialization:  The initialization of the standard file descriptors and streams has been fixed for non-console apps.  In non-console programs, the file handles are initialized to -2 (Connect #785119).

<stdlib.h>, <malloc.h>, and <sys/stat.h>

  • strtod Et Al.: The strtod family of functions would return an incorrect end pointer via the out parameter if the number at the beginning of the input string was composed of more than 2^32-1 characters. This has been fixed.

  • strtof and wcstof: The strtof and wcstof functions failed to set errno to ERANGE when the value was not representable as a float. This has been fixed. (Note that this error was specific to these two functions; the strtod, wcstod, strtold, and wcstold functions were unaffected.)
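
    A minimal sketch of the range check that this fix makes reliable. The wrapper name parse_float and its return convention are hypothetical; the errno protocol is the standard one.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses text as a float and stores it in *result.
   Returns 0 on success, or -1 if strtof reported that the value is out of
   range for float (errno == ERANGE). */
int parse_float(const char *text, float *result)
{
    errno = 0; /* strtof only sets errno on failure, so clear it first */
    *result = strtof(text, NULL);
    return errno == ERANGE ? -1 : 0;
}
```

    For example, parsing 1e100 should fail with -1, because the largest finite float is roughly 3.4e38.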

  • _stat Functions: In previous versions, the _stat functions might read one character past the end of the path string. This has been fixed (Connect #796796).

  • Aligned Allocation Functions: In previous versions, the aligned allocation functions (_aligned_malloc, _aligned_offset_malloc, etc.) would silently accept requests for a block with an alignment of 0. The documentation requires that the requested alignment be a power of two, which zero is not. This has been fixed, and a requested alignment of 0 is now treated as an invalid parameter (Connect #809604).

  • The _heapadd, _heapset, and _heapused functions have been removed. These functions have been nonfunctional since the CRT was updated to use the Windows heap.

  • The smalheap link option has been removed.

<time.h>

  • clock: In previous versions, the clock function was implemented using the Windows API GetSystemTimeAsFileTime. With this implementation, the clock function was sensitive to the system time, and was thus not necessarily monotonic. The clock function has been reimplemented in terms of QueryPerformanceCounter and is now monotonic.

    Several customers have noted that as specified by C, the clock function should return the “processor time used” by the process, not the wall clock time elapsed since the process was started. We continue to implement clock as returning wall clock time elapsed, as there is a substantial amount of software written for Windows that expects this behavior.

  • fstat and _utime: In previous versions, the _stat, fstat, and _utime functions handled daylight savings time incorrectly. Prior to Visual Studio 2013, all of these functions had a subtle daylight savings time bug: during daylight time, they incorrectly adjusted times that fell in standard time as if they were in daylight time. It appears that this went unnoticed for many years because while the implementations were incorrect, they were all consistently incorrect.

    In Visual Studio 2013, the bug in the _stat family of functions was fixed, but the similar bugs in the fstat and _utime families of functions were not fixed. This exposed the issue in those functions, because they started handling daylight savings time differently from the _stat functions. The fstat and _utime families of functions have now been fixed, so all of these functions now handle daylight savings time correctly and consistently (Connect #811534).

  • asctime: In previous versions, the asctime function would pad single-digit days with a leading zero, e.g. Fri Jun 06 08:00:00 2014. The specification requires that such days be padded with a leading space, e.g. Fri Jun _6 08:00:00 2014 (I’ve used an underscore to mark the padding space). This has been fixed.
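
    The date from the example above can be reproduced with a short sketch. The function name is hypothetical, and the struct tm fields are filled in by hand rather than computed.

```c
#include <string.h>
#include <time.h>

/* Builds Friday, June 6, 2014, 08:00:00 by hand and formats it with asctime.
   The returned pointer refers to asctime's internal static buffer. */
const char *format_june_6_2014(void)
{
    struct tm when = {0};
    when.tm_year = 2014 - 1900; /* years since 1900 */
    when.tm_mon = 5;            /* months since January: June */
    when.tm_mday = 6;
    when.tm_hour = 8;
    when.tm_wday = 5;           /* days since Sunday: Friday */
    return asctime(&when);
}
```

    On a conforming implementation the result is the 25-character string "Fri Jun  6 08:00:00 2014" (two spaces before the 6) followed by a newline.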

  • time and ftime: The time and ftime functions will now use the GetSystemTimePreciseAsFileTime API when it is available (Windows 8 and above) for improved precision.

  • strftime and wcsftime: The strftime and wcsftime functions now support the %C, %D, %e, %F, %g, %G, %h, %n, %r, %R, %t, %T, %u, and %V format specifiers. Additionally, the E and O modifiers are parsed but ignored.

    The %c format specifier is specified as producing an “appropriate date and time representation” for the current locale. In the C locale, this representation is required to be the same as %a %b %e %T %Y. This is the same form as is produced by asctime. In previous versions, the %c format specifier incorrectly formatted times using a MM/DD/YY HH:MM:SS representation. This has been fixed.
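
    A brief sketch of a few of the newly supported specifiers, assuming the program runs in the default C locale. The helper names are hypothetical.

```c
#include <string.h>
#include <time.h>

/* %F is equivalent to %Y-%m-%d and %T to %H:%M:%S. */
size_t format_iso(char *buffer, size_t count, const struct tm *when)
{
    return strftime(buffer, count, "%F %T", when);
}

/* %c in the C locale is equivalent to "%a %b %e %T %Y"; %e pads a
   single-digit day of the month with a space. */
size_t format_locale_default(char *buffer, size_t count, const struct tm *when)
{
    return strftime(buffer, count, "%c", when);
}
```

    For Friday, June 6, 2014 at 08:00:00, format_iso should produce 2014-06-06 08:00:00 and format_locale_default should produce the same text as asctime, without the trailing newline.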

  • C11 timespec and timespec_get: <time.h> now defines the C11 timespec type and the timespec_get function. In addition, the TIME_UTC macro, for use with the timespec_get function, is now defined.
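
    A minimal usage sketch for the new C11 function, assuming a C11 library. The wrapper name current_utc_time is hypothetical; timespec_get returns its base argument on success and zero on failure.

```c
#include <time.h>

/* Fills *now with the current calendar time based on TIME_UTC.
   Returns 1 on success and 0 on failure. */
int current_utc_time(struct timespec *now)
{
    return timespec_get(now, TIME_UTC) == TIME_UTC;
}
```

    On success, now->tv_sec holds whole seconds and now->tv_nsec holds the nanosecond remainder in the range 0 to 999999999.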

  • CLOCKS_PER_SEC: The CLOCKS_PER_SEC macro now expands to an integer of type clock_t, as required by C.

operator new T[N]

  • In previous versions, operator new T[N] would fail to call constructors for elements in the array if N was greater than 2^32-1. This has been fixed (Connect #802400).

James McNellis (james.mcnellis@microsoft.com)
Senior Software Development Engineer, Visual C++ Libraries