C# 11 Preview Updates – Raw string literals, UTF-8 and more!

Kathleen Dollard

Features for C# 11 are coming along nicely! You can check these features out by downloading Visual Studio 17.2 Preview 3 or .NET 7 Preview 3 for other editors. You can find more about C# 11 features that appeared earlier in What’s new in C# 11 and Early peek at C# 11 features and you can follow the progress of C# 11 on the Feature Status page. You can find out about other .NET 7 Preview 3 features in this .NET Blog post and more about Visual Studio 17.2 in the release notes.

We evolve C# to improve your development productivity, the resiliency of your applications in production, performance and support for new features. The C# team works on both the performance of your application in production, and how the compiler performance affects your development. Features in this post include:

This post also explains why we removed parameter null-checking from C# 11 and are adding a warning for lowercase type names.

Raw string literals

If you work with strings literal that contain quotes or embedded language strings like JSON, XML, HTML, SQL, Regex and others, raw literal strings may be your favorite feature of C# 11. Previously if you copied a literal string with quotes into a C# literal, the string ended at the first double quote with compiler errors until you escaped each one. Similarly, if you copied text with curly braces into an interpolated string literal, each curly bracket was interpreted as the beginning of a nested code expression unless you escape it, generally by doubling the curly bracket.

Raw string literals have no escaping. For example, a backslash is output as a backslash, and \t is output as the backslash and a t, not as the tab character.

Raw string literals start and end with at least three double quotes ("""..."""). Within these double quotes, single " are considered content and included in the string. Any number of double quotes less than the number that opened the raw string literal are treated as content. So, in the common case of three double quotes opening the raw string literals, two double quotes appearing together would just be content. If you need to output a sequence of three or more double quotes, just open and close the raw string literal with at least one more quote than that sequence.

Raw string literals can be interpolated by preceding them with a $. The number of $ that prefixes the string is the number of curly brackets that are required to indicate a nested code expression. This means that a $ behaves like the existing string interpolation – a single set of curly brackets indicate nested code. If a raw string literal is prefixed with $$, a single curly bracket is treated as content and it takes two curly brackets to indicate nested code. Just like with quotes, you can add more $ to allow more curly brackets to be treated as content. For example:

JSON example of raw string literal

Raw string literals also have new behavior around automatically determining indentation of the content based on leading whitespace. To learn more about this and to see more examples on this feature, check out the docs article Raw String Literals.

This feature will make it much easier to work with literals that contain certain characters. You can now copy code into or from a literal string without being hindered by adding or removing escape sequences.

Special thanks to jnm2 (Joseph Musser) for his work on the design and implementation of raw string literals.

UTF-8 String Literals

UTF-8 is used in many scenarios, particularly in web scenarios. Prior to C# 11, programmers had to either translate UTF-8 into hexadecimal – which led to verbose, unreadable, error prone code – or encode string literals at runtime.

C# 11 allows converting string literals containing only UTF-8 characters to their byte representation. This is done at compile-time, so the bytes are ready to use without additional runtime cost. So you can write code like:

byte[] array = "hello";             // new byte[] { 0x68, 0x65, 0x6c, 0x6c, 0x6f }
Span<byte> span = "dog";            // new byte[] { 0x64, 0x6f, 0x67 }
ReadOnlySpan<byte> span = "cat";    // new byte[] { 0x63, 0x61, 0x74 }

There are ongoing discussions about details such as whether a type suffix is required and what natural type that would imply. If you expect to use UTF-8 string literals, we would really like your feedback and you can see the UTF-8 String Literal proposal and the links contained in it for more information.

This feature brings a welcome simplification to everyone currently building byte arrays to represent UTF-8. If you are doing this, you will probably want to convert your code to use it after C# 11 releases. If you are not using UTF-8 string literals you can ignore this feature. For ASP.NET users, your response is encoding to UTF-8 from strings automatically, so you can ignore this feature.

Checked user-defined operators

One of the major motivations for the static abstract members in interfaces feature of C# 11 is the ability to support generic math. .NET developers can write algorithms that rely on interfaces that include static abstract members as the generic constraint. One such interface is INumber<TSelf> which provides access to APIs such as Max, Min, Parse, and even operators such as +, -, *, and /, as well as user defined conversions.

User-defined operators respect the arithmetic overflow and underflow checking context of the calling code, controlled via the <CheckForOverflowUnderflow> project property or the checked/unchecked regions and operators. Check out the language reference for about checked and unchecked behavior for arithmetic operators. Prior to C# 11, a user-defined operator was unaware of the context in which it was used.

C# 11 adds the ability to declare certain operators as checked, identified with the checked modifier. Operators that do not have this modifier will be unchecked when paired with a checked operator. The compiler will select the right operator to use based on the context of the calling code. The operators that can support checked versions are the ++, -- and - unary operators and the +, -, *, and / binary operators.

The distinction between checked and unchecked is the context in which they are used. There is no requirement that checked operators throw if the bounds of the type are exceeded or that unchecked operators not throw, but this is the behavior users expect. For example, for integer types MAX_VALUE+1 is MIN_VALUE in the unchecked context and throws an exception in the checked context. Some types, such as floating point numbers, do not overflow and therefore do not need separate checked and unchecked operators.

This feature is important to developers creating user-defined operators that operate on types where arithmetic overflow is a valid concept. It will allow new user-defined operators to respect the context in which the operator is used. We anticipate that only a small number of developers will use this feature directly, but the impact of their implementations will make the entire ecosystem more reliable and predictable.

Auto-default structs

Note: This feature is planned for 17.3, not 17.2. It was mistakenly included in this post. i am not removing it to avoid confusion about our intention regarding this feature. Look for it in a future preview!

In C# 10 and earlier, you needed to initialize all fields of a struct by initializing fields and auto-properties or setting them in the constructors. This can be awkward, particularly with the expected introduction of the field keyword and semi-auto properties in a later C# 11 preview. If you did not set these values, you received a compiler error. If we have sufficient information to provide the error, perhaps we should just set these values to default for you!

Starting with this preview, the compiler does exactly that. It initializes any fields and auto-properties that are not set based on definite assignment rules, and assigns the default value to them. If you do not want this behavior, there is a warning you can turn on.

This feature simplifies initialization for anyone using structs that include explicit constructors. This is likely to feel like the way structs with initializers always should have worked, and so you may take advantage of this feature without even thinking about it. If you are explicitly initializing fields to their default value in response to the previous compiler errors, you can remove that code.

Pattern matching with spans

Starting with this preview, you can pattern match a Span<char> or a ReadonlySpan<char> with a string literal. This code now works:

static bool IsABC(Span<char> s)
{
    return s switch { 
        "ABC" => true, 
        _ => false };
}

The input type must be statically known to be a Span<char> or a ReadonlySpan<char>. Also, the compiler reports an error if you match a Span<char> or a ReadonlySpan<char> to a null constant.

This feature will allow Span<char> or ReadonlySpan<char> to participate as patterns in switch statements and switch expressions for matching string literals. If you are not using Span<char> and ReadonlySpan<char> you can ignore this feature.

Special thanks to YairHalberstadt for implementing this feature.

Use a cached delegate for method group conversion

This feature will improve runtime performance by caching static method groups, rather than creating fresh delegate instances. This is to improve your application’s performance, particularly for ASP.NET. You will get the benefit of this feature with no effort on your part.

Special thanks to pawchen for implementing this feature

Remove parameter null-checking from C# 11

We previewed parameter null-checking as early as possible because we anticipated feedback. This feature allows !! on the end of a parameter name to provide parameter null checking before the method begins execution. We included this feature early in C# 11 to maximize feedback, which we gathered from GitHub comments, MVPs, social media, a conference audience, individual conversations with users, and the C# design team’s ongoing reflection. We received a wide range of feedback on this feature, and we appreciate all of it.

The feedback and the wide range of insight we gained from this feedback led us to reconsider this as a C# 11 feature. We do not have sufficient confidence that this is the right feature design for C# and are removing it from C# 11. We may return to this area again at a later date.

While there are several valid ways you can do null check on a single line, if you are using .NET 6 we recommend using ArgumentNullException.ThrowIfNull method:

public static void M(string myString)
{
    ArgumentNullException.ThrowIfNull(myString);
    // method 
}

One of the benefits of using the ThrowIfNull method is it uses CallerArgumentExpression to include the parameter name in the exception message automatically:

System.ArgumentNullException: 'Value cannot be null. (Parameter 'myString')'

Warning wave: Warnings on lowercase type names

C# 11 introduces a Warning Wave 7 that includes a warning for any type that is declared with all lowercase ASCII characters. This has been a common stylistic rule in the C# ecosystem for years. We are making it a warning because C# needs to occasionally introduce new keywords in order to evolve. These keywords will be lowercase and may conflict with your type’s name, if it is lowercase. We introduced this warning so you can avoid a possible future breaking change.

You can find out more about this change at Warning on lowercase type names in C# 11. Warning waves allow new warnings in C# in a manner that allows you to delay adoption if the warning causes issues you cannot currently resolve.

This warning is expected to affect very few people. But if you encounter it, we recommend updating your type name, or prefixing usages of it with @, such as @lower.

Closing

Please download Visual Studio 17.2 Preview 3 or .NET 7 Preview 3, try out the new features, and tell us what you think in the Discussions section of the CSharpLang repo.

45 comments

Discussion is closed. Login to edit/delete existing comments.

  • Álvaro García 0

    Unfortunately in my first attempt I tried to output “Hello world!” and it didn’t work because C# assumed four double quotes. 😀
    But, except for my bad first candidate, raw string literals works nicely.

  • Markus Hjärne 0

    Thanks for listening to us ordinary everyday users that loves C# regarding the null parameter checking feature. Didn’t mind the feature, but the syntax. Like many of the other features. Keep up the good work of evolving the language!

  • Vincent Thorn 0

    three double quotes

    facepalm Unbelievable clowns!! Guys, are you seriously make this idiocity in compiler?? Are you aware that it’s most stupid and inconvenient solution?

    • Karl von Laudermann 0

      Tripled double-quotes are already used to indicate verbatim strings in other languages, such as Python, Kotlin, Julia, and even F#. So there’s an established precedent for this.

  • Payton Byrd 0

    I would recommend whitelisting potential new keywords and popping a warning when they are used as an identifier. Any keyword in an existing Von Nueman language should be in the list.

    • anonymous 0

      this comment has been deleted.

  • André Slupik 0

    Still hoping C# will one day have discriminated unions and primary constructors in classes.

  • Someone 0

    Finally interesting update and big thanks for removing this ugly null check !! “feature”.

  • hitesh davey 0

    Hi Kathleen,

    The raw string literal is a very useful addition to the language!

    But the majority of the application deals with databases, json & xml, and the project source code has many SQL queries embedded as string literals.

    So apart from just considering anything between this $$”””…”””; as plain text. I would suggest taking this idea to the next level for specific use cases by prefixing some sort of token keywords like sql, json, xml so that we can add some meaning to the string literal, and VS can help us show intelligence in writing the raw string literals when the cursor is within the scope of the raw string literal. for example…

    var myQuery = sql$$”””
    select * from employee_master
    where age is between 30 and 40 and gender ='{emp.Gender}’
    “””;

    So when I am inside the SQL raw string literal scope, VS can help us to recognize complex SQL syntax by Syntax highlighting, autocompletion, and referencing all other db-specific coding assistance.

    Currently, all the SQL queries string looks like a normal string which is not much of a help visually as well as during the development stage unless we check the SQL syntax by first writing it on SQL editor and copy-paste the same in source code. This is tedious. This new raw string literal enhancements and VS can help us in minimizing this back and forth exercise.

    Likewise u can add
    json””””…””” (for json script)
    xml”””…””” (for xml script)

    • Markus Hjärne 0

      This have also been discussed at the LDM.

      • hitesh davey 0

        Oh, I see, I didnt know abt this. Thanks for the LDM reference link!

  • Eli Black 0

    It seems that

    ArgumentNullException.ThrowIfNull()

    isn’t yet available as a quickfix action in Visual Studio, so I’ve filed a bug for it: https://github.com/dotnet/roslyn/issues/61181 🙂

  • Eric Lynch 0

    Any chance of having ArgumentNullException.ThrowIfNull (and any similar methods added later) return the parameter that they check? This would allow some space savings when initializing properties or local variables. For example…

    MyProperty = ArgumentNullException.ThrowIfNull(myVariable);

    versus

    ArgumentNullException.ThrowIfNull(myVariable);
    MyProperty = myVariable.

    Adding a return value won’t hurt the folks that ignore it, but it will help in the above case and also help with chaining multiple checks. For example, if similar methods are added to other exceptions…

    MyProperty = ArgumentException.ThrowIfEmpty(ArgumentNullException.ThrowIfNull(myVariable));

  • Austin Wagner 0

    Using the VS 17.3 preview 1.1 I’m seeing an issue with overload resolution that seems to be due to the new UTF-8 string literals. We have an overloaded function with these signatures:

    void Log(string message, params object[] args);
    void Log(byte[] data, int offset, int length);

    If we have a call to Log with a string literal and 2 ints, we get a compiler error stating that the call is ambiguous. This issue makes using these logging functions less ergonomic since we need to be more explicit about the parameters.

    logger.Log("i1: {0}, i2: {1}", new object[] { i1, i2 });

    And it’s easy to make a mistake doing that because

    logger.Log("i1: {0}, i2: {1}", new[] { i1, i2 });

    actually means

    logger.Log("i1: {0}, i2: {1}", new object[] { new int[] { i1, i2 } });

    Personally I think it would make sense to have a project-wide setting that determines if UTF-8 string literals are implicit and if that’s turned off then require a u8 suffix to make it explicit. A complementary u16 suffix may also make sense to have available as an override when implicit UTF-8 string literals are enabled.

Feedback usabilla icon