This post is the second in a series covering various refactoring scenarios that explore C# 12 features. In this post, we’ll look at how you can refactor your code using collection expressions, and we’ll learn about collection initializers, various expression usages, supported collection target types, and the spread syntax. Here’s how the series is shaping up:
- Refactor your C# code with primary constructors
- Refactor your C# code with collection expressions (this post)
- Refactor your C# code by aliasing any type
- Refactor your C# code to use default lambda parameters
These features continue our journey to make our code more readable and maintainable, and these are considered “Everyday C#” features that developers should know.
Collection Expressions 🎨
C# 12 introduced collection expressions that offer a simple and consistent syntax across many different collection types. When initializing a collection with a collection expression, the compiler generates code that is functionally equivalent to using a collection initializer. The feature emphasizes consistency, while allowing for the compiler to optimize the lowered C#. Of course, every team decides what new features to adopt, and you can experiment and introduce this new syntax if you like it, since all of the previous ways to initialize collections will continue to work.
With collection expressions, elements appear as an inline sequence between an opening [ and a closing ] bracket. Read on to hear more about how collection expressions work.
Initialization 🌱
C# provides many syntaxes for initializing different collections. Collection expressions can replace all of these, so let’s start with a look at the different ways you can initialize an array of integers:
var numbers1 = new int[3] { 1, 2, 3 };
var numbers2 = new int[] { 1, 2, 3 };
var numbers3 = new[] { 1, 2, 3 };
int[] numbers4 = { 1, 2, 3 };
All four versions are functionally equivalent, and the compiler generates identical code for each version. The last example is similar to the new collection expression syntax. If you squint a bit, you could imagine the curly braces { and } as square brackets [ and ], and then you’d be reading the new collection expression syntax. Collection expressions don’t use curly braces. This is to avoid ambiguity with existing syntax, especially { } to indicate any not-null value in patterns.
The last example is the only one to declare the type explicitly, instead of relying on var. The following example creates a List<char>:
List<char> david = [ 'D', 'a', 'v', 'i', 'd' ];
Again, collection expressions cannot be used with the var keyword. You must declare the type because a collection expression doesn’t currently have a natural type and can be converted to a wide variety of collection types. Supporting assignment to var is still under consideration, but the team has not settled on what the natural type should be. In other words, the C# compiler reports CS9176: There is no target type for the collection expression, when you write the following code:
// Error CS9176: There is no target type for the collection expression
var collection = [1, 2, 3];
You might be asking yourself, “with all these different approaches to initializing collections, why would I use the new collection expression syntax?” The answer is that with collection expressions, you can use the same syntax to express collections in a consistent way. This can help to make your code more readable and maintainable. We’ll explore more advantages in the coming sections.
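As a minimal sketch of that consistency, the following snippet uses the same bracket syntax for several target types. The variable names are illustrative only, and the snippet assumes a C# 12 compiler targeting .NET 8 or later.

```csharp
using System;
using System.Collections.Generic;

// One syntax, several target types. The declared type on the left drives the conversion.
int[] array = [1, 2, 3];
List<int> list = [1, 2, 3];
HashSet<int> set = [1, 2, 3, 3];          // the duplicate 3 is ignored, as with any HashSet<T>
IReadOnlyList<int> readOnly = [1, 2, 3];
Span<int> span = [1, 2, 3];

Console.WriteLine(array.Length);   // 3
Console.WriteLine(list.Count);     // 3
Console.WriteLine(set.Count);      // 3
Console.WriteLine(readOnly.Count); // 3
Console.WriteLine(span.Length);    // 3
```

Only the declared type changes from line to line; the right-hand side keeps the same shape.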
Collection expression variations 🎭
You can express that a collection is empty, using the following syntax:
int[] emptyCollection = [];
The empty collection expression is a great replacement for code that otherwise used the new keyword, as the compiler optimizes it to avoid allocating memory for some collection types. For example, when the collection type is an array T[], the compiler generates a call to Array.Empty<T>(), which is more efficient than new int[] { }. Another shortcut is that the compiler can use the number of elements in the collection expression to set the initial capacity, such as new List<int>(2) for List<int> x = [1, 2];.
Collection expressions also allow you to assign to interfaces without stating an explicit type. The compiler determines the concrete type to use for interface targets such as IEnumerable<T>, IReadOnlyList<T>, and IReadOnlyCollection<T>. If the actual type used is important, you’ll want to state it because this may change if more efficient types become available. Likewise, in situations where the compiler cannot generate more efficient code, for example when the collection type is a List<T>, the compiler generates a new List<T>(), which is equivalent to what you would have written by hand.
The advantages of using the empty collection expression are threefold:
- It provides a consistent means of initializing all collections, regardless of their target type.
- It allows the compiler to generate efficient code.
- It’s less code to write. For example, instead of writing Array.Empty<T>() or Enumerable.Empty<T>(), you can simply write [].
A few more details about the efficient generated code: using the [] syntax generates known IL. This allows the runtime to optimize by reusing the storage for Array.Empty<T> (for each T), or even to inline the code more aggressively.
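The points above about empty collections can be tried with a short sketch like the following. The variable names are hypothetical. The reference comparison and the runtime type name are printed rather than assumed, because the exact code the compiler emits is an implementation detail that may differ between versions.

```csharp
using System;
using System.Collections.Generic;

int[] emptyArray = [];
List<string> emptyList = [];
IEnumerable<int> emptySequence = [];

Console.WriteLine(emptyArray.Length); // 0
Console.WriteLine(emptyList.Count);   // 0

// The article states that an empty array target is lowered to Array.Empty<T>().
// If that holds for your compiler, both references are the same cached instance.
Console.WriteLine(ReferenceEquals(emptyArray, Array.Empty<int>()));

// For an interface target the compiler picks the concrete type, so inspect it
// only out of curiosity and do not depend on it.
Console.WriteLine(emptySequence.GetType().Name);
```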
Empty collections serve their purpose, but you may need a collection that has some initial values. You can initialize a collection with a single element, using the following syntax:
string[] singleElementCollection =
[
"one value in a collection"
];
Initializing a single element collection is similar to initializing a collection with more than a single element. You can initialize a collection with multiple elements by adding other literal values, using the following syntax:
int[] multipleElementCollection = [1, 2, 3 /* any number of elements */];
A bit of history
Early proposals of the feature included the phrase “collection literals”, and you’ve probably heard that term in relation to this feature. The term seems obvious and logical, especially considering the previous few examples, where all of the elements were expressed as literal values. But you’re not limited to using literals. In fact, you can just as easily initialize a collection with variables, so long as the types correspond (or, when they do not, there’s an implicit conversion available).
Let’s look at another code sample, this time one that uses a spread element to include the elements of another collection, using the following syntax:
int[] oneTwoThree = [1, 2, 3];
int[] fourFiveSix = [4, 5, 6];
int[] all = [.. fourFiveSix, 100, .. oneTwoThree];
Console.WriteLine(string.Join(", ", all));
Console.WriteLine($"Length: {all.Length}");
// Outputs:
// 4, 5, 6, 100, 1, 2, 3
// Length: 7
The spread element is a powerful feature that lets you include the elements of another collection in the current collection, which makes it a concise way to combine collections. The expression in a spread element must be enumerable (foreach-able). For more information, see the Spread ✨ section.
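To illustrate both points, that elements need not be literals and that any foreach-able expression can be spread, here is a small sketch. The CountDown helper and the variable names are hypothetical, introduced only for this example.

```csharp
using System;
using System.Collections.Generic;

int first = 1;
List<int> fromList = [2, 3];

// Hypothetical iterator; it only needs to be enumerable to be spread.
static IEnumerable<int> CountDown(int from)
{
    for (int i = from; i > 0; i--)
    {
        yield return i;
    }
}

// Elements can be variables or expressions. Each int converts implicitly
// to the long element type of the target array.
long[] combined = [first, .. fromList, first + 9, .. CountDown(3)];

Console.WriteLine(string.Join(", ", combined));
// Outputs:
// 1, 2, 3, 10, 3, 2, 1
```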
Supported collection types 🎯
There are many target types that collection expressions can be used with. The feature recognizes the “shape” of a type that represents a collection. Therefore, most collections you’re familiar with are supported out-of-the-box. For types that don’t match that “shape” (mostly readonly collections), there are attributes you can apply to describe the builder pattern. The collection types in the BCL that needed the attribute and builder pattern approach have already been updated.
It’s unlikely that you’ll ever need to think about how target types are selected, but if you are curious about the rules see the C# Language Reference: Collection expressions—conversions.
Collection expressions don’t yet support dictionaries. You can find a proposal to extend the feature in C# Feature Proposal: Dictionary expressions.
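As a rough illustration of the range of supported targets, the sketch below mixes mutable BCL collections with immutable ones. It assumes .NET 8 or later, where the System.Collections.Immutable types have been updated to work with collection expressions. The variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// Mutable collections that match the expected "shape".
SortedSet<int> sorted = [3, 1, 2];
ICollection<string> names = ["Ada", "Grace"];

// Immutable collections, supported through the builder approach in the BCL.
ImmutableArray<int> immutableArray = [1, 2, 3];
ImmutableList<string> immutableList = ["a", "b"];

Console.WriteLine(string.Join(", ", sorted)); // 1, 2, 3
Console.WriteLine(names.Count);               // 2
Console.WriteLine(immutableArray.Length);     // 3
Console.WriteLine(immutableList.Count);       // 2
```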
Refactoring scenarios 🛠️
Collection expressions can be useful in many scenarios, such as:
- Initializing empty collections that declare non-nullable collection types:
  - fields.
  - properties.
  - local variables.
  - method parameters.
  - return values.
  - a coalescing expression as the final fallthrough to safely avoid exceptions.
- Passing arguments to methods that expect collection type parameters.
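A few of the scenarios listed above can be sketched as follows. ParseTags and Total are hypothetical helpers written only for this illustration. They are not part of any library.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Return values: an empty or a populated list without repeating the type name.
static List<string> ParseTags(string? csv)
{
    if (string.IsNullOrEmpty(csv))
    {
        return [];
    }

    return [.. csv.Split(',')];
}

// A method that expects a collection type parameter.
static int Total(IEnumerable<int> values) => values.Sum();

// Coalescing expression: fall back to an empty list instead of risking a null reference.
List<int>? maybeScores = null;
List<int> scores = maybeScores ?? [];

// Passing a collection expression directly as an argument.
int total = Total([1, 2, 3]);

Console.WriteLine(ParseTags(null).Count);  // 0
Console.WriteLine(ParseTags("a,b").Count); // 2
Console.WriteLine(scores.Count);           // 0
Console.WriteLine(total);                  // 6
```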
Let’s use this section to explore some sample usage scenarios, and consider potential refactoring opportunities. When you define a class or struct that contains fields and/or properties with non-nullable collection types, you can initialize them with collection expressions. For example, consider the following ResultRegistry class:
namespace Collection.Expressions;
public sealed class ResultRegistry
{
    private readonly HashSet<Result> _results = new HashSet<Result>();
    public Guid RegisterResult(Result result)
    {
        _ = _results.Add(result);
        return result.Id;
    }
    public void RemoveFromRegistry(Guid id)
    {
        _ = _results.RemoveWhere(x => x.Id == id);
    }
}
public record class Result(
    bool IsSuccess,
    string? ErrorMessage)
{
    public Guid Id { get; } = Guid.NewGuid();
}
In the preceding code, the result registry class contains a private _results field that is initialized with a new HashSet<Result>() constructor expression. In your IDE of choice (that supports these refactoring features), right-click on the new keyword, select Quick Actions and Refactorings... (or press Ctrl + .), and choose Collection initialization can be simplified.
The code is updated to use the collection expression syntax, as shown in the following code:
private readonly HashSet<Result> _results = [];
The previous code instantiated the HashSet<Result> with the new HashSet<Result>() constructor expression. In this case, [] is equivalent.
Spread ✨
Many popular programming languages, such as Python and JavaScript/TypeScript, provide their own variation of the spread syntax, which serves as a succinct way to work with collections. In C#, the spread element is the syntax used to express the concatenation of various collections into a single collection.
Proper terminology
The spread element is often confused with the term “spread operator”. In C#, there’s no such thing as a “spread operator”. The .. expression isn’t an operator; it’s part of the spread element syntax. By definition, this syntax doesn’t align with that of an operator, as it doesn’t perform an operation on its operands. For example, the .. expression already exists with the slice pattern for ranges and it’s also found in list patterns.
So what exactly is a spread element? It takes the individual values from the collection being “spread” and places them in the destination collection at that position. The spread element functionality also comes with a refactoring opportunity. If you have code that calls .ToList or .ToArray, or you want to use eager evaluation, your IDE might suggest using the spread element syntax instead. For example, consider the following code:
namespace Collection.Expressions;
public static class StringExtensions
{
    public static List<Query> QueryStringToList(this string queryString)
    {
        List<Query> queryList = (
            from queryPart in queryString.Split('&')
            let keyValue = queryPart.Split('=')
            where keyValue.Length is 2
            select new Query(keyValue[0], keyValue[1])
        )
        .ToList();
        return queryList;
    }
}
public record class Query(string Name, string Value);
The preceding code could be refactored to use the spread element syntax. Consider the following refactored version, which removes the .ToList method call and, as a bonus, uses an expression-bodied method:
public static class StringExtensions
{
    public static List<Query> QueryStringToList(this string queryString) =>
    [
        .. from queryPart in queryString.Split('&')
           let keyValue = queryPart.Split('=')
           where keyValue.Length is 2
           select new Query(keyValue[0], keyValue[1])
    ];
}
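The eager evaluation mentioned earlier is worth seeing in isolation. The following sketch, with illustrative variable names, suggests that spreading a lazy LINQ query captures its results at that point, much as a .ToList call would, while the query itself keeps re-evaluating against its source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

List<int> source = [1, 2, 3];

// A LINQ query is lazy: it runs each time it is enumerated.
IEnumerable<int> greaterThanOne = source.Where(n => n > 1);

// Spreading the query enumerates it once, right here.
List<int> snapshot = [.. greaterThanOne];

source.Add(4);

Console.WriteLine(snapshot.Count);         // 2, captured before the Add
Console.WriteLine(greaterThanOne.Count()); // 3, re-evaluated against the updated source
```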
Span<T> and ReadOnlySpan<T> support 📏
Collection expressions support the Span<T> and ReadOnlySpan<T> types, which are used to represent a contiguous region of arbitrary memory. You benefit from the performance improvements they offer, even if you don’t use them directly in your code. Collection expressions allow the runtime to offer optimizations, especially where overloads using spans can be selected when collection expressions are used as arguments.
You can also assign directly to a span, if your application uses spans:
Span<int> numbers = [1, 2, 3, 4, 5];
ReadOnlySpan<char> name = ['D', 'a', 'v', 'i', 'd'];
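As a small follow-up sketch, a span created from a collection expression behaves like any other span. The snippet sorts one in place and turns a span of characters into a string, using only the standard Sort extension, ToArray, and ToString members. The variable names are illustrative.

```csharp
using System;

Span<int> numbers = [3, 1, 2];
numbers.Sort();                   // in-place sort through the MemoryExtensions.Sort extension
int[] sorted = numbers.ToArray(); // copy out so the values can be joined for printing

ReadOnlySpan<char> name = ['D', 'a', 'v', 'i', 'd'];
string text = name.ToString();    // for ReadOnlySpan<char>, ToString() builds a string from the characters

Console.WriteLine(string.Join(", ", sorted)); // 1, 2, 3
Console.WriteLine(text);                      // David
```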
If you’re using the stackalloc keyword, there’s even a provided refactoring to use collection expressions. For example, consider the following code:
namespace Collection.Expressions;
internal class Spans
{
    public void Example()
    {
        ReadOnlySpan<byte> span = stackalloc byte[10]
        {
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10
        };
        UseBuffer(span);
    }
    private static void UseBuffer(ReadOnlySpan<byte> span)
    {
        // TODO:
        // Use the span...
        throw new NotImplementedException();
    }
}
Right-click on the stackalloc keyword, select Quick Actions and Refactorings... (or press Ctrl + .), and choose Collection initialization can be simplified.
The code is updated to use the collection expression syntax, as shown in the following code:
namespace Collection.Expressions;
internal class Spans
{
    public void Example()
    {
        ReadOnlySpan<byte> span =
        [
            1, 2, 3, 4, 5, 6, 7, 8, 9, 10
        ];
        UseBuffer(span);
    }
    // Omitted for brevity...
}
For more information, see Memory<T> and Span<T> usage guidelines.
Semantic considerations ⚙️
When initializing a collection with a collection expression, the compiler generates code that is functionally equivalent to using a collection initializer. Sometimes the generated code is much more efficient than using a collection initializer. Consider the following example:
List<int> someList = new() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
The rules for a collection initializer require that the compiler call the Add method for each element in the initializer. However, if you use the collection expression syntax:
List<int> someList = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
The compiler generates code that instead uses AddRange, which might be faster or better optimized. The compiler is able to make these optimizations because it knows the target type of the collection expression.
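Whatever strategy the compiler picks, the observable contents should match what the older syntax produces. The following sketch compares the two forms, and compares a spread against explicit AddRange calls. It checks only the resulting elements, not the generated IL, and the variable names are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Collection initializer versus collection expression.
List<int> viaInitializer = new() { 1, 2, 3 };
List<int> viaExpression = [1, 2, 3];

// Explicit AddRange calls versus spread elements.
int[] head = [1, 2];
int[] tail = [3];
List<int> viaAddRange = new();
viaAddRange.AddRange(head);
viaAddRange.AddRange(tail);
List<int> viaSpread = [.. head, .. tail];

Console.WriteLine(viaInitializer.SequenceEqual(viaExpression)); // True
Console.WriteLine(viaAddRange.SequenceEqual(viaSpread));        // True
```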
Next steps 🚀
Be sure to try this out in your own code! Check back soon for the next post in this series, where we’ll explore how to refactor your C# code by aliasing any type. In the meantime, you can learn more about collection expressions in the following resources:
You should really split the style options for this. I think your query example is terrible, having ToList() is much more clear. One more thing that drives me mad is that the analyzers insist on replacing ICollection customers = new HashSet(); with [] which results in completely different behavior. The use of HashSet to assign to an ICollection is a common pattern in EF and the analyzer screams at it demanding to change the behavior...
This is beautiful, expressive, and performant. Thank you all.
I have a question on Spread. I agree .. is not really an operator, just noticed Collection expressions (official documentation) is referring to .. as spread operator. It’s very important that we maintain correct terminology across.
Hi Jaliya, yes, you’re correct that we need to update the docs. Good catch. Thank you
This feature should come with a big warning: HERE BE DRAGONS
You don’t know if the compiler team has chosen the optimal code generation strategy or if the generated code is a fallback to just work.
Best example is that on release this did generate the worst possible code for efcore DbSet .ToList(). I don’t know if this is already fixed.
Regarding the LINQ example with the spread operator: does use of the collection expression result in the same call to or is there some additional optimization going on?
The original code could have been written as expression-bodied member as well, without declaring a local variable, so the new syntax isn't much shorter and arguably the intent isn't really clearer either. I wouldn't use collection expressions in that case, unless perhaps it's more efficient –...
I don’t know who decided to have the analyzers suggest the use of the morse code syntax instead of the much more clear .ToList(). I’m about to turn off the style rule to suggest collection initializers because of a couple of rules like that. At least split them in separate settings
Love or hate the collection expressions, but the use of emojis here is world class.
IMHO this is the annoying part of the feature
Again, collection expressions cannot be used with the var keyword.
hope this will be addressed sooner than later, and if that’s all the reason behind it
but the team has not settled on the what the natural type should be.
roll a dice if you can’t make up your mind. Or put up a poll and let majority decide.
I can’t think of a worse idea for software development than having the masses “vote” on how a feature should work. Especially a feature as bad as the var keyword. Support in this case would just result in lazy programmers declaring collection types whose implementation they aren’t even aware of.
If anyone is wondering, the question is which is the correct choice when var is used:
a) Default to the fastest and most efficient, such as Span or ReadOnlySpan
b) Default to the one with the best developer experience, such as List
You can argue both cases, so it is just hard for everyone to reach agreement here.
I disagree. var support is not needed.
This is a neat implementation of the Collection expression.
This is all extremely nice!