Conversation about the .NET type system
The .NET, or Common Language Runtime (CLR), type system is the foundation of the .NET programming model. We often talk about System.Object being the base of the type system, but it’s really the base of all (reference) types. The type system is (at least) one step lower than that. It defines that both reference and value types exist, that strings are immutable, that single inheritance is allowed and multiple inheritance is not, and that generics are a runtime concept. On the other hand, it doesn’t define concepts like Task&lt;T&gt;, but it enables those Ts to exist. You can quickly see that the type system is one of the most pervasive concepts in .NET.
We’re using the conversation format again, this time with runtime engineers who work on the type system and related topics.
What is the purpose of a type system and why do they vary so much across languages?
David: The purpose of a type system is to define and control the way in which data and code are arranged within an application. They vary so much across languages as there are many different ways to arrange code and data that make sense, and are useful.
Jared: The type system is the building blocks and rules on which the language features are built. They vary so much across languages because language differentiation is often achieved by starting from a different set of building blocks and rules on which it operates.
What are the aspects of the CLR type system that you think are the most apparent to .NET developers?
David: I see two different aspects of the type system as most apparent to .NET developers.
- Code and data are held in constructs like value types and classes, which encourages developers to build object oriented systems.
- The CLR type system’s smallest redistributable unit is the assembly, or .dll. This has profound implications on how applications are structured, and developed.
Jared: In general I think the split between struct and class is the aspect that becomes the most apparent to .NET developers. It’s essentially two ways of defining very similar objects with similar capabilities, but the “modifier” on the type significantly changes how the type is used by consumers. It’s not common in other languages and runtimes.
The line between CLR and C# concepts can be murky. How do you think of the new record types? They are a class that behaves more like a struct. Is that good?
David: “Is that good” probably isn’t the right question here. A better question would be more like “Is that useful?” And I believe they are useful, as they make it simple for one developer to communicate to others that some objects aren’t objects with behaviors, but more just pure data. Also, the new record types are much less effort to use than building the equivalent logic would have been in past versions of C#.
One thing to remember about the line between CLR and C# concepts is that CLR concepts provide the possibility to make some logic work, and C# concepts provide an interface for actual developers to work with. The C# concepts are an opinionated view on the possible programs that can be written using CLR concepts, and over time, the developers of the C# language have found ways for programmers to more clearly and succinctly represent intent on a fairly regular cadence, while the fundamental capabilities provided by CLR concepts are typically much more slow to evolve.
Jared: I often divide types into whether they primarily provide data or behavior. In the case they provide data I often want a number of features to come along with it: immutability, equality, deconstruction, etc … Essentially I want the data objects to fit into all of the C# features that allow me to explore data. Records are a declarative syntax for letting me define data objects that get all of these features for free.
Having classes that behave like values has always been possible in C#, and there are many types in the framework that already do this. Generally though, these classes fall into the category of “data” style objects, Tuple&lt;&gt; for example. It’s not good or bad to do this; it’s instead an exercise in evaluating trade offs: heap vs. stack, cost of passing / returning, etc …
In the case of records we wanted to explore classes first because that is what most of the customers who valued records were already using. In future versions of the language, though, we will allow them to be declared as structs as well, to help customers who need to make different trade off decisions.
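As a sketch of the features Jared lists (immutability, value equality, deconstruction, non-destructive mutation), here is what a one-line record gives you. The Point type is illustrative, not from the post:

```csharp
using System;

// A positional record gives value equality, init-only properties,
// deconstruction, a readable ToString, and `with` expressions for free.
var a = new Point(1, 2);
var b = new Point(1, 2);

Console.WriteLine(a == b);   // True: value equality, not reference equality
var (x, y) = a;              // deconstruction
Console.WriteLine(x + y);    // 3
var c = a with { Y = 5 };    // non-destructive mutation: copy with one change
Console.WriteLine(c);        // Point { X = 1, Y = 5 }

public record Point(int X, int Y);
```

Writing the equivalent class by hand means implementing Equals, GetHashCode, ToString, Deconstruct, and a copy constructor yourself.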
Value types and structs. Same thing, right?
David: In how I work with the type systems, yes. Value type is the CLR term for what is exposed in C# as a struct.
Jared: Yes they’re the same thing. Except in the case of ValueType which is a fairly special value type. It’s the base type of all value types even though value types can’t inherit from other types.
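Jared’s point about ValueType can be seen directly through reflection: every value type reports System.ValueType as its base type, even though you cannot inherit from a value type yourself.

```csharp
using System;

// Value types cannot be inherited from, yet they all report
// System.ValueType as their base type.
Console.WriteLine(typeof(int).BaseType);        // System.ValueType
Console.WriteLine(typeof(ValueType).BaseType);  // System.Object
Console.WriteLine(typeof(int).IsValueType);     // True
Console.WriteLine(typeof(ValueType).IsValueType); // False: ValueType itself is a class
```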
How about structs vs classes?
Jared: When discussing the difference between struct and class, most people tend to focus on how structs default to being allocated on the stack while classes are allocated on the heap. I tend to think about them a bit differently. A struct is in many ways a loose grouping of fields, while a class is a firm container around a group of fields. When you assign structs together, it essentially comes down to a field by field assignment, whereas assigning classes together is always a single pointer assignment. Understanding this gives you a better sense of the trade offs between the two types: whether or not assignments are atomic, how they are laid out in memory, what level of control the type has over its contents, etc …
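The field-by-field versus pointer assignment distinction is easy to observe. A minimal sketch (the PointStruct/PointClass names are illustrative):

```csharp
using System;

// Assigning a struct copies its fields; assigning a class copies a reference.
var s1 = new PointStruct { X = 1 };
var s2 = s1;              // field-by-field copy: s2 is an independent value
s2.X = 99;
Console.WriteLine(s1.X);  // 1: s1 is unaffected

var c1 = new PointClass { X = 1 };
var c2 = c1;              // single pointer copy: both refer to one object
c2.X = 99;
Console.WriteLine(c1.X);  // 99: c1 sees the change

struct PointStruct { public int X; }
class PointClass { public int X; }
```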
There have been several low-level type system changes across CLR and C# in recent years, like ref structs and Span&lt;T&gt;, ref returns, covariant return types, default interface methods, and static interface methods. On the one hand, those are great because they are targeted at performance and other needs. On the other, they are not something that most developers will use. When is the next type system feature coming that the average developer uses?
Jared: I think it will be static virtual interface methods. That is a feature we will be previewing in C# 10 that allows customers to use static methods on type parameters inside generic methods. On the face this may seem like an advanced concept which wouldn’t have broad user reach, however it opens the door to us defining generic math methods. Essentially, it allows us to express mathematical algorithms in terms of any numeric type vs. today where we have to limit to a specific type like int, etc … This capability is present and popular in a number of other languages and I think we will see similar usages in .NET once the feature is available.
David: I don’t know. One possibility is the static virtual interface methods feature we are working on in preview. Another possibility that I have been experimenting with is a form of specialized generic code, but we haven’t seen much need for that in our community yet.
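To make the generic math idea concrete, here is a sketch of what static abstract interface members enable (the feature Jared and David describe was in preview at the time; it later shipped in C# 11). IAddable&lt;TSelf&gt; and Meters are hypothetical types for illustration, not the framework’s INumber&lt;T&gt;:

```csharp
using System;

// A generic Sum that works for any type providing a static Zero and Add,
// resolved through the type parameter at compile time.
Console.WriteLine(Sum(new[] { new Meters(1), new Meters(2), new Meters(3) }).Value); // 6

static T Sum<T>(T[] items) where T : IAddable<T>
{
    var total = T.Zero;           // static member accessed via the type parameter
    foreach (var item in items)
        total = T.Add(total, item);
    return total;
}

interface IAddable<TSelf> where TSelf : IAddable<TSelf>
{
    static abstract TSelf Zero { get; }
    static abstract TSelf Add(TSelf left, TSelf right);
}

readonly record struct Meters(double Value) : IAddable<Meters>
{
    public static Meters Zero => new(0);
    public static Meters Add(Meters left, Meters right) => new(left.Value + right.Value);
}
```

Before this feature, there was no way to constrain a type parameter to “has a + operator”, which is exactly why generic math code had to be duplicated per numeric type.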
Some of these new features press the bounds of safety. Is that OK?
Jared: They press the bounds of safety, but in a way that doesn’t push the burden to customers. The rules around Span&lt;T&gt; are quite involved and took many months to refine, verify and make workable with common coding patterns. The burden here though was primarily on the .NET team to stretch the boundaries and see what we could achieve.
The result is the customer can consume Span&lt;T&gt; and get the performance without worrying about the safety issues: the language simply prevents you from doing unsafe operations, as it also does for other features. The customer needs to learn a bit about the new rules, but they don’t have to worry about the safety.
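A small example of what "performance without worrying about safety" looks like in practice: spans are bounds-checked views over memory, including stack memory, and the compiler’s ref-safety rules keep a stack-backed span from escaping the method.

```csharp
using System;

// Span<int> over stack memory: no GC allocation, still bounds-checked.
Span<int> span = stackalloc int[4];
for (int i = 0; i < span.Length; i++)
    span[i] = i * i;

Span<int> tail = span.Slice(2);  // a view over the same memory, not a copy
tail[0] = 42;
Console.WriteLine(span[2]);      // 42: both views share the same storage
```

Returning `span` from this method would be a compile error, which is precisely the rule set Jared describes the team refining.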
David: Yes. The CLR and .NET ecosystem have always embraced code following a spectrum of safety. We strive to make normal code safe, and potentially unsafe code is something that generally has to be opted in using some mechanism, (P/Invoke, Unsafe APIs, Marshal APIs, unsafe code blocks, etc.) In the spectrum of type system provided safety rules .NET/C# exists in a fairly pragmatic place where unsafe code is possible, but we strive to make it difficult to accidentally invoke potentially unsafe behavior.
.NET is often compared to Java, with the biggest differences being value types and reified/runtime generics. Why are those two type system features valuable? Would you repeat those choices if you could re-design .NET?
David: These two type system features are valuable as they allow the way a program is executed to be expressed in code, instead of merely expressing the semantic meaning of the code. This is an interesting philosophical design difference between the .NET and Java type systems, where the Java type system less often expresses details on exactly how program semantics should be implemented. This has certain benefits and drawbacks, but I believe I would likely repeat those decisions if I could re-design .NET, although I would probably change a few details.
Jared: The advantage of value types is avoiding heap allocations and by extension reducing the pressure on the garbage collector. That is an invaluable tool for high performance code, which often needs to ensure a GC will not happen on a given code path. I would definitely want to keep them if we were starting .NET from scratch, but I would likely invest in changing how they were expressed. Rather than expressing them at the declaration, essentially having a differentiation between classes and structs, I’d explore if we could express it at usage time. Essentially, have a single kind of type and at the use site decide between heap or stack allocation. That is a really difficult problem to solve without pushing a lot of complexity to the customer. Enough that I think such a redesign may end up where we are today, because it’s a very pragmatic trade off.
The type system is both the result of intentional design decisions made by the original architects of .NET and the outcome of years of organic change based on the needs of .NET users. You can likely see that the type system will continue to evolve as new scenarios and requirements present themselves. Only time will tell what those new type system capabilities will be.
Thanks again to Jared and David for sharing their insights on the type system.
C# has a great type system that empowers developers, for example when reflecting on their types through metadata.
Having everything be an “object” with methods, including primitive types, was a great thing that simplified development.
And also runtime generics. They were the thing I missed when having to write Java in college 12 years ago.
Personally, I have been thinking about these two (which I know will never be implemented):
Yes, I think these changes would have been helpful in some circumstances. Regarding nullable, I think C# largely agrees with you that not separating null for reference types in the type system was probably a mistake (Tony Hoare’s Billion Dollar Mistake, in fact). This is particularly hard to retrofit now as type changes at the CLR level tend to involve representation changes, and that means that changing types in a signature produces breaking changes for existing binary references. That’s the biggest impediment to this change for the future.
System.Void being usable as a type parameter may be more tractable, but once again the proliferation of Action means that the damage has already been done, in a sense.
I think nullability was made way too complex for compatibility’s sake.
I wish MS could add another strict “mode” like Kotlin’s.
F# effectively has System.Void but calls it “Unit” (because it is a type with a single possible value). It works just as you suggest, and after using it, C# feels like it missed a trick there, leading to having both Task and Task of T, etc.
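As a sketch of the idea in C# terms: with a usable unit type, the non-generic Task would just be Task of Unit. The Unit struct below is hypothetical, not part of the BCL:

```csharp
using System;
using System.Threading.Tasks;

// A unit type: exactly one possible value, so it carries no information,
// but it can be used as a type argument where void cannot.
Task<Unit> done = Task.FromResult(Unit.Value);
Console.WriteLine(done.Result);  // ()

readonly struct Unit
{
    public static readonly Unit Value = default;
    public override string ToString() => "()";
}
```

With such a type, the Task/Task&lt;T&gt; and Action/Func&lt;T&gt; duplications would not have been necessary.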
I’m surprised this conversation didn’t mention F# which uses the same CLR type system too but provides other type system expressions on it such as discriminated unions (algebraic datatypes) and had record (essential for immutable data) from the beginning.
Eiffel had embedded classes. You could turn any reference type into a value type on the fly.
Nice. That would make Jared happy!
What Jared wants sounds like c++?
Who wrote these questions?
First of all, it should be “There have been…”. Second, where is this ridiculous assertion that the average developer can’t use these features coming from? Covariant return types have been a major community request since C# 1.0. There are tons of questions on Stack Overflow asking for it (and SO is only 10ish years old).
I’ve written all of these “conversation” questions. Thanks for the plural/singular grammar catch.
The point is that many of the type system features are niche and in some cases hard to use. Clearly, they were all built for a reason and have a target audience. Sure, some of these features may have broader adoption than I expect. I doubt ref structs, for example, are showing up in many code bases. I’ve used them and can say with confidence that they are difficult to use.
Just because you mention ref structs: I’m doing a lot of performance measurements. Are there any good benchmarks that show the real benefit?
I could be wrong, but I don’t see ref structs as having any specific performance benefit over regular structs. However, there is significant benefit for methods and types that can be fully Span&lt;T&gt;-oriented (not having to resort to using Memory&lt;T&gt;). That’s what ref structs are for. I see them as an extension of the Span&lt;T&gt; programming model. Here’s an extension of that which I asked for: https://github.com/dotnet/csharplang/discussions/2582.
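To illustrate the "extension of the Span&lt;T&gt; programming model" point: a ref struct is allowed to hold a Span&lt;T&gt; field precisely because it is itself confined to the stack. The ByteCursor type below is an illustrative sketch:

```csharp
using System;

// A ref struct may contain a Span<byte> field; the compiler rejects
// boxing it, storing it in a class field, or capturing it in async code.
Span<byte> buffer = stackalloc byte[3] { 10, 20, 30 };
var cursor = new ByteCursor(buffer);
Console.WriteLine(cursor.Next());  // 10
Console.WriteLine(cursor.Next());  // 20

ref struct ByteCursor
{
    private Span<byte> _remaining;
    public ByteCursor(Span<byte> data) => _remaining = data;
    public byte Next()
    {
        byte b = _remaining[0];
        _remaining = _remaining.Slice(1);  // advance the view, no copying
        return b;
    }
}
```

An ordinary struct could not hold the Span field at all, which is the capability (rather than raw speed) that ref structs add.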
That makes sense. Thanks.
Yes, the type system is really good, with a few mistakes made over its history (the void issue, as explained above). The metadata and reflection capabilities are also really good. What I’m missing, though, is deeper documentation of this topic. There is quite a bit from the older days, but for example for the MetadataLoadContext you can barely find any good documentation. There’s Type, RuntimeType, RuntimeTypeHandle, TypeHandle, TypeInfo, TypeCode, etc. It would really be good to have solid documentation of these types and their dependencies/relationships.
Additionally, what is currently really missing is math/operators on generics. Shapes and extensions have been in a pretty long discussion phase (I understand – it’s complicated), but having this would beautify a lot of code. The workarounds that are needed are really awful, and it’s really too much typing to get such an easy thing as A + B, from the developer’s point of view – of course I understand what needs to be done in the runtime (but please do it ;-).
The comment from both David and Jared on static virtual interface methods is intended for the math functionality you are asking for. Here’s an issue on that, with links to others. https://github.com/dotnet/runtime/issues/50129
Thanks Rich. Looking at the list a lot of things have already been done so I hope it will come soon 😉
I’m not a TS user. Are there a couple concepts that you have in mind?
Definitely union and intersection types. Also type aliases (though I think global usings is equivalent), and discriminated unions is another.
Inline type definitions
Creating altered versions of types is so easy you can do it inline:
Here, without an explicit class definition, I have defined the return type as an AttachedDoc class with an extra “comment” field. This is called an “intersection”.
Another great example is LINQ’s join statement. When joining two tables, instead of having to specify TResult of the resulting object and having to map the values into that object, you could have another method overload whose output would be an object of type TInner &amp; TOuter, i.e. an object that returns all the fields from these two lists/tables implicitly. This would save a ton of time.
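For contrast, here is what a join looks like in C# today: the result shape must be spelled out explicitly (here as an anonymous type), which is exactly the mapping the comment above wishes could be implicit. The data is made up for illustration:

```csharp
using System;
using System.Linq;

// Today's Join requires an explicit result selector naming every field.
var orders = new[] { new { Id = 1, CustomerId = 10, Total = 99.0 } };
var customers = new[] { new { Id = 10, Name = "Ada" } };

var rows = orders.Join(customers,
    o => o.CustomerId,                       // outer key
    c => c.Id,                               // inner key
    (o, c) => new { o.Id, c.Name, o.Total }); // explicit mapping step

foreach (var row in rows)
    Console.WriteLine($"{row.Id} {row.Name} {row.Total}");  // 1 Ada 99
```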
Literal types and Template literal types
Literal types are an inline replacement for enums, i.e.:
Template literal types, along with ts type arithmetic, allow for some pretty nifty type definitions. For example, defining a type safe, nested, property access for objects using strings, like in this example (Take a look at the autocomplete for the second argument at the last line of the example)
My feeling is that nobody at MS can CLEARLY(!) explain why they introduced “records”. And that is a “bad sign”: you barely know why you do it.
We have (reference) classes. We have (value) structs. Does anybody feel constrained by these types? Let’s talk about THAT.
I read the whole article and found nothing helpful, even though I have worked on .NET since 2002. Hey, be more useful in conversation – it’s not a beer party!
Here are two (partial) answers about records from the post that speak to me:
And to add a bit more: you could say “well, just use structs”. Well, structs don’t have the same experience as records (syntax or behavior). That’s why the team is looking at creating struct records as well. And last, classes are easier to use than structs, which is why the team started with classes. I love being able to define a (data-oriented) class with one line. Anything more than that is just boilerplate.
Records allow you to easily create simple “data holder” types (especially read-only ones). C# 9 supported reference type records, and C# 10 will support value type records as well.
It’s that simple — data holders are common and fundamental to programming and it was laborious to create them before.
That’s a much simpler answer than I gave. I like it and agree.
The C# type system is different from the CTS – the Common Type System – used in the CLR.
The F# type system is different from the CTS, and different from the C# type system.
So what is “the .NET type system”?
I meant “CLR type system” as called out in the first sentence. This is the same as the “Common Type System” you called out. We don’t use the CTS term.
Nice overview. It is interesting to see where .NET is going, catching up with what could have been version 1.0 if the learnings of Modula-3, Eiffel, and Delphi had been taken into account, producing a friendlier version of “managed” C++.
I guess better be happy that this path is being taken now instead of never.
Which in a way also relates to Java: while the ongoing comparisons still give the edge to .NET as of Java 16, they are also trying to rectify the mistakes of version 1.0, and are further ahead than .NET in what concerns the AOT story (especially now that .NET Native is abandoned); eventually the Valhalla project will be merged, thus also gaining value types.
In a way, it is interesting that D, Nim, Zig, Rust, Swift, and Go, even with their smaller market presence, are pushing the way toward what .NET 1.0 and Java 1.0 should have been all along.
There is indeed a long history of programming languages to learn and be inspired from.
I understand this sentiment, but it isn’t really fair or reflective of the reality of shipping a v1.0 platform. I’m only addressing this because it comes up a fair bit. If I think back to the 1.0 release, it barely got out the door. I wasn’t on the team, but was at MSFT and saw the CDs (or were they DVDs?) that were available on campus for .NET Framework 1.0. I think the team got to something like RC7 before finally shipping. I heard that the release slipped a year from the original schedule. If I were to go back and give that team guidance, I would have told them to ship MUCH less. Also, there were some wizard-like ideas in the product that were very unproven and should have waited for more market testing. Code access security comes to mind. In contrast, Go was released without generics and there was a big long discussion about adding them afterwards. I think there was something similar on exceptions or some other diagnostic model. I tip my hat to the Go team for taking a hard line on scope.
I have a question about the hard-line on language backwards compatibility. If there was a provably equivalent procedure of converting, let’s say, a C# 1.0 program into a future breaking lang version, e.g. C# 11, would the team ever consider introducing such a breaking change to fix some of these foundational issues?
In the past, .NET 4.0 even ran separate runtimes side-by-side for binary compatibility with .NET 2.0. Seems like this could have been a great chance to fix some of these issues. Was this ever considered at any point?
Unlikely. There have been some minor breaking changes over the years. I don’t have any examples, but that’s what the team tells me. Certainly, if there was a high-value change that had an algorithmic path from source A to source B, that could be interesting. It almost never works out that way.
Do you have any specific changes you’d like to see?
I get that. To make it more explicit: given the background of the .NET architects, I was expecting something that was AOT compiled, similar to C++ Builder, Delphi, or VB 6 in capabilities, where the class system would just be sugar for COM; basically what .NET Native tried to be, or, from what is public information, System C# in Midori.
For a while it seemed that it was strange to some Microsoft teams that we wanted a kind of managed C++ and not something to draw data entry forms.
Things did not happen that way, so I do appreciate that these features are now being acknowledged and sorted out in the platform.
I really hope that C# gets sum types (called discriminated unions in F# and TypeScript). It’s an extremely underrated language feature. Check out the blogs and the book by Scott Wlaschin from the F# community.
Also, with sum types, we could build Result&lt;T, E&gt; to handle expected errors, which would make applying this ASP.NET guideline much easier.
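Without language-level sum types, a Result type can only be approximated. The sketch below (a hypothetical type, not part of any framework) shows the shape; a real discriminated union would additionally let the compiler check that both cases are handled:

```csharp
using System;

// A two-case result, encoded by hand: Ok carries a value, Err carries an error.
var ok = Result<int, string>.Ok(21);
var err = Result<int, string>.Err("parse failure");

Console.WriteLine(ok.Match(v => $"value: {v * 2}", e => $"error: {e}"));  // value: 42
Console.WriteLine(err.Match(v => $"value: {v}", e => $"error: {e}"));     // error: parse failure

readonly struct Result<T, TError>
{
    private readonly T _value;
    private readonly TError _error;
    private readonly bool _isOk;

    private Result(T value, TError error, bool isOk) =>
        (_value, _error, _isOk) = (value, error, isOk);

    public static Result<T, TError> Ok(T value) => new(value, default!, true);
    public static Result<T, TError> Err(TError error) => new(default!, error, false);

    // Match forces the caller to handle both cases, which is the core
    // benefit a language-level sum type would guarantee everywhere.
    public TResult Match<TResult>(Func<T, TResult> ok, Func<TError, TResult> err) =>
        _isOk ? ok(_value) : err(_error);
}
```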
The biggest problem in the CTS, I think, is the delegate. Delegates were designed for events; they are not very suitable for functional programming. Their behaviors have been causing problems for a long time:
– It’s multicast by default, but invocation stops when any handler throws. A system that needs to be graceful to exceptions must extract and invoke each singlecast delegate.
– It allocates. This is why preallocated s_XXXDelegate instances are required.
– It requires an explicit definition, unlike function pointers. And since spans can’t be used as generic type arguments, a new type, SpanAction, had to be introduced.
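The first bullet is easy to demonstrate: a multicast invocation stops at the first handler that throws, so resilient code has to fall back to GetInvocationList and invoke each target separately.

```csharp
using System;

// Multicast invocation stops at the first throwing handler.
Action handlers = () => Console.WriteLine("first");
handlers += () => throw new InvalidOperationException("boom");
handlers += () => Console.WriteLine("third");

try { handlers(); }  // prints "first", then throws; "third" never runs
catch (InvalidOperationException) { Console.WriteLine("stopped early"); }

// The resilient pattern: invoke each singlecast target separately.
foreach (Action single in handlers.GetInvocationList())
{
    try { single(); }
    catch (InvalidOperationException) { /* log and continue */ }
}
// Now both "first" and "third" run despite the throwing handler.
```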
The benefits would be the ones value types typically have compared to reference types. I want there to be a “new delegate type” that is effectively what ValueTuple is to Tuple. It could be even more beneficial than ValueTask is compared to Task.