New C# Source Generator Samples

Luca Bolognese

Phillip introduced C# Source Generators here. This post describes two new generators that we added to the samples project in the Roslyn SDK GitHub repo.

The first generator gives you strongly typed access to CSV data. The second one creates string constants based on Mustache specifications.

Source Generators Overview

It is important to have a good mental picture of how source generators operate. Conceptually, a generator is a function that takes some input (more on that later) and generates C# code as output. This ‘function’ runs before the code for the main project is compiled. In fact, its output becomes part of the project.

The inputs to a generator must be available at compile time, because that’s when generators run. In this post we explore two different ways to provide them.

You use a generator in your project by either referencing a generator project or by referencing the generator assembly directly. In the samples project this is achieved by the following instruction in the project file:

<ItemGroup>
    <ProjectReference Include="..\SourceGeneratorSamples\SourceGeneratorSamples.csproj"
                            OutputItemType="Analyzer" ReferenceOutputAssembly="false" />
</ItemGroup>

CSV Generator Usage

The CSV Generator takes CSV files as input and returns strongly typed C# representations of them as output. You specify the CSV files with the following lines in the project file:

<ItemGroup>
    <AdditionalFiles Include="People.csv" CsvLoadType="Startup" />
    <AdditionalFiles Include="Cars.csv" CsvLoadType="OnDemand" CacheObjects="true" />
</ItemGroup>

Where the People.csv file looks like so:

Name, address, 11Age
"Luca Bol", "23 Bell Street", 90
"john doe", "32 Carl street", 45

There are two additional arguments that get passed as part of the input in the project file AdditionalFiles tag: CsvLoadType and CacheObjects. CsvLoadType can take the value of Startup or OnDemand: the former instructs the code to load the objects representing the CSV file when the program starts; the latter loads them at first usage. CacheObjects is a bool indicating whether the objects need to be cached after creation.

It can be a little confusing to keep straight when exactly every phase runs. The generation of classes representing the shape of the CSV file happens at compile time, while the creation of the objects for each row of the file happens at run time according to the policy specified by CsvLoadType and CacheObjects.

BTW: the 11Age column name came about as a way to test that the C# generation is correct in case of columns starting with a number.
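The renaming of 11Age to _11Age can be pictured with a small helper. This is a minimal sketch, not the sample's actual code: the class IdentifierSketch and the method ToPropertyName are hypothetical names, and the rules (trim, replace unsupported characters with underscores, prefix a leading digit with an underscore, capitalize the first letter) are assumptions inferred from the column names and property names shown in this post.

```csharp
using System;

static class IdentifierSketch
{
    // Hypothetical helper: turns a CSV column name into something usable
    // as a C# property name. It does not handle C# keywords.
    public static string ToPropertyName(string columnName)
    {
        string trimmed = columnName.Trim();
        if (trimmed.Length == 0)
            return "_";

        // Keep letters, digits and underscores; replace anything else.
        char[] chars = trimmed.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            if (!char.IsLetterOrDigit(chars[i]) && chars[i] != '_')
                chars[i] = '_';
        }
        string cleaned = new string(chars);

        // An identifier cannot start with a digit, so prefix an underscore.
        if (char.IsDigit(cleaned[0]))
            return "_" + cleaned;

        // Otherwise capitalize the first character (address -> Address).
        return char.ToUpperInvariant(cleaned[0]) + cleaned.Substring(1);
    }
}
```

With these assumed rules, the header of People.csv maps to Name, Address and _11Age.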

Given such input, the generator creates a CSV namespace that you can import in your code with:

using CSV;

In the namespace there is one class for each CSV file. Each class contains an All static property that can be used like so:

WriteLine("## CARS");
Cars.All.ToList().ForEach(c => WriteLine($"{c.Brand}\t{c.Model}\t{c.Year}\t{c.Cc}"));
WriteLine("\n## PEOPLE");
People.All.ToList().ForEach(p => WriteLine($"{p.Name}\t{p.Address}\t{p._11Age}"));

So that’s how you use the generator. Let’s now look at how it is implemented.

CSV Generator Implementation

Inside the generator project you need a class implementing the ISourceGenerator interface with a Generator attribute.

[Generator]
public class CSVGenerator : ISourceGenerator

The Execute method is the entry point. It gets called by the compiler to start the generation process. Ours looks like this:

public void Execute(SourceGeneratorContext context)
{
    IEnumerable<(CsvLoadType, bool, AdditionalText)> options = GetLoadOptions(context);
    IEnumerable<(string, string)> nameCodeSequence = SourceFilesFromAdditionalFiles(options);
    foreach ((string name, string code) in nameCodeSequence)
        context.AddSource($"Csv_{name}", SourceText.From(code, Encoding.UTF8));
}

We first get the options (CsvLoadType and CacheObjects) from the project file, then generate the source files by reading the additional files, and finally add them to the project.

Getting the options is just a few easy calls to the analyzer APIs:

static IEnumerable<(CsvLoadType, bool, AdditionalText)> GetLoadOptions(SourceGeneratorContext context)
{
    foreach (AdditionalText file in context.AdditionalFiles)
    {
        if (Path.GetExtension(file.Path).Equals(".csv", StringComparison.OrdinalIgnoreCase))
        {
            // are there any options for it?
            context.AnalyzerConfigOptions.GetOptions(file)
                .TryGetValue("build_metadata.additionalfiles.CsvLoadType", out string? loadTimeString);
            Enum.TryParse(loadTimeString, ignoreCase: true, out CsvLoadType loadType);

            context.AnalyzerConfigOptions.GetOptions(file)
                .TryGetValue("build_metadata.additionalfiles.CacheObjects", out string? cacheObjectsString);
            bool.TryParse(cacheObjectsString, out bool cacheObjects);

            yield return (loadType, cacheObjects, file);
        }
    }
}
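Note that the return values of TryGetValue, Enum.TryParse and bool.TryParse are ignored above, so a missing or unparsable value silently falls back to the default of each type. The sketch below isolates that parsing step so the fallback is easy to see. The enum declaration and the names OptionParsingSketch and Parse are assumptions made for illustration; in particular, the member order of CsvLoadType (and therefore which value is the default) is assumed, not taken from the sample.

```csharp
using System;

// Assumed declaration: the first member is the default value (0).
enum CsvLoadType { Startup, OnDemand }

static class OptionParsingSketch
{
    // Hypothetical helper mirroring the two TryParse calls in GetLoadOptions.
    public static (CsvLoadType LoadType, bool CacheObjects) Parse(
        string? loadTypeString, string? cacheObjectsString)
    {
        // On failure (including null input) the out value is default(CsvLoadType).
        Enum.TryParse(loadTypeString, ignoreCase: true, out CsvLoadType loadType);

        // On failure (including null input) the out value is false.
        bool.TryParse(cacheObjectsString, out bool cacheObjects);

        return (loadType, cacheObjects);
    }
}
```

Under these assumptions, a file listed without any metadata is treated as Startup with caching off, and the values are matched case-insensitively.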

Once the options are retrieved, the process of generating C# source files to represent the CSV data can start.

static IEnumerable<(string, string)> SourceFilesFromAdditionalFile(CsvLoadType loadTime,
    bool cacheObjects, AdditionalText file)
{
    string className = Path.GetFileNameWithoutExtension(file.Path);
    string csvText = file.GetText()!.ToString();
    return new (string, string)[] { (className, GenerateClassFile(className, csvText, loadTime, cacheObjects)) };
}

static IEnumerable<(string, string)> SourceFilesFromAdditionalFiles(IEnumerable<(CsvLoadType loadTime,
    bool cacheObjects, AdditionalText file)> pathsData)
    => pathsData.SelectMany(d => SourceFilesFromAdditionalFile(d.loadTime, d.cacheObjects, d.file));

We iterate over all the CSV files and generate a class file for each one of them by calling GenerateClassFile. This is where the magic happens: we look at the csv content and we generate the correct class file to add to the project.

This is a long function, so let’s look only at the start and the end of it to get the flavour.

public static string GenerateClassFile(string className, string csvText, CsvLoadType loadTime,
    bool cacheObjects)
{
    StringBuilder sb = new StringBuilder();
    using CsvTextFieldParser parser = new CsvTextFieldParser(new StringReader(csvText));

    //// Usings
    sb.Append(@"
#nullable enable
namespace CSV {
using System.Collections.Generic;

");
    //// Class Definition
    sb.Append($"    public class {className} {{\n");

First we add a new class to the CSV namespace. The name of the class corresponds to the CSV file name. Then we generate the code for the class and return it.

    // CODE TO GENERATE C# FROM THE CSV FILE ...

    sb.Append("            }\n        }\n    }\n}\n");
    return sb.ToString();
}
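To give a feel for the elided middle part, here is a minimal sketch of one step it has to perform: emitting a property declaration per column. It is not the sample's code. PropertySketch, InferType and GenerateProperties are hypothetical names, the type inference is reduced to "int if the first value parses as an int, otherwise string", and the column names are assumed to be valid identifiers already.

```csharp
using System;
using System.Text;

static class PropertySketch
{
    // Hypothetical, deliberately naive inference from a single sample value.
    static string InferType(string sample)
        => int.TryParse(sample, out _) ? "int" : "string";

    // Emits one auto-property line per column, in the style of the generated class.
    public static string GenerateProperties(string[] names, string[] firstRow)
    {
        if (names.Length != firstRow.Length)
            throw new ArgumentException("Header and row must have the same number of fields.");

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < names.Length; i++)
        {
            // Doubled braces produce literal braces in an interpolated string.
            sb.Append($"        public {InferType(firstRow[i])} {names[i]} {{ get; set; }} = default!;\n");
        }
        return sb.ToString();
    }
}
```

Calling GenerateProperties(new[] { "Name", "Address", "_11Age" }, new[] { "Luca Bol", "23 Bell Street", "90" }) yields two string properties and one int property.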

In the end, the compiler adds to our project a file called Csv_People.cs containing the code below.

#nullable enable
namespace CSV {
    using System.Collections.Generic;

    public class People {

        static People() { var x = All; }
        public string Name { get; set;} = default!;
        public string Address { get; set;} = default!;
        public int _11Age { get; set;} = default!;

        static IEnumerable<People>? _all = null;

        public static IEnumerable<People> All {
            get {

                List<People> l = new List<People>();
                People c;
                c = new People();
                c.Name = "Luca Bol";
                c.Address = "23 Bell Street";
                c._11Age =  90;
                l.Add(c);
                c = new People();
                c.Name = "john doe";
                c.Address = "32 Carl street";
                c._11Age =  45;
                l.Add(c);
                _all = l;
                return l;
            }
        }
    }
}

This is what gets compiled into your project, so that you can reference it from the code.
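The People class above reflects its settings: the static constructor touches All (CsvLoadType is Startup), and the getter rebuilds the list on every access (CacheObjects is not set). For comparison, the sketch below shows what a class for a file declared with CsvLoadType="OnDemand" and CacheObjects="true" might look like. This is an assumption about the shape of the output, not a copy of it; the namespace CsvSketch and the single row of data are hypothetical placeholders.

```csharp
#nullable enable
using System.Collections.Generic;

namespace CsvSketch
{
    public class Cars
    {
        // No static constructor: with OnDemand nothing is loaded until All is read.
        public string Brand { get; set; } = default!;
        public string Model { get; set; } = default!;

        static IEnumerable<Cars>? _all = null;

        public static IEnumerable<Cars> All
        {
            get
            {
                // With caching on, later reads reuse the list built by the first one.
                if (_all != null)
                    return _all;

                List<Cars> l = new List<Cars>();
                Cars c = new Cars();
                c.Brand = "ExampleBrand"; // hypothetical placeholder row
                c.Model = "ExampleModel"; // hypothetical placeholder row
                l.Add(c);
                _all = l;
                return l;
            }
        }
    }
}
```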

Mustache Generator Usage

For the Mustache Generator, we use a different way to pass input arguments compared to the CSV Generator above. We embed our input in assembly attributes and then, in the generator code, we fish them out of the assembly to drive the generation process.

In our client code, we pass the inputs to the generator as below:

using Mustache;

[assembly: Mustache("Lottery", t1, h1)]
[assembly: Mustache("HR", t2, h2)]
[assembly: Mustache("HTML", t3, h3)]
[assembly: Mustache("Section", t4, h4)]
[assembly: Mustache("NestedSection", t5, h5)]

The first argument to the Mustache attribute is the name of a string constant that gets generated in the Mustache.Constants class.

The second argument represents the mustache template to use. In the demo we use the templates from the manual. For example:

public const string t1 = @"
Hello {{name}}
You have just won {{value}} dollars!
{{#in_ca}}
Well, {{taxed_value}} dollars, after taxes.
{{/in_ca}}
";

The third argument is the hash to use with the template.

public const string h1 = @"
{
""name"": ""Chris"",
""value"": 10000,
""taxed_value"": 6000,
""in_ca"": true
}
";

Each attribute instance is a named pair (template, hash). Our generator uses it to generate a string constant that you can access like this:

WriteLine(Mustache.Constants.Lottery);

The resulting output is good for Chris, as expected:

Hello Chris
You have just won 10000 dollars!
Well, 6000.0 dollars, after taxes.

Mustache Generator Implementation

The input to this generator is quite different from the previous one, but the implementation is similar. Or at least it has a familiar ‘shape’. As before there is a class implementing ISourceGenerator with an Execute method:

[Generator]
public class MustacheGenerator : ISourceGenerator
{
    public void Execute(SourceGeneratorContext context)
    {
        string attributeSource = @"
[System.AttributeUsage(System.AttributeTargets.Assembly, AllowMultiple=true)]
internal sealed class MustacheAttribute: System.Attribute
{
    public string Name { get; }
    public string Template { get; }
    public string Hash { get; }
    public MustacheAttribute(string name, string template, string hash)
        => (Name, Template, Hash) = (name, template, hash);
}
";
        context.AddSource("Mustache_MainAttributes__", SourceText.From(attributeSource, Encoding.UTF8));

First we need to add a source file to the project to define the Mustache attribute that will be used by the clients to specify the inputs.

Then we inspect the assembly to fish out all the usages of the Mustache attribute.

        Compilation compilation = context.Compilation;

        IEnumerable<(string, string, string)> options = GetMustacheOptions(compilation);

The code to do so is in the GetMustacheOptions function, which you can inspect here.

Once you have the options, it is time to generate the source files:

static string SourceFileFromMustachePath(string name, string template, string hash)
{
    Func<object, string> tree = HandlebarsDotNet.Handlebars.Compile(template);
    object @object = Newtonsoft.Json.JsonConvert.DeserializeObject(hash);
    string mustacheText = tree(@object);

    return GenerateMustacheClass(name, mustacheText);
}

First we use Handlebars.Net to create the string constant text (first 3 lines above). We then move on to the task of generating the constant to contain it.

private static string GenerateMustacheClass(string className, string mustacheText)
{
    StringBuilder sb = new StringBuilder();
    sb.Append($@"
namespace Mustache {{

public static partial class Constants {{

public const string {className} = @""{mustacheText.Replace("\"", "\"\"")}"";
}}
}}
");
    return sb.ToString();

}

That was easy, mainly thanks to C# partial classes. We generate a single class from multiple source files.
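As a minimal sketch of why partial classes help here: each attribute produces its own source file, and each file declares the same Mustache.Constants class with one constant in it. The compiler merges the declarations into a single class. The two declarations below stand in for two generated files; the constant values are shortened placeholders, not the real rendered templates.

```csharp
namespace Mustache
{
    // Stand-in for the file generated from [assembly: Mustache("Lottery", t1, h1)].
    public static partial class Constants
    {
        public const string Lottery = "Hello Chris"; // placeholder value
    }

    // Stand-in for the file generated from [assembly: Mustache("HR", t2, h2)].
    public static partial class Constants
    {
        public const string HR = "(rendered HR template)"; // placeholder value
    }
}
```

Client code sees one class, so both Mustache.Constants.Lottery and Mustache.Constants.HR resolve without the generator having to collect all constants into a single file.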

Conclusion

C# Source Generators are a great addition to the compiler. The ability to insert yourself into the middle of the compilation process, with access to the source tree, makes it possible, even simple, to enable all sorts of scenarios (e.g. domain languages, code interpolation, automatic optimizations …). We look forward to you surprising us with your own Source Generators!

14 comments

  • Andy Baker

    Isn’t code generation an admission of failure? If the language isn’t expressive enough or your API isn’t usable enough then code generation can get you through a tough time but it’s a band aid over language or API flaws. It’s never something to be proud of.

    • JesperTreetop

      That’s an over-generalization. It, like some “design patterns”, can easily be the echo of a missing feature. I agree with the basic gist since I would love to have macros in C# tomorrow; in their absence, it’s good to have the ability to do some of the same things with code generation. There are already ways to generate code, but to have a specified place in the compilation pipeline where it happens, and to have the compiler and IDE know all about it gives it much better robustness, not to mention being able to more easily distribute code generators as sidecar packages to, for example, serialization libraries. Microsoft would probably never adopt something like PostSharp into the framework; this, they might use.

      In addition, you can also do it to precompute stuff that already is doable at runtime but is “needlessly dynamic”, as in the examples above, where the alternative is to dynamically grab things that could have been burned in at compile-time. Reflection is another example, where much of the time you don’t want to go through every single type in every assembly at runtime, but just want to figure out which things are connected. Doing it once during compilation and then working off of that is great. (Even if you want the dynamic part of reflection for eg plugins, you could save time by precomputing the known quantities and doing the rest dynamically.) From what I understand, the DI architecture in .NET Core is a prime example for this that’s being looked at, and would cut down on startup costs.

  • MgSam

    I really really wish you guys had thought about re-using T4 templates before making this completely incompatible (and in many ways inferior) implementation of a code generator. The main benefit of source generators over T4 should be that it gives you access to the syntax tree of the code applying the generator, which isn’t even a use case you cover in this article at all.

    It is inferior to T4 because:
    – No template support – the output is stored in a string with no syntax highlighting.
    – No runtime analog (which T4 has)

    Also, do these have to generate C#? The output is just text, after all.

    • Mark Pflug

      “The main benefit of source generators over T4 should be that it gives you access to the syntax tree of the code applying the generator, which isn’t even a use case you cover in this article at all.”

      Access to the syntax tree is the benefit; if you don’t need the syntax tree for the code you are generating then this might not be the right tool. However, in those cases where it is needed it enables very powerful things to be done that aren’t possible (or much more difficult) to do without source generators. I wholly agree that these examples don’t do justice to the source generator capability. Both of these things could be done with MSBuild-based build-time code generation; including providing design time intellisense for the generated code.

  • JinShil

    This is a feature of C# I’ve wanted for more than a decade. I’m very pleased to see it finally coming.

    However, the examples are very uncompelling. Let me know when you make a generator that does auto-implementation of interfaces. I’m thinking something like this:

    class TestClass : [AutoImpl(typeof(TestInterface1Impl))] IInterface1, [AutoImpl(typeof(TestInterface2Impl))] IInterface2
    {
        // The following code is added by the generator
        readonly TestInterface1Impl _impl1;
        readonly TestInterface2Impl _impl2;
    
        public void Method1()
        {
            _impl1.Method1();
        }
    
        public void Method2()
        {
             _impl2.Method2();
        }
    }

    • JinShil

      And a few more interesting things that might be possible to implement with source generators.

      • static if
      • static for/foreach (think compile time loop unrolling, or compile-time method and property generation)
      • templated classes instead of generic classes

      I’m not sure how that can be accomplished. The syntax may look weird given that source generation is separate from the language itself, but I’m sure there’s a way to make it work.

      In fact, just look at the D programming language for inspiration. Their metaprogramming facilities are the best I’ve ever experienced.

      Eventually I’ll get around to implementing some of these things myself, but it’d be nice if Microsoft had a few generators like this out of the box.

  • Bal Singh

    In the code it says
            ""taxed_value"": 5000,
    and with the code generator
            Well, {{taxed_value}} dollars, after taxes.
    And the output being
            Well, 6000.0 dollars, after taxes.
    This certainly has got me tearing my mustache out!

  • Adrian Strangfeld

    Dear Microsoft-Team,

    what will be the minimum requirements to be able to add source code generators to a project?
    Would I be able to include them in older projects that still use .net 4.5?

    Thank you

    • Chris Sienkiewicz (Microsoft employee)

      While Source Generators are in preview, they require the consumer to opt in to their use by setting the language version of the project to ‘preview’. We intend to remove that restriction when generators hit V1. As long as the compiler you’re using supports source generators, you will be able to use them in your project. However, it is the responsibility of the generator author to ensure the code they generate is compatible with your particular project choices.

  • Dmitriy Krasnikov

    So how do I debug my generators? I am not talking about debugging generated code, but the generators themselves. Anything I tried so far doesn’t hit breakpoints.