July 12th, 2022

4 reactions

Announcing Entity Framework Core 7 Preview 6: Performance Edition

Shay Rojansky

Entity Framework 7 (EF7) Preview 6 has shipped and is available on nuget.org. Keep reading for links to individual packages. This blog post will focus on optimizations to update performance; for the full list of EF7 Preview 6 enhancements, see this page.

Update performance improvements

In EF7, SaveChanges performance has been significantly improved, with a special focus on removing unneeded network roundtrips to your database. In some scenarios, we’re seeing a 74% reduction in time taken – that’s a four-fold improvement!

Background

Performance is always high on our priorities in EF Core. For EF Core 6.0, we concentrated on improving the performance of non-tracking queries, achieving a very significant speedup and making EF Core comparable to raw SQL queries using Dapper (see this blog post for the details). For EF Core 7.0, we targeted EF Core’s “update pipeline”: that’s the component that implements SaveChanges, and is responsible for applying inserts, updates and deletions to your database.

The query optimizations in EF Core 6.0 were essentially about runtime performance: the goal was to reduce EF Core’s direct overhead, i.e. the time spent within EF Core code when executing a query. The update pipeline improvements in EF Core 7.0 are quite different; it turned out that there were opportunities for improvement in the SQL which EF sends to the database, and even more importantly, in the number of network roundtrips which occur under the hood when SaveChanges is invoked. Optimizing network roundtrips is particularly important for modern application performance:

Network latency is typically a significant factor (sometimes measured in milliseconds), so eliminating an unneeded roundtrip can be far more impactful than many micro-optimizations in the code itself.
Latency also varies based on various factors, so eliminating a roundtrip has an increasing effect the higher the latency.
In traditional on-premises deployment the database server is typically located close to the application servers. In the cloud environment the database server tends to be farther away, increasing latency.

Regardless of the performance optimization described below, I highly recommend keeping roundtrips in mind when interacting with a database, and reading the EF performance docs for some tips (for example, prefer loading rows eagerly whenever possible).

Transactions and roundtrips

Let’s examine a very trivial EF program that inserts a single row into the database:

var blog = new Blog { Name = "MyBlog" };
ctx.Blogs.Add(blog);
await ctx.SaveChangesAsync();

Running this with EF Core 6.0 shows the following log messages (filtered to highlight the important stuff):

dbug: 2022-07-10 17:10:48.450 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 17:10:48.521 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (30ms) [Parameters=[@p0='Foo' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Name])
      VALUES (@p0);
      SELECT [Id]
      FROM [Blogs]
      WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
dbug: 2022-07-10 17:10:48.549 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

The main command – which took 30 milliseconds – contains two SQL statements (ignoring the NOCOUNT which isn’t relevant): the expected INSERT statement, followed by a SELECT to fetch the ID for the new row we just inserted. In EF Core, when your entity’s key is an int, EF will usually set it up to be database-generated by default; for SQL Server, this means an IDENTITY column. Since you may want to continue doing further operations after inserting that row, EF must fetch back the ID value and populate it in your blog instance.

So far, so good; but there’s more going on here: a transaction is started before the command is executed, and committed afterwards. Looking at this through my performance analysis spectacles, that transaction costs us two additional database roundtrips – one to start it, and another to commit. Now, the transaction is there for a reason: SaveChanges may need to apply multiple update operations, and we want those updates to be wrapped in transaction, so that if there’s a failure, everything is rolled back and the database is left in a consistent state. But what happens if there’s only one operation, like in the above case?

Well, it turns out that databases guarantee transactionality for (most) single SQL statements; if any error occurs, you don’t need to worry about the statement only partially completed. That’s great – that means that we can entirely remove the transaction when a single statement is involved. And sure enough, here’s what the same code produces with EF Core 7.0:

info: 2022-07-10 17:24:28.740 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (52ms) [Parameters=[@p0='Foo' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET IMPLICIT_TRANSACTIONS OFF;
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Name])
      OUTPUT INSERTED.[Id]
      VALUES (@p0);

Much shorter – the transaction is gone! Let’s see what this optimization is worth by benchmarking it with BenchmarkDotNet (you’re not still hand rolling your own benchmarks with Stopwatch, are you?).

Method	EF Version	Server	Mean	Error	StdDev
Insert_one_row	6.0	Localhost	522.9 us	5.76 us	5.10 us
Insert_one_row	7.0	Localhost	390.0 us	6.78 us	8.82 us

Nice, that’s a 133 microsecond improvement, or 25%! But since we’re discussing roundtrips, you have to ask: where’s the database, and what’s the latency to it? The figures above are for running against a SQL Server instance running on my local machine. That’s something you should generally never do when benchmarking: having the application and the database on the same machine can cause interference and skew your results; after all, you wouldn’t do that in production would you? But more importantly for us, the latency when contacting localhost is, well, very low – we’re looking at the lower bound for the possible improvement.

Let’s do another run against a remote machine. In this benchmark, I’ll be connecting from my laptop to my desktop, over a wifi connection. That’s also not quite realistic: wifi isn’t the best medium for this kind of thing, and just like you’re probably not running the database on the same machine in production, you’re probably not connecting to it over wifi, are you? We won’t discuss how closely this approximates a real-world connection to e.g. a cloud database – you can easily benchmark this yourself in your environment and find out. Here are the results:

Method	EF Version	Server	Mean	Error	StdDev
Insert_one_row	6.0	Remote	8.418 ms	0.1668 ms	0.4216 ms
Insert_one_row	7.0	Remote	4.593 ms	0.0913 ms	0.2531 ms

That’s quite a different ballpark: we’ve saved 3.8 milliseconds, or 45%. 3.8ms is already considered a significant amount of time in a responsive web application or API, so that’s a significant win.

Before we move on, you may have noticed other SQL changes above, besides the transaction elimination:

A new SET IMPLICIT_TRANSACTIONS OFF has appeared. SQL Server has an opt-in “implicit transactions” mode, where executing a statement outside of a transaction won’t auto-commit, but instead implicitly start a new transaction. We want to disable this to make sure that changes are actually saved. The overhead for this is negligible.
Instead of inserting and then selecting the database-generated ID, the new SQL uses an “OUTPUT clause” to tell SQL Server to send the value directly from the INSERT. Aside from being tighter SQL, this is needed to get the transactionality guarantees without needing the explicit transaction, as we discussed above. It so happens that the EF Core 6’s two statements are safe, since the last inserted identity value (scope_identity) is local to the connection, and the ID doesn’t change in EF, but there are various other cases where that wouldn’t hold true (e.g. if there were other database-generated values besides the ID).

Inserting multiple rows

Let’s see what happens if we insert multiple rows:

for (var i = 0; i < 4; i++)
{
    var blog = new Blog { Name = "Foo" + i };
    ctx.Blogs.Add(blog);
}
await ctx.SaveChangesAsync();

Running this with EF Core 6.0 shows the following in the logs:

dbug: 2022-07-10 18:46:39.583 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 18:46:39.677 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (52ms) [Parameters=[@p0='Foo0' (Size = 4000), @p1='Foo1' (Size = 4000), @p2='Foo2' (Size = 4000), @p3='Foo3' (Size = 4000)], CommandType='Text', CommandTimeout
='30']
      SET NOCOUNT ON;
      DECLARE @inserted0 TABLE ([Id] int, [_Position] [int]);
      MERGE [Blogs] USING (
      VALUES (@p0, 0),
      (@p1, 1),
      (@p2, 2),
      (@p3, 3)) AS i ([Name], _Position) ON 1=0
      WHEN NOT MATCHED THEN
      INSERT ([Name])
      VALUES (i.[Name])
      OUTPUT INSERTED.[Id], i._Position
      INTO @inserted0;

      SELECT [i].[Id] FROM @inserted0 i
      ORDER BY [i].[_Position];
dbug: 2022-07-10 18:46:39.705 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

That’s a bit… unexpected (and not easy to understand). SQL Server has a MERGE statement, which was originally intended for merging together two tables, but can be used for other purposes. It turns out that using MERGE to insert four rows is significantly faster than 4 separate INSERT statements – even when batched. So the above does the following:

Create a temporary table (that’s the DECLARE @inserted0 bit).
Use MERGE to insert into four rows – based on parameters we send – into the table. An OUTPUT clause (remember that?) outputs the database-generated ID into the temporary table.
SELECT to retrieve the IDs from the temporary table.

As a side note, this kind of advanced, SQL Server-specific technique is a good example of how an ORM like EF Core can help you be more efficient than writing SQL yourself. Of course, you can use the above technique yourself without EF Core, but in reality, few users go this deep into optimization investigations; with EF Core you don’t even need to be aware of it.

Let’s compare that with the EF Core 7.0 output:

info: 2022-07-10 18:46:56.530 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (25ms) [Parameters=[@p0='Foo0' (Size = 4000), @p1='Foo1' (Size = 4000), @p2='Foo2' (Size = 4000), @p3='Foo3' (Size = 4000)], CommandType='Text', CommandTimeout
='30']
      SET IMPLICIT_TRANSACTIONS OFF;
      SET NOCOUNT ON;
      MERGE [Blogs] USING (
      VALUES (@p0, 0),
      (@p1, 1),
      (@p2, 2),
      (@p3, 3)) AS i ([Name], _Position) ON 1=0
      WHEN NOT MATCHED THEN
      INSERT ([Name])
      VALUES (i.[Name])
      OUTPUT INSERTED.[Id], i._Position;

The transaction is gone, as above – MERGE is also a single statement that’s protected by an implicit transaction. Note that if we used 4 INSERT statements instead, we would not be able to omit the explicit transaction (with its extra roundtrips); so that’s another advantage of using MERGE, compounded on the basic better performance it delivers here.

But other things have changed as well: the temporary table is gone, and the OUTPUT clause now sends the generated IDs directly back to the client. Let’s benchmark how these two variations perform:

Method	EF Version	Server	Mean	Error	StdDev
Insert_four_rows	6.0	Remote	12.93 ms	0.258 ms	0.651 ms
Insert_four_rows	7.0	Remote	4.985 ms	0.0981 ms	0.1981 ms
Insert_four_rows	6.0	Local	1.679 ms	0.0331 ms	0.0368 ms
Insert_four_rows	7.0	Local	435.8 us	7.85 us	6.96 us

The remote scenario runs almost 8 milliseconds faster, or a 61% improvement. The local scenario is even more impressive: the 1.243 millisecond improvement amounts to a 74% improvement; the operation is running four times as fast on EF Core 7.0!

Note that these results include two separate optimizations: the removal of the transaction discussed above, and the optimization of MERGE to not use a temporary table.

Interlude: SQL Server and the OUTPUT clause

At this point you may be wondering why it is that EF Core didn’t use a simple OUTPUT clause – without a temporary table – up to now. After all, the new SQL is both simpler and faster.

Unfortunately, SQL Server has some limitations which disallow the OUTPUT clause in certain scenarios. Most importantly, using the OUTPUT clause on a table that has a trigger defined is unsupported and raises an error (see the SQL Server docs); OUTPUT with INTO (as used above with MERGE by EF Core 6.0) is supported. Now, when we were first designing EF Core, the goal we had was for things to work across all scenarios, in order to make the user experience as seamless as possible; we were also unaware just how much overhead the temporary table actually added. Revisiting this for EF Core 7.0, we had the following options:

Retain the current slow behavior by default, and allow users to opt into the newer, more efficient technique.
Switch to the more efficient technique, and provide an opt out for people using triggers to switch to the slower behavior.

This isn’t an easy decision to make – we try hard to never break users if we can help it. However, given the extreme performance difference and the fact that users wouldn’t even be aware of the situation, we ended up going with option 2. Users with triggers who upgrade to EF Core 7.0 will get an informative exception that points them to the opt-out, and everyone else gets significantly improved performance without needing to know anything.

Even less roundtrips: principals and dependents

Let’s look at one more scenario. In this one, we’re going to insert a principal (Blog) and a dependent (Post):

ctx.Blogs.Add(new Blog
{
    Name = "MyBlog",
    Posts = new()
    {
        new Post { Title = "My first post" }
    }
});
await ctx.SaveChangesAsync();

This generates the following:

dbug: 2022-07-10 19:39:32.826 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 19:39:32.890 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (22ms) [Parameters=[@p0='MyBlog' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Name])
      VALUES (@p0);
      SELECT [Id]
      FROM [Blogs]
      WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
info: 2022-07-10 19:39:32.929 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (3ms) [Parameters=[@p1='1', @p2='My first post' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Post] ([BlogId], [Title])
      VALUES (@p1, @p2);
      SELECT [Id]
      FROM [Post]
      WHERE @@ROWCOUNT = 1 AND [Id] = scope_identity();
dbug: 2022-07-10 19:39:32.932 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

We have four roundtrips: two for the transaction management, one for the Blog insertion, and one for the Post insertion (note that each DbCommand execution represents a roundtrip). Now, EF Core does generally do batching in SaveChanges, meaning that multiple changes are sent in a single command, for better efficiency. However, in this case that’s not possible: since the Blog’s key is a database-generated IDENTITY column, we must get the generated value back before we can send the Post insertion, which must contain it. This is a normal state of affairs, and there isn’t much we can do about it.

Let’s change our Blog and Post to use GUID keys instead of integers. By default, EF Core performs client generation on GUID keys, meaning that it generates a new GUID itself instead of having the database do it, as is the case with IDENTITY columns. With EF Core 6.0, we get the following:

dbug: 2022-07-10 19:47:51.176 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 19:47:51.273 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (36ms) [Parameters=[@p0='7c63f6ac-a69a-4365-d1c5-08da629c4f43', @p1='MyBlog' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Id], [Name])
      VALUES (@p0, @p1);
info: 2022-07-10 19:47:51.284 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (2ms) [Parameters=[@p2='d0e30140-0f33-4435-e165-08da629c4f4d', @p3='0', @p4='7c63f6ac-a69a-4365-d1c5-08da629c4f43' (Nullable = true), @p5='My first post' (Size
 = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Post] ([Id], [BlogId], [BlogId1], [Title])
      VALUES (@p2, @p3, @p4, @p5);
dbug: 2022-07-10 19:47:51.296 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

Unfortunately, the Blog and Post are still being inserted via different commands. EF Core 7.0 does away with this and does the following:

dbug: 2022-07-10 19:40:30.259 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 19:40:30.293 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (26ms) [Parameters=[@p0='ce67f663-221a-4a86-3d5b-08da629b4875', @p1='MyBlog' (Size = 4000), @p2='127329d1-5c31-4001-c6a6-08da629b487b', @p3='0', @p4='ce67f663-
221a-4a86-3d5b-08da629b4875' (Nullable = true), @p5='My first post' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Id], [Name])
      VALUES (@p0, @p1);
      INSERT INTO [Post] ([Id], [BlogId], [BlogId1], [Title])
      VALUES (@p2, @p3, @p4, @p5);
dbug: 2022-07-10 19:40:30.302 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

Since the Blog’s key is client-generated, it’s no longer necessary to wait for any database-generated values, and the two INSERTs are combined into a single command, reducing a roundtrip.

I know what you’re thinking – you’re now considering switching from auto-incrementing integer IDs to GUIDs, to take advantage of this optimization. Before you run off and do that, you should know that EF Core also has a feature called HiLo, which provides similar results with an integer key. When HiLo is configured, EF sets up a database sequence, and fetches a range of values from it (10 by default); these pre-fetched values are cached internally by EF Core, and used whenever a new row needs to be inserted. The effect is similar to the GUID scenario above: as long as we have remaining values from the sequence, we no longer need to fetch a database-generated ID when inserting. Once EF exhausts those values, it will do a single roundtrip to fetch the next range of values, and so on.

HiLo can be enabled on a property basis as follows:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Blog>().Property(b => b.Id).UseHiLo();
}

Once that’s done, our SaveChanges output is efficient, and resembles the GUID scenario:

dbug: 2022-07-10 19:54:25.862 RelationalEventId.TransactionStarted[20200] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Began transaction with isolation level 'ReadCommitted'.
info: 2022-07-10 19:54:25.890 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
      Executed DbCommand (20ms) [Parameters=[@p0='1', @p1='MyBlog' (Size = 4000), @p2='1', @p3='My first post' (Size = 4000)], CommandType='Text', CommandTimeout='30']
      SET NOCOUNT ON;
      INSERT INTO [Blogs] ([Id], [Name])
      VALUES (@p0, @p1);
      INSERT INTO [Post] ([BlogId], [Title])
      OUTPUT INSERTED.[Id]
      VALUES (@p2, @p3);
dbug: 2022-07-10 19:54:25.909 RelationalEventId.TransactionCommitted[20202] (Microsoft.EntityFrameworkCore.Database.Transaction)
      Committed transaction.

Note that this roundtrip-removing optimization speeds up some other scenarios as well, including the table-per-type (TPT) inheritance mapping strategy, cases where rows are deleted and inserted into the same table in a single SaveChanges call, and some others.

Closing words

In this blog post, we’ve gone over three optimizations in the EF Core 7.0 update pipeline:

Omit the transaction when only one statement is being executed via SaveChanges (reduction of two roundtrips).
Optimize SQL Server’s multiple row insertion technique to stop using a temporary variable.
Remove unneeded roundtrips related to insertion of a principal and dependent in the same SaveChanges call, and some other scenarios.

We believe these are impactful improvements, and hope that they’ll benefit your application. Please share your experience, good or bad!

Prerequisites

EF7 currently targets .NET 6.
EF7 will not run on .NET Framework.

EF7 is the successor to EF Core 6.0, not to be confused with EF6. If you are considering upgrading from EF6, please read our guide to port from EF6 to EF Core.

How to get EF7 previews

EF7 is distributed exclusively as a set of NuGet packages. For example, to add the SQL Server provider to your project, you can use the following command using the dotnet tool:

dotnet add package Microsoft.EntityFrameworkCore.SqlServer --version 7.0.0-preview.6.22329.4

This following table links to the preview 6 versions of the EF Core packages and describes what they are used for.

Package	Purpose
Microsoft.EntityFrameworkCore	The main EF Core package that is independent of specific database providers
Microsoft.EntityFrameworkCore.SqlServer	Database provider for Microsoft SQL Server and SQL Azure
Microsoft.EntityFrameworkCore.SqlServer.NetTopologySuite	SQL Server support for spatial types
Microsoft.EntityFrameworkCore.Sqlite	Database provider for SQLite that includes the native binary for the database engine
Microsoft.EntityFrameworkCore.Sqlite.Core	Database provider for SQLite without a packaged native binary
Microsoft.EntityFrameworkCore.Sqlite.NetTopologySuite	SQLite support for spatial types
Microsoft.EntityFrameworkCore.Cosmos	Database provider for Azure Cosmos DB
Microsoft.EntityFrameworkCore.InMemory	The in-memory database provider
Microsoft.EntityFrameworkCore.Tools	EF Core PowerShell commands for the Visual Studio Package Manager Console; use this to integrate tools like scaffolding and migrations with Visual Studio
Microsoft.EntityFrameworkCore.Design	Shared design-time components for EF Core tools
Microsoft.EntityFrameworkCore.Proxies	Lazy-loading and change-tracking proxies
Microsoft.EntityFrameworkCore.Abstractions	Decoupled EF Core abstractions; use this for features like extended data annotations defined by EF Core
Microsoft.EntityFrameworkCore.Relational	Shared EF Core components for relational database providers
Microsoft.EntityFrameworkCore.Analyzers	C# analyzers for EF Core

We also published the 7.0 preview 6 release of the Microsoft.Data.Sqlite.Core provider for ADO.NET.

Installing the EF7 Command Line Interface (CLI)

Before you can execute EF7 Core migration or scaffolding commands, you’ll have to install the CLI package as either a global or local tool.

To install the preview tool globally, install with:

dotnet tool install --global dotnet-ef --version 7.0.0-preview.6.22329.4

If you already have the tool installed, you can upgrade it with the following command:

dotnet tool update --global dotnet-ef --version 7.0.0-preview.6.22329.4

It’s possible to use this new version of the EF7 CLI with projects that use older versions of the EF Core runtime.

Daily builds

EF7 previews are aligned with .NET 7 previews. These previews tend to lag behind the latest work on EF7. Consider using the daily builds instead to get the most up-to-date EF7 features and bug fixes.

As with the previews, the daily builds require .NET 6.

The .NET Data Community Standup

The .NET data team is now live streaming every other Wednesday at 10am Pacific Time, 1pm Eastern Time, or 17:00 UTC. Join the stream to ask questions about the data-related topic of your choice, including the latest preview release.

Watch our YouTube playlist of previous shows
Visit the .NET Community Standup page to preview upcoming shows
Submit your ideas for a guest, product, demo, or other content to cover

Documentation and Feedback

The starting point for all EF Core documentation is docs.microsoft.com/ef/.

Please file issues found and any other feedback on the dotnet/efcore GitHub repo.

Helpful Links

The following links are provided for easy reference and access.

Thank you from the team

A big thank you from the EF team to everyone who has used and contributed to EF over the years!

Welcome to EF7.

Author

Shay Rojansky

Software engineer working on .NET data access and performance, member of the Entity Framework team. Lead dev of Npgsql, the PostgreSQL provider. Die-hard Linux command-line dude.

22 comments

Discussion is closed. Login to edit/delete existing comments.

Cole Larson August 2, 2022

I just moved to preview 6. Scaffolding from an existing DB is now annoyingly changing Name capitalization in the model For example my DB Named BBSTempEntity now generates a context called BbstempContext . Is this a planned behavior?
- Shay Rojansky Author August 6, 2022
  
  Can you please file an issue on the EF Core repo?
Bart Koelman July 16, 2022

Too bad that many-to-many relationships are broken (first time in a patch release, providing a back-compat switch), but now the switch has been removed in EF 7. This makes it impossible for us to update our project (572 stars,145 forks, 389k downloads) to EF 7. I’ve been asking for a fix/workaround five months ago (see https://github.com/dotnet/efcore/issues/27436), but still no solution.
- Bart Koelman August 6, 2022
  
  Update: A solution has been provided by the EF Core team (see linked issue), which unblocks us from upgrading.
Eli Black July 14, 2022

Very exciting! 🙂
Eli Black July 14, 2022

The article mentions “The update pipeline improvements in .NET 7.0 are quite different.” Should that be “EF Core 7” instead of “.NET 7”?
- Shay Rojansky Author July 15, 2022
  
  Thank you for catching that! Fixed.
  - Eli Black July 18, 2022
    
    Thanks! 🙂
Michael Stancombe July 14, 2022

Interesting Read! On the subject of bulk inserts, I recall that SQL will cache query plans based on the query text and hence creating a differing query text between inserting, say 3 blogs and 5 blogs would mean a new execution plan calculation per number of items inserted, which might be some extra overhead you could attack.
If you migrated a table value parameter type for the entities, you could achieve this using a static sql expression. Not sure if that’s feasible given the permutations of SQL that partial entity updates emit, but were table valued parameters investigated?
- Shay Rojansky Author July 15, 2022
  
  Your point on the query plans is spot on. The problem with table-value parameters is that they require a preexisting type definition on the SQL Server side for types which will get imported – that’s not really appropriate for EF Core, and is the main reason we haven’t gone down that route.
  
  I also explored batching together multiple single-row INSERTs, which wouldn’t have this problem (since each INSERT is identical), but unfortunately the perf there is far inferior to a single MERGE (or to a single multi-row INSERT, for that matter).
Ryan Lamansky July 14, 2022

It’s not well known but INSERT can do multiple records in a single command, too: INSERT MyTable VALUES (row1c1, row1c2), (row2c1, row2c2).

I wonder if that would outperform some of these merges by virtue of being simpler / easier for SQL to parse. Obviously not as flexible, though.
- Shay Rojansky Author July 14, 2022
  
  Yes, INSERT with multiple records is simple and performs relatively well. The problem with that is that we need to get the db-generated IDs back in order to populate them in the .NET instances (that's the OUTPUT clause you see above); unfortnuately INSERT doesn't provide any guarantee on the ordering of the IDs or which rows they matched to - you just get a bunch of IDs back and have no idea what they map to.
  
  The MERGE syntax above allows us to roundtrip a synthetic _Position column which allows us to match the ID with the Blog instance we inserted;...
  Read more
  Yes, INSERT with multiple records is simple and performs relatively well. The problem with that is that we need to get the db-generated IDs back in order to populate them in the .NET instances (that’s the OUTPUT clause you see above); unfortnuately INSERT doesn’t provide any guarantee on the ordering of the IDs or which rows they matched to – you just get a bunch of IDs back and have no idea what they map to.
  
  The MERGE syntax above allows us to roundtrip a synthetic _Position column which allows us to match the ID with the Blog instance we inserted; that’s the only reason MERGE is being used here over INSERT.
  
  Read less
Gábor Szabó July 13, 2022

Very cool improvements! It's great to know EF is better at simple things too than simple developers like how I would do it.

I noticed you said "isn't much we can do about it" when talking about inserting a dependent entry with its principal. Isn't it possible to write the second statement in a way that references the inserted ID from the first (maybe something like )? That would make it possible (in some very common scenarios at least, like in the example) to bypass the roundtrip in this case as well, just by sending the two commands not...
Read more
Very cool improvements! It’s great to know EF is better at simple things too than simple developers like how I would do it.

I noticed you said “isn’t much we can do about it” when talking about inserting a dependent entry with its principal. Isn’t it possible to write the second INSERT statement in a way that references the inserted ID from the first INSERT (maybe something like scope_identity())? That would make it possible (in some very common scenarios at least, like in the example) to bypass the roundtrip in this case as well, just by sending the two commands not unlike you show in the next sections.

Read less
- Shay Rojansky Author July 13, 2022
  
  Yeah, there are some tricks we can do in very specific scenarios. One problem with your scope_identity proposal is that once you insert a 2nd dependent, its own IDENTITY would overwrite the principal’s; so it would only for a single principal/dependent pair. It’s certainly a direction to explore though.
  - Троепольский Алексей July 30, 2022 · Edited
    
    We can put Inserted.Id and position to temporary table. And then select generated Id from temporary table by position.
    
    That allow us to implement batching with multiple parent and multiple dependent inserts
  - Shay Rojansky Author July 30, 2022
    
    Yes, this is what EF Core does prior to version 7.0; the problem is that involving the temporary table adds a very heavy performance overhead. This is exactly why the optimization was done to avoid it in EF Core 7.0.
    
    Benchmark results are available here, I’d suggest carefully benchmarking the cost of using the temporary table (OUTPUT INTO) vs. not using (OUTPUT without INTO).
Russell Hart July 12, 2022

Nice improvements however a breaking change for triggers will affect a lot of systems in my experience. Have we considered moving the begin/commit transaction statements into the same batch?
This should no longer break for triggers and still eliminate the network round trips.
- Shay Rojansky Author July 13, 2022
  
  The trigger problem is related to the OUTPUT clause used with e.g. INSERT or MERGE; I'm not sure how moving the begin/commit transaction statements would help with that - these would be orthogonal optimizations.
  
  I've looked into batching begin/commit instead of doing separate roundtrips, but there are some tricky aspects; for example, a concurrency failure is only detected at the client by EF Core, and should trigger a rollback, but if the commit is batched than the SaveChanges has already been committed and cannot be rolled back. Though it's still something I want to think about and possibly improve in the...
  Read more
  The trigger problem is related to the OUTPUT clause used with e.g. INSERT or MERGE; I’m not sure how moving the begin/commit transaction statements would help with that – these would be orthogonal optimizations.
  
  I’ve looked into batching begin/commit instead of doing separate roundtrips, but there are some tricky aspects; for example, a concurrency failure is only detected at the client by EF Core, and should trigger a rollback, but if the commit is batched than the SaveChanges has already been committed and cannot be rolled back. Though it’s still something I want to think about and possibly improve in the future.
  
  Read less
  - Gary July 14, 2022
    
    Batching the begin transaction only might still be worth it especially for systems on the cloud. The commit could be handled as a round trip.
  - Shay Rojansky Author July 15, 2022
    
    Very true. The Npgsql provider already does this at the ADO.NET layer, so DbConnection.BeginTransaction() doesn’t actually do anything; the next command to actually be executed is preceded by the BEGIN. I’ve opend an issue for SqlClient to do the same – it’s always better to do this kind of optimization at the lowest possible layer, this way everyone benefits (not just EF users).
Rand Random July 12, 2022

So, the explained performance improvements are Microsoft.EntityFrameworkCore.SqlServer only, and do not apply for any other provider?
- Shay Rojansky Author July 13, 2022
  
  The 2nd optimization (bulk update) is indeed only for SQL Server, but the other two (the roundtrip reduction) do apply for other providers.