Introducing TransactionalBatch in the Azure Cosmos DB .NET SDK
Recently we talked about the new Bulk support introduced in the .NET SQL SDK, but along with it, another great feature was also released: support for TransactionalBatch.
Wait, Bulk? Batch? What’s the difference?
Bulk describes scenarios that require high throughput to process a large volume of point operations. These operations can succeed or fail independently of each other.
TransactionalBatch describes a group of point operations that need to succeed or fail together. If all operations succeed, in the order in which they are described in the TransactionalBatch, the transaction is committed. If any operation fails, the entire transaction is rolled back.
So, what is a transaction for Cosmos DB?
A transaction in a typical database can be defined as a sequence of operations performed as a single logical unit of work. Each transaction provides ACID (Atomicity, Consistency, Isolation, Durability) property guarantees.
- Atomicity guarantees that all the operations done inside a transaction are treated as a single unit, and either all of them are committed or none of them are.
- Consistency makes sure that the data is always in a valid state across transactions.
- Isolation guarantees that no two transactions interfere with each other – many commercial systems provide multiple isolation levels that can be used based on the application needs.
- Durability ensures that any change that is committed in a database will always be present.
Azure Cosmos DB supports fully ACID-compliant transactions with snapshot isolation for operations within the same logical partition key.
Got it. How do I create a TransactionalBatch now?
Creating a TransactionalBatch is a very descriptive operation: you start from a Container instance, call CreateTransactionalBatch, and chain the operations you want to include.
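A minimal sketch of building a batch could look like the following. ParentClass, ChildClass, the BatchBuilder helper, and the sample values are hypothetical names chosen for illustration, and the container is assumed to use /partitionKey as its partition key path. The JsonProperty attributes assume the SDK's default Newtonsoft.Json serializer.

```csharp
using Microsoft.Azure.Cosmos;
using Newtonsoft.Json;

// Hypothetical item types, used only for illustration.
public class ParentClass
{
    [JsonProperty("id")] public string Id { get; set; }
    [JsonProperty("partitionKey")] public string PartitionKey { get; set; }
    public string Name { get; set; }
}

public class ChildClass
{
    [JsonProperty("id")] public string Id { get; set; }
    [JsonProperty("partitionKey")] public string PartitionKey { get; set; }
    public string ParentId { get; set; }
}

public static class BatchBuilder
{
    // Assumption: the container's partition key path is /partitionKey.
    public static TransactionalBatch BuildFamilyBatch(Container container)
    {
        const string partitionKey = "The Family";
        var parent = new ParentClass { Id = "The Parent", PartitionKey = partitionKey, Name = "John" };
        var child = new ChildClass { Id = "The Child", PartitionKey = partitionKey, ParentId = parent.Id };

        // Every operation in the batch targets the same logical partition key value.
        return container.CreateTransactionalBatch(new PartitionKey(partitionKey))
            .CreateItem(parent)
            .CreateItem(child);
    }
}
```

Nothing is sent to the service at this point; the batch only records the operations in the order they were added.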
When you are ready, call ExecuteAsync.
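As a small sketch, executing a batch is a single awaited call. The BatchRunner helper is a hypothetical wrapper for illustration; passing a CancellationToken is optional.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class BatchRunner
{
    // Sends every operation recorded in the batch to the service as one request.
    public static async Task<TransactionalBatchResponse> RunAsync(
        TransactionalBatch batch,
        CancellationToken cancellationToken = default)
    {
        return await batch.ExecuteAsync(cancellationToken);
    }
}
```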
When the response comes back, check whether it succeeded and then extract the individual results.
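One way to do this, sketched under the assumption that the batch contained exactly two operations (index 0 and index 1) whose item types are supplied by the caller. The BatchResults helper and its type parameters are hypothetical names for illustration.

```csharp
using Microsoft.Azure.Cosmos;

public static class BatchResults
{
    // Returns false if the batch failed; otherwise extracts the two resources.
    // Assumption: the batch held two operations, added in the order (TFirst, TSecond).
    public static bool TryReadResults<TFirst, TSecond>(
        TransactionalBatchResponse batchResponse,
        out TFirst first,
        out TSecond second)
    {
        first = default;
        second = default;

        if (!batchResponse.IsSuccessStatusCode)
        {
            return false;
        }

        // Results are indexed in the same order the operations were added to the batch.
        TransactionalBatchOperationResult<TFirst> firstResult =
            batchResponse.GetOperationResultAtIndex<TFirst>(0);
        TransactionalBatchOperationResult<TSecond> secondResult =
            batchResponse.GetOperationResultAtIndex<TSecond>(1);

        first = firstResult.Resource;
        second = secondResult.Resource;
        return true;
    }
}
```

TransactionalBatchResponse is disposable, so the caller would typically wrap it in a using block once the results have been read.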
In case of a failure, the failing operation will have the StatusCode of its corresponding error, while all the other operations will have a 424 StatusCode (Failed Dependency). So, it’s quite easy to identify the cause of the transaction failure.
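A sketch of how the failing operation might be located, relying on the fact that the response can be enumerated by index. The BatchDiagnostics helper is a hypothetical name for illustration.

```csharp
using Microsoft.Azure.Cosmos;

public static class BatchDiagnostics
{
    // Returns the index of the first operation that failed with its own error,
    // or -1 if no such operation is found (for example, when the batch succeeded).
    public static int FindFailedOperationIndex(TransactionalBatchResponse batchResponse)
    {
        for (int i = 0; i < batchResponse.Count; i++)
        {
            TransactionalBatchOperationResult result = batchResponse[i];

            // 424 (Failed Dependency) marks operations that did not fail themselves
            // but were rolled back because another operation failed.
            if (!result.IsSuccessStatusCode && (int)result.StatusCode != 424)
            {
                return i;
            }
        }

        return -1;
    }
}
```

The StatusCode of the result at the returned index then describes the actual cause, such as a conflict on an item that already exists.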
How is that TransactionalBatch executed?
When ExecuteAsync is called, all operations in the TransactionalBatch are grouped, serialized into a single payload, and sent as a single request to the Azure Cosmos DB service.
The service receives the request, executes all operations within a transactional scope, and returns a response using the same serialization protocol. The response indicates either success or failure and contains all the individual operation responses.
The SDK exposes the response for you to verify the result and, optionally, extract each of the internal operation results. Quite simple.
Why would I use TransactionalBatch?
Azure Cosmos DB also supports Stored Procedures, which provide transactional scope for their operations, so why would you use TransactionalBatch instead of Stored Procedures?
- Code versioning – Versioning application code and onboarding it on your CI/CD pipeline is much more natural than orchestrating the update of a Stored Procedure and making sure the rollover happens at the right time. It also makes rolling back changes much easier.
- Performance – We have seen latency reductions of up to 30% for equivalent operations when compared with a Stored Procedure execution.
- Content serialization – Each operation within a TransactionalBatch can leverage custom serialization options for its payload.
Are there any known limits?
Currently, there are three known limits:
- As per the Azure Cosmos DB request size limit, the size of the TransactionalBatch payload cannot exceed 2 MB.
- The maximum execution time is 5 seconds.
- There is a current limit of 100 operations per TransactionalBatch to make sure the performance is as expected and within SLAs.
If you want to try out TransactionalBatch, you can follow our sample. Please share any feedback on our official GitHub repository.
This looks really useful, thanks.
I’m wondering if we would be able to use it to replace a stored procedure we’re currently using, but to do so we need to get a value from one item, and then set a property in the second item to that value. I don’t think this is possible currently?
To give some background, we use a counter document to store the current highest “ID” for our collection of items. When we insert new items, via a stored procedure, we get the current value of this counter document, set the “ID” for the item we’re inserting and then increment the counter document – within a single transaction. (I realise this is a bad idea for various reasons, but we have no choice as we migrated to Cosmos DB from another database and the entire system is built around this legacy design decision).
As long as everything is in the same partition then yes, should work!
Thanks, I figured it should be possible but I wasn’t sure how to actually write something using the .NET API that executes a query, then uses the result of that in the next query in the transaction.
Do you have an example that shows how to do what Tom Robinson is asking? I need to take something read from one document and use it to link to another document in the transaction. I don’t see any examples where the Child references something from its parent.
Using a value from one operation as input of another within the same TransactionalBatch is not currently supported.