April 24th, 2023

Using Ledger in Azure SQL Database

Brian Spendolini
Senior Product Manager, Azure SQL Database

Transactions have to be protected and you need to know when something is altered or prevent altering of records from the start. Ledger, available in Azure SQL Database and SQL Server 2022, helps to protect your database and establish trust, accuracy and integrity in your data.

Ledger accomplishes this goal in a couple of ways. First, it provides you with a history of records in your database; if a row was updated or deleted in the database, its previous value is maintained and protected in a history table. This provides you with a chronicle of all changes made to the table over time. You can even enable this feature database wide so that all tables can have this historical information.

The second way Ledger helps is by creating insert only tables. Update and deletes cannot be done on these tables making them perfect for scenarios such as security information and event management (SIEM) applications. Both these features incorporate a blockchain that reside in the Azure SQL Database ledger. More on how Azure SQL Database uses blockchain at the end of this post.

For database and application developers, this is a very simple to enable feature that will help to create trust between the data and your stakeholders.

Create an Updatable Ledger Table

Important!

Creating ledger tables requires the ENABLE LEDGER permission.

Let’s create some of these tables and see how they work. We will start with an Updatable Ledger Table. To note, both updatable ledger tables and temporal tables are system-versioned tables; they are created with the SYSTEM_VERSIONING clause. When set, the database engine captures historical row versions in a secondary history table (SYSTEM_VERSIONING = ON (HISTORY_TABLE = [SCHEMA].[HISTORY_TABLE_NAME]). The main difference is that updatable ledger tables make both the current and historical data tamper evident.

The code to create this table is quite simple, just like a regular table, but with LEDGER = ON in the SYSTEM_VERSIONING clause. For this example, we can use a Video Game high scores table so that we can record who got what score on a particular game and know if someone went and inflated/changed their score.

CREATE TABLE [dbo].[HighScores]
(
    [ScoreID] INT IDENTITY,
    [PlayerID] INT NOT NULL,
    [FirstName] VARCHAR (50) NOT NULL,
    [LastName] VARCHAR (50) NOT NULL,
    [Game] VARCHAR (50) NOT NULL,
    [Score] INT NOT NULL,
    [Date] DATE NOT NULL
)
WITH
(
 SYSTEM_VERSIONING = ON (HISTORY_TABLE = [dbo].[HighScoresHistory]),
 LEDGER = ON
);

The database automatically adds four GENERATED ALWAYS columns that contain metadata noting which transactions made changes to the table and the order of operations by which rows were updated.

Default column name Data type Description
ledger_start_transaction_id bigint The ID of the transaction that created a row version
ledger_end_transaction_id bigint The ID of the transaction that deleted a row version
ledger_start_sequence_number bigint The sequence number of an operation within a transaction that created a row version
ledger_end_sequence_number bigint The sequence number of an operation within a transaction that deleted a row version

And we can see this in the table structure after the create table event (Azure Data Studio even marks the table as an Updatable Ledger table as well):

Image Screenshot 2023 04 20 at 1 08 58 PM

Working With the Table

Time to add some data to the table:

INSERT INTO [dbo].[HighScores]
VALUES (1, 'Steve', 'Wobble', 'Horse Monkey', 11099912, GETDATE()),
(2, 'Bill', 'Mickelson', 'Horse Monkey', 11099911, GETDATE()),
(3, 'John', 'Hill', 'Universe Invaders', 218870, GETDATE());

and look at the table with the generated columns:

select
   [ScoreID],
   [PlayerID],
   [FirstName],
   [LastName],
   [Game],
   [Score],
   [Date],
   [ledger_start_transaction_id],
   [ledger_end_transaction_id],
   [ledger_start_sequence_number],
   [ledger_end_sequence_number]
from [dbo].[HighScores];
For each updatable ledger table, there is a view created. For our HighScores table, the view [dbo].[HighScores_Ledger] was created.
Image Screenshot 2023 04 20 at 1 10 08 PM
This table has four special columns as well:
Default column name Data type Description
ledger_transaction_id bigint The ID of the transaction that created or deleted a row version.
ledger_sequence_number bigint The sequence number of a row-level operation within the transaction on the table.
ledger_operation_type tinyint Contains 1 (INSERT) or 2 (DELETE). Inserting a row into the ledger table produces a new row in the ledger view that contains 1 in this column. Deleting a row from the ledger table produces a new row in the ledger view that contains 2 in this column. Updating a row in the ledger table produces two new rows in the ledger view. One row contains 2 (DELETE), and the other row contains 1 (INSERT) in this column.
ledger_operation_type_desc nvarchar(128) Contains INSERT or DELETE. For more information, see the preceding row.
Now, what if someone went into the database and updated that row? And can the database tell us who? Join the ledger view to the sys.database_ledger_transactions table and get that “who”.
SELECT
 t.[commit_time] AS [CommitTime],
 t.[principal_name] AS [UserName],
 h.[ScoreID],
 h.[PlayerID],
 h.[FirstName],
 h.[LastName],
 h.[Game],
 h.[Score],
 h.[Date],
 h.[ledger_operation_type_desc] AS Operation
 FROM [dbo].[HighScores_Ledger] h
 JOIN sys.database_ledger_transactions t
 ON t.transaction_id = h.ledger_transaction_id
 ORDER BY t.commit_time DESC;
Image Screenshot 2023 04 20 at 1 18 13 PM

Here Comes Bill

I am currently logged into the database as sqladmin and it shows because I created those inserts. What if someone else were to do this with another account, maybe Bill?
(Bill is using the new Azure Data Studio Create User feature!)
Image Screenshot 2023 04 19 110919
Then, Bill does an update to the table:
UPDATE [dbo].[HighScores]
    SET Score = 11099913
    WHERE ScoreID = 2;
What’s the ledger view look like now?
Image Screenshot 2023 04 20 at 1 21 29 PM
In the blue box, we see that Bill performed the record change. In the green box, we see the old record marked as a delete and in the orange box, the new updated record.
Why a delete? From the docs:
Contains 1 (INSERT) or 2 (DELETE). Inserting a row into the ledger table produces a new row in the ledger view that contains 1 in this column. Deleting a row from the ledger table produces a new row in the ledger view that contains 2 in this column. Updating a row in the ledger table produces two new rows in the ledger view. One row contains 2 (DELETE), and the other row contains 1 (INSERT) in this column.

 

Create an Append-only Ledger Table

Append-only ledger tables allow INSERT operations on the table; DELETES and UPDATES cannot be done. This is great for use cases such as recoding entry into a building or room with a key card; any scenario where you need to insert events as they happen.

Creating one is similar to the Updatable table, but with the following clause: WITH (LEDGER = ON (APPEND_ONLY = ON)). For this next example, we can imagine a special room where these arcade games are kept and only by using a special ID card, can you get access to play them for high scores. You need to use a badge to enter and exit this room.

CREATE TABLE [dbo].[ArcadeRoomAccess]
   (
      [AccessID] INT IDENTITY,
      [PlayerID] INT NOT NULL,
      [AccessEvent] NVARCHAR (100) NOT NULL,
      [Timestamp] Datetime2 NOT NULL
   )
   WITH (LEDGER = ON (APPEND_ONLY = ON));
Again, just like the Updatable table, the database automatically adds GENERATED ALWAYS columns. Here we only have two:
Default column name Data type Description
ledger_start_transaction_id bigint The ID of the transaction that created a row version
ledger_start_sequence_number bigint The sequence number of an operation within a transaction that created a row version

Which we can see in Azure Data Studio by looking at the table’s columns:

Image Screenshot 2023 04 19 123540

A view is also created:

Image Screenshot 2023 04 19 123522

Now, let’s insert some data into this table:

INSERT INTO [dbo].[ArcadeRoomAccess]
VALUES(1,'IN',DATEADD(HOUR, -5,getdate())),
(1,'OUT',DATEADD(HOUR, -3,getdate())),
(2,'IN',DATEADD(HOUR, -15,getdate())),
(2,'OUT',DATEADD(HOUR, -10,getdate()));

And we can see the ledger information with the views and generated columns with the following SQL query:

SELECT
t.[commit_time] AS [CommitTime],
t.[principal_name] AS [UserName],
h.[AccessID],
h.[PlayerID],
h.[AccessEvent],
h.[Timestamp],
h.[ledger_operation_type_desc] AS Operation
FROM [dbo].[ArcadeRoomAccess_Ledger] h
JOIN sys.database_ledger_transactions t
ON t.transaction_id = h.ledger_transaction_id
ORDER BY t.commit_time DESC;

Update the Un-Updatable

Trying an UPDATE or DELETE results in the following:

UPDATE [dbo].[ArcadeRoomAccess]
SET Timestamp = GETDATE()
WHERE AccessID = 4;
and we see the following error:
Msg 37359, Level 16, State 1, Line 1
Updates are not allowed for the append only Ledger table 'dbo.ArcadeRoomAccess'.

Database Ledger

Taking a step back, how does the database use blockchain with Ledger? To start, any rows modified by a transaction (insert, update, or delete) in a table that is ledger enabled are cryptographically SHA-256 hashed using a Merkle tree data structure. This event also creates a root hash representing all rows in the database transaction. At a set interval, transactions are also SHA-256 hashed together through a Merkle tree data structure resulting in a root hash that forms a block. This sets up a process for the block to be SHA-256 hashed through the root hash of the block, along with the root hash of the previous block thus forming a blockchain.

Putting these two processes together, we have the following flow in the database when a transaction occurs:

  • Persist the earlier version of the row in the history table.
  • Assign the transaction ID and generate a new sequence number, persisting them in the appropriate system columns.
  • Serialize the row content and include it when computing the hash for all rows updated by this transaction.

You can view the information about these blocks in two system catalog views:

  • sys.database_ledger_transactions: Maintains a row with the information of each transaction in the database ledger. T
  • sys.database_ledger_blocks: Maintains a row for every block in the ledger, including the root of the Merkle tree over the transactions within the block and the hash of the previous block to form a blockchain.

To view the database ledger, run the following statements in your database.

Important!

Querying these views requires the VIEW LEDGER CONTENT permission

SELECT * FROM sys.database_ledger_transactions;
GO

SELECT * FROM sys.database_ledger_blocks;
GO

Here is what it looks like in the test database i’ve been using for this post:

Image Screenshot 2023 04 20 at 12 34 50 PM

Database Digests

Lastly, the hash of the latest block in a database ledger is called the database digest. These digests are JSON docs you can generate from T-SQL via the following command:

EXECUTE sp_generate_database_ledger_digest;

But even better, you have the option of having these JSON files placed in immutable storage feature of Azure Blob Storage with Azure Confidential Ledger. This helps to protect these digests from potential tampering. Setting this up is simple because you can configure automatic generation and storage of database digests through the Azure portal, PowerShell, or the Azure CLI.

Image Screenshot 2023 04 20 at 12 42 43 PM

There is even a built in Database ledger verification process that scans all ledger and history tables and discovers evidence of data tampering.

Summary

As you can see, Ledger in Azure SQL Database is a very powerful feature that can help you ensure that the data in your database is correct and has not been altered. Enable it via T-SQL or right in the Azure Portal so that all new tables are ledger enabled by default.

Here are some helpful links for more information on Ledger:

Author

Brian Spendolini
Senior Product Manager, Azure SQL Database

3 comments

Discussion is closed. Login to edit/delete existing comments.

  • Wilfred van Dijk

    Ledger is great for registering data which cannot be modified, but I only want to store this data for 1 year. Is this combination possible? (not updatable and not deleteable, but keep the data for # days)

    • Pieter VanhoveMicrosoft employee

      Hi Wilfred,

      This is a common question that we get from our customers. We’re currently looking into the different possibilities to address this problem.

      Pieter

    • Brian SpendoliniMicrosoft employee Author

      Unfortunetly, removing data after a set amount of time would break the block chain. You can review some of the limitations here.

      If you needed more flexibility, temporal tables might fit your use case? More on those here if you have not used them yet.

      Read more