{"id":715,"date":"2020-12-11T09:39:48","date_gmt":"2020-12-11T17:39:48","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=715"},"modified":"2024-03-15T08:53:03","modified_gmt":"2024-03-15T15:53:03","slug":"programmatically-parsing-transact-sql-t-sql-with-the-scriptdom-parser","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/programmatically-parsing-transact-sql-t-sql-with-the-scriptdom-parser\/","title":{"rendered":"Programmatically parsing Transact SQL (T-SQL) with the ScriptDom parser"},"content":{"rendered":"<p>.NET Developers are perhaps familiar with libraries like <a href=\"https:\/\/github.com\/dotnet\/roslyn\" target=\"_blank\" rel=\"noopener noreferrer\">Roslyn<\/a> and <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/framework\/reflection-and-codedom\/using-the-codedom\" target=\"_blank\" rel=\"noopener noreferrer\">CodeDOM<\/a>, which allow in-depth analysis of code (amongst many other things). Java developers use tools like <a href=\"https:\/\/github.com\/spotbugs\/spotbugs\" target=\"_blank\" rel=\"noopener noreferrer\">SpotBugs<\/a> for code analysis. As a SQL developer, perhaps you might have wondered if there are any equivalent functionality for Azure SQL and SQL Server code? Well, wonder no more!<\/p>\n<p>Over the years, I have leveraged the very useful <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/microsoft.sqlserver.transactsql.scriptdom.tsqlparser?view=sql-dacfx-150\" target=\"_blank\" rel=\"noopener noreferrer\">TransactSql ScriptDom .NET library<\/a>, to programmatically parse Transact-SQL (T-SQL) statements, expressions, and scripts. This library accurately produces an Abstract Syntax Tree (AST) representation of the T-SQL code. Once you have such an AST, you can \u201cwalk\u201d the tree to analyze it for various code patterns. As an example, you can use it to <a href=\"https:\/\/github.com\/arvindshmicrosoft\/SQLScriptDomSamples\/blob\/b9105106319718d1dffb5ae3cca4482d892c518d\/4_TSQLValidation\/Form1.cs#L60-L70\" target=\"_blank\" rel=\"noopener noreferrer\">detect in a robust way<\/a> if you have any DML statement other than SELECT \u2013 which may be useful to detect some forms of SQL injection. You can even change the AST if needed and produce a modified T-SQL script with the re-written tree. The <a href=\"#use-cases\">possibilities<\/a> are endless!<\/p>\n<h2 class=\"wp-block-heading\" id=\"use-cases\">Use Cases<\/h2>\n<h2>\n<\/h2>\n<p>I am aware of several interesting use cases where this parser has been leveraged:<\/p>\n<ul>\n<li>A customer uses <a href=\"https:\/\/docs.microsoft.com\/en-us\/azure\/azure-sql\/database\/auditing-overview\" target=\"_blank\" rel=\"noopener noreferrer\">Azure SQL auditing<\/a>, coupled with the <a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.SqlServer.XEvent.XELite\/\" target=\"_blank\" rel=\"noopener noreferrer\">XELite .NET library<\/a>, to read XEL audit files containing T-SQL text. The T-SQL text is then parsed to get analytics on which tables, views and columns are queried most frequently.<\/li>\n<li><a href=\"https:\/\/www.linkedin.com\/in\/sqlbobt\/\" target=\"_blank\" rel=\"noopener noreferrer\">boB Taylor<\/a> uses ScriptDom in his <a href=\"https:\/\/github.com\/sqlbobt-zz\/SqlCommandFilters\" target=\"_blank\" rel=\"noopener noreferrer\">SqlCommandFilters<\/a> project to auto-parameterize T-SQL statements so that they could be used in the context of Always Encrypted.<\/li>\n<li>The <a href=\"https:\/\/github.com\/keif888\/SQLServerMetadata\" target=\"_blank\" rel=\"noopener noreferrer\">SQL Metadata Toolkit<\/a> uses ScriptDom to find and map dependencies across the SQL code base.<\/li>\n<li>The <a href=\"https:\/\/github.com\/aeslinger0\/sqlsharpener\/wiki\/Examples\" target=\"_blank\" rel=\"noopener noreferrer\">SQL Sharpener<\/a> project uses ScriptDom to parse T-SQL and then generate C# wrappers for those stored procedures.<\/li>\n<li><a href=\"https:\/\/github.com\/HealthCatalyst\/hcposh\" target=\"_blank\" rel=\"noopener noreferrer\">HealthCatalyst &#8216;HCPosh&#8217; Powershell module<\/a>, use the parser to programmatically understand the structure of SQL statements.<\/li>\n<li>Last but definitely not the least, many <a href=\"#code-analysis\">T-SQL static code analysis<\/a> projects use ScriptDom!<\/li>\n<\/ul>\n<p>In addition, members of the SQL community have some great articles showing practical usage of ScriptDom \/ SQLDOM to do useful tasks:<\/p>\n<ul>\n<li>Mala Mahadevan has an <a href=\"https:\/\/www.sqlservercentral.com\/steps\/stairway-to-scriptdom-level-2-parsing-tokens-and-the-abstract-syntax-tree\">excellent series<\/a> on the ScriptDom<\/li>\n<li>Erik Ejlskov Jensen has a <a href=\"https:\/\/erikej.github.io\/dacfx\/codeanalysis\/sqlserver\/2024\/04\/02\/dacfx-codeanalysis.html\" target=\"_blank\" rel=\"noopener\">great post<\/a> showing how to leverage ScriptDom internally for T-SQL code analysis in a Database Project.<\/li>\n<li><a href=\"https:\/\/sqlperformance.com\/2020\/09\/sql-performance\/paramparser-1\" target=\"_blank\" rel=\"noopener noreferrer\">Parse parameter default values using PowerShell \u2013 Part 1 \u2013 SQLPerformance.com<\/a><\/li>\n<li><a href=\"https:\/\/www.dbdelta.com\/microsoft-sql-server-script-dom\/\" target=\"_blank\" rel=\"noopener noreferrer\">Microsoft SQL Server Script DOM | Dan Guzman\u2019s Blog (dbdelta.com)<\/a><\/li>\n<li><a href=\"https:\/\/michaeljswart.com\/2014\/04\/removing-comments-from-sql\/\" target=\"_blank\" rel=\"noopener noreferrer\">Removing Comments from SQL | Michael J. Swart<\/a><\/li>\n<li><a href=\"https:\/\/the.agilesql.club\/2015\/11\/how-to-get-started-with-the-scriptdom\/\" target=\"_blank\" rel=\"noopener noreferrer\">How to get started with the ScriptDom \u00b7 The Agile SQL Club<\/a><\/li>\n<\/ul>\n<p>Forgive me if I missed some useful examples \u2013 my attempt was not to list all possible related ones, just recent ones I am familiar with. Please do leave comments if you feel there is a relevant example which should be linked.<\/p>\n<p>Within the realm of official Microsoft releases, ScriptDom is also one of the core building blocks for <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/microsoft.sqlserver.dac\" target=\"_blank\" rel=\"noopener noreferrer\">DacFx<\/a>\u00a0and associated tooling like\u00a0<a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/tools\/sqlpackage\" target=\"_blank\" rel=\"noopener noreferrer\">SqlPackage<\/a>. It is also part of the <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/powershell\/sql-server-powershell\" target=\"_blank\" rel=\"noopener noreferrer\">SqlServer PowerShell module<\/a>, and many other projects, both at Microsoft, and in the broader community. This library is immensely powerful, and I believe that all SQL developers should be aware of the power and scope it provides.<\/p>\n<h2>\n<\/h2>\n<h2 class=\"wp-block-heading\">Get Started!<\/h2>\n<p>If you are interested in learning about how you can use this library, <a href=\"https:\/\/github.com\/arvindshmicrosoft\/SQLScriptDomSamples\" target=\"_blank\" rel=\"noopener noreferrer\">my GitHub repo<\/a> has some quite simple examples to get started.\u00a0Once you add the necessary references to ScriptDom, you can instantiate a TSqlParser class specific to the target version of SQL that you are working with. Currently for Azure SQL and SQL Server 2019, that would be the <a href=\"https:\/\/docs.microsoft.com\/en-us\/dotnet\/api\/microsoft.sqlserver.transactsql.scriptdom.tsql150parser\" target=\"_blank\" rel=\"noopener noreferrer\">TSqlParser150<\/a> class:<\/p>\n<pre>            using (var rdr = new StringReader(textBox1.Text))<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 IList&lt;ParseError&gt; errors = null;<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 var parser = new TSql150Parser(true, SqlEngineType.All);<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 var tree = parser.Parse(rdr, out errors);<br \/>                ...<\/pre>\n<p>BTW, this parser now supports many constructs of the T-SQL syntax specific to Azure Synapse Analytics as well. For example, it can parse <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/t-sql\/statements\/create-table-as-select-azure-sql-data-warehouse\">CTAS<\/a> statements! Anyways, once the parser produces the AST, you can use the well-known <a href=\"https:\/\/en.wikipedia.org\/wiki\/Visitor_pattern\" target=\"_blank\" rel=\"noopener noreferrer\">Visitor pattern<\/a> to walk the tree. Usually this is done by implementing a class which can then store state as the tree is walked:<\/p>\n<pre>\u00a0\u00a0\u00a0 class MyVisitor : TSqlFragmentVisitor<br \/> \u00a0\u00a0 {<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 internal bool containsOnlySelects = true;<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 public override void Visit(TSqlStatement node)<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 if ((node as SelectStatement) is null)<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 {<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 containsOnlySelects = false;<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }<br \/><br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 base.Visit(node);<br \/> \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 }<br \/> \u00a0\u00a0 }<\/pre>\n<p>And then, in the main code, we call .Parse(), thereby creating an instance of the \u201cwalker\u201d class and walk the AST:<\/p>\n<pre>      MyVisitor checker = new MyVisitor();<br \/>      tree.Accept(checker);<br \/><br \/><\/pre>\n<h2>Video<\/h2>\n<p>To know more, you can examine the samples on my repo. You can also watch a detailed presentation on this topic done to the PASS AppDev virtual chapter session: <a href=\"https:\/\/www.youtube.com\/watch?v=CciVxRFXgH8\" target=\"_blank\" rel=\"noopener noreferrer\">T-SQL Swiss Knife using the ScriptDom T-SQL Parser by Arvind Shyamsundar<\/a>. And I have blogged in the past about usage of this parser on my personal blog, including using ScriptDom to format T-SQL etc.:\u00a0<a href=\"https:\/\/deep.data.blog\/tag\/sqldom\/\" target=\"_blank\" rel=\"noopener noreferrer\">Arvind Shyamsundar\u2019s technical blog<\/a><\/p>\n<h2 class=\"wp-block-heading\" id=\"code-analysis\">ScriptDom in SQL Projects (Code Analysis)<\/h2>\n<p>ScriptDom being a full-fidelity parser, is great for statically analyzing T-SQL code to proactively identify anti-patterns. Historically, we have had Code Analysis capabilities within Visual Studio SQL Projects. Visual Studio SQL Projects allow you to <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/ssdt\/overview-of-extensibility-for-database-code-analysis-rules\">extend the built-in code analysis rules<\/a> and implement your own custom code analysis rules. Here are some examples of OSS projects for custom code analysis within Visual Studio SQL Projects (earlier known as &#8220;SSDT&#8221;). They all leverage ScriptDom:<\/p>\n<ul>\n<li><a href=\"https:\/\/github.com\/arvindshmicrosoft\/DACExtensions\/tree\/master\/RuleSamples\" target=\"_blank\" rel=\"noopener noreferrer\">DACExtension Rule Samples<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/davebally\/TSQL-Smells\" target=\"_blank\" rel=\"noopener noreferrer\">TSQL-Smells<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/tcartwright\/SqlServer.Rules\" target=\"_blank\" rel=\"noopener noreferrer\">Rules<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/colindooley11\/SSDT-Code-Analysis\/tree\/Branch-Test\/Carnage.Core.Rules.Data\/Carnage.Core.Rules.Data\/Rules\" target=\"_blank\" rel=\"noopener noreferrer\">SSDT-Code-Analysis<\/a><\/li>\n<\/ul>\n<p>BTW, SQL Projects is recently also available as an <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/azure-data-studio\/extensions\/sql-database-project-extension\" target=\"_blank\" rel=\"noopener noreferrer\">preview extension for Azure Data Studio<\/a>.\u00a0Also, a related community-led project which allows developers to build DACPACs without Visual Studio is <a href=\"https:\/\/github.com\/rr-wfm\/MSBuild.Sdk.SqlProj\" target=\"_blank\" rel=\"noopener noreferrer\">rr-wfm\/MSBuild.Sdk.SqlProj<\/a>. While this project does not use ScriptDom directly, it shows how to leverage the broader DacFx library.<\/p>\n<p>Again, forgive me if I missed some useful code analysis extension which uses ScriptDom. Please do leave comments if you feel there is a relevant post which should be linked.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p>The parser is immensely powerful if you are interested in programmatically parsing T-SQL. Since the parser is an official release from Microsoft, and largely based on the same grammar that the core SQL engine uses, it is the most full-fidelity choice for parsing T-SQL currently available. And best of all it is freely available, as part of the\u00a0<a href=\"https:\/\/www.nuget.org\/packages\/Microsoft.SqlServer.DacFx\">DacFx NuGet<\/a>!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Azure SQL developers have access to a full-fidelity, highly accurate, and easy-to-use client-side parser for T-SQL statements: the TransactSql.ScriptDom parser. This blog post walks through basic usage, and links to a number of resources for digging deeper. <\/p>\n","protected":false},"author":32627,"featured_media":81,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[444,1,92],"tags":[465,30,466,34],"class_list":["post-715","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-net","category-azure-sql","category-devops","tag-azuresql","tag-developers","tag-modernsql","tag-t-sql"],"acf":[],"blog_post_summary":"<p>Azure SQL developers have access to a full-fidelity, highly accurate, and easy-to-use client-side parser for T-SQL statements: the TransactSql.ScriptDom parser. This blog post walks through basic usage, and links to a number of resources for digging deeper. <\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/715","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/32627"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=715"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/715\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/81"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=715"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/categories?post=715"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=715"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}