FSharp.Data.SqlClient: Seamlessly integrating T-SQL and F# in the same code [Guest Post]

 

This week's post is a guest article writtten by Dmitry Sevastianov with help from Dmitry Morozov, two members of the F# community, about FSharp.Data.SqlClient, a fascinatingly simple and elegant technique to integrate T-SQL and F# in new ways, using the ever-awesome power of F# type providers.

First, be warned - this post is NOT about LINQ! It is about how to do LINQ-like things, but using a technique that is much simpler yet better in certain ways, though with pros and cons. It is also an industry-relevant example of a technique for compile-time checking of embedded DSLs that has applicability well beyond SQL-data programming.

Usual disclaimer: Note this is a guest article about a community-provided library, and the content below doesn't represent the views of Microsoft or the Visual F# Tools team.

Enjoy!

Don Syme, for the Visual F# Tools Team

FSharp.Data.SqlClient: Bridging the gap between F# and T-SQL scripting via F# Type Providers

In this post, we will explore the FSharp.Data.SqlClient library, and in particiular the SqlCommandProvider, which allows a new way of using both SQL syntax and F# syntax in data scripting.

Nearly all readers will be familiar with LINQ from either C# or F#. LINQ queries were integrated in F# in F# 3.0, expecially through Query Expressions. These are a lovely and useful technology. LINQ queries make writing data access code a pleasant exercise. LINQ in F# builds upon both query expressions and the existing F# type providers for databses, used to expose structured data in F# code. There are several different implementations of LINQ for SQL data available for F#, for example SQLProvider (a recent powerful F# community effort, based on direct ADO.NET, avoiding dependencies on Entity Framework, and usable with many different database technologies), SqlDataConnection (based on the Linq2Sql framework) and SqlEntityConnection (based on Entity Framework).

However, here we would like to point out some of the reasons why the alternative approach to query processing offered by the new FSharp.Data.SqlClient is in some circumstances a better solution to the woes of .Net developers. 

The Problem with LINQ - The Dreaded "Unsupported Query Translation" at Runtime

First, a simple observation: StackOverflow and MSDN has hundreds of issues like this one or this one for all kinds of ORM frameworks from NHibernate to LINQ-to-SQL to LINQ-to-Entities. What seems like perfectly valid code fails atrun-time because of an unsupported C#-to-SQL translation semantics. To quote one response to one of these queries:

LINQ to SQL does not know how to translate a call your method 'IsValid' into SQL in order to execute the where clause on the server. Your method does not exist on the server.

or the following runtime error message:

LINQ to Entities does not recognize the method 'Double Sqrt(Double)' method, and this method cannot be translated into a store expression

This is a great example of "leaky abstraction". Further, we all know intuitively that runtime errors like these should be avoided in strongly-typed languages: these conditions are quite "obviously" checkable at compile-time. The lack of control and opaqueness of the C#-to-SQL conversion spells performance problems as well, like infamous "N+1" issue where performance of database queries can be affected based on small changes to code.

Readers are no doubt familiar with a popular blog post The Vietnam of Computer Science by Ted Neward going deep into so-called object-relational impedance mismatch, which is at the core of these issues. So far one industry answer to this was a number of so-called micro-ORMs with a mission of making conversion from database types to .NET objects as simple as possible while refraining from making any assumptions about the actual mapping.

What all of them lack, however, is an ability to verify correctness of SQL queries at compile-time.

And that's where SqlCommandProvider, the core element of the FSharp.Data.SqlClient library, really shines. Essentially, it offers "What You See Is What You Get" for SQL queries. Once F# code involving SqlCommandProvider passes the compilation stage you are guaranteed to have valid executable code for both F# and T-SQL.

Here is a typical snippet of SqlCommandProvider-enabled code:

open FSharp.Data

 

[<Literal>]

let connectionString = @"Data Source=(LocalDb)\v11.0;Initial Catalog=AdventureWorks2012;Integrated Security=True"

 

[<Literal>]

let query = "

   SELECT TOP(@TopN) FirstName, LastName, SalesYTD

   FROM Sales.vSalesPerson

   WHERE CountryRegionName = @regionName AND SalesYTD > @salesMoreThan

   ORDER BY SalesYTD

"

type SalesPersonQuery = SqlCommandProvider<query, connectionString>

let cmd = new SalesPersonQuery()

 

cmd.AsyncExecute(TopN = 3L, regionName = "United States", salesMoreThan = 1000000M)

    |> Async.RunSynchronously

 

//output

seq

[("Pamela", "Ansman-Wolfe", 1352577.1325M);

("David", "Campbell", 1573012.9383M);

("Tete", "Mensa-Annan", 1576562.1966M)]

Now, if there is a typo of syntax, or column names, or if table names are incorrect anywhere in the query, the F# compiler notifies developer immediately. If the query is not legitimate, and would fail basic semantic checking on the database, the F# compiler notifies developer immediately. Likewise, if the database schema changes in ways that make the query incorrect, the F# compiler notifies developer immediately.  

That is, the F# compiler is checking the well-formedness of the SQL syntax at compile-time, through the use of the FSharp.Data.SqlClient type providers. More specifically, FSharp.Data.SqlClient uses features available in MS SQL Server 2012 and SQL Azure to check and compile the SQL query at compile time. As you program in F#, your query is being sent to the database and checked for validity, even as you type! Please see FSharp.Data.SqlClient documentation for more details about how this works.

Note that F# type providers are addins to the F# compiler and the F# development environments, used to provide this checking functionality.

Here is an example of an error being reported because of a problem in an SQL query:

 

As fast as the lightest

In this section, we take a quick look at performance. This is "just a taster" and shouldn't be taken as a full and rigorous analysis.

As we've mentioned, performance is another potential issue for language-integrated queries which rely on query translation via meta-programming (the technique used by LINQ). We'll see this below. However, in order to contrast with "lighter" approaches to the same problem, one comparison we make is with Dapper, an excellent micro-ORM by StackOverflow with a main goal of being extremely fast. It doesn't use the meta-programming techniques of LINQ to the same extent. Here is a description from StackOverflow itself:

Dapper is a micro-ORM, offering core parameterization and materialization services, but (by design) not the full breadth of services that you might expect in a full ORM such as LINQ-to-SQL or Entity Framework. Instead, it focuses on making the materialization as fast as possible, with no overheads from things like identity managers - just "run this query and give me the (typed) data".

Dapper comes with an excellent benchmark. The focus of the test is on deserialization.

Ideally we'd like the "direct" SQL queries above to perform as well as a micro-ORM like Dapper. And indeed, it seems this is the case, at least for simple benchmarks. Here is a summary of how FSharp.Data.SqlClient compares, on one benchmark, based on our tests:

  • Hand coded ADO.NET: 57ms
  • Dapper: 65ms
  • FSharp.Data.SqlCLient: 65ms
  • LINQ2Sql compiled: 92ms
  • LINQ2Sql ExecuteQuery: 215ms
  • Entiry Framework: 202ms

Tests were executed against SqlExpress 2012 so the numbers are a bit higher than what you can see on the Dapper page. The test retrieves a single record with 13 properties by random id 500 times and deserializes it. Trials for different ORMs are mixed up randomly. All executions are synchronous.

Note that we didn't put any specific effort into improving FSharp.Data.SqlClient performance for this test. The very nature of using directly written SQL means FSharp.Data.SqlClient uses a very simple run-time implementation, and is hence almost as close as possible to hand-coded ADO.NET code.

The lesson here is that performance of code that uses direct authoring of compile-time checked SQL syntax to express SQL queries can be very predictable and involve low overhead and translation complexity. FSharp.Data.SqlClient both checks your SQL queries at compile time, in a way that is much more rigorous than other SQL query translation techniques, and gives you a simple and clear path to understanding query performance, basically by avoiding the query translation process altogether.

Pros and Cons

We must keep in mind that FSharp.Data.SqlClient is not strictly an ORM in commonly understood sense of the term. Because of this, it is important to undertstand some of the pros and cons of the techniques we are using. Feel free to list fruther pros and cons in the comment section below!

  • Con: Because result types are auto-generated, FSharp.Data.SqlClient doesn't support so-called multi-mapping
  • Con: FSharp.Data.SqlClient is based on features specific for MS SQL Server 2012 and SQL Azure. Other ORMs can provide much wider range of supported scenarios.
  • Pro: On the other hand, FSharp.Data.SqlClient fully supports SqlServer-specific types like hierarchyId and spatial types
  • Pro: FSharp.Data.SqlClient fully supports User-Defined Table Types for input parameters with no additional coding required
  • Pro: FSharp.Data.SqlClient is a simple encapsulation of the SqlConnection life-cycle including asynchronous scenarios while optionally accepting an external SqlTransaction.

The following FSharp.Data.SqlClient features are also worht highlighting:

  • A reasonable auto-generated result type definition so there is no need to define it in code
  • The SQL command is just a string but FSharp.Data.SqlClient verifies it and figures out the input parameters and output types. The design-time checking gives an unparalled design-time experience
  • Design-time verification means less run-time tests, less yak shaving when synchronizing database schema with code definitions, and earliest possible identification of bugs and mismatches
  • The additional SqlProgrammabilityProvider lets user to explore stored procedures and user-defined functions right from the code with IntelliSense

Conclusion

F# 3.0 Type Providers, which are, in essence, light-weight plugins for F# compiler, dramatically improve developer experience through access to various external data sources with Intellisense and checking at design time.

Combined with the latest features of MS SQL Server, the FSharp.Data.SqlClient library empowers users to write compile time-verified F# and SQL code, with live rechecking in reaction to SQL schema changes, leaving no space for boilerplate while promising performance comparable with the best-of-breed traditional solutions.

This can be viewed as a Domain Specific Language seamlessly integrating SQL and F# in the same codebase.

Please use and contribute to this great F# community library effort at https://fsprojects.github.io/FSharp.Data.SqlClient/.