Converting SQL to LINQ, Part 1: The Basics
Converting SQL to LINQ, Part 2: FROM and SELECT
Converting SQL to LINQ, Part 3: DISTINCT, WHERE, ORDER BY and Operators
Converting SQL to LINQ, Part 4: Functions
Converting SQL to LINQ, Part 5: GROUP BY and HAVING
This post will discuss Cross Join, Inner Join, Natural Join and Outer (Left/Right) Joins.
JOIN
It’s very common to query over more than one set of data (such as a table) in the same SQL SELECT statement. Bringing together the information in multiple tables is called a join, and there are several kinds of joins in both SQL and LINQ.
Cross Join
The simplest join is a Cross Join, or Cartesian Join, which is a many-to-many join between two sets of data. Each record in one set of data is joined to each record in another set. In a SQL SELECT statement, this is done by specifying more than one table in a FROM clause. In a VB LINQ expression, the same is true, as shown below.
SQL
SELECT CustomerTable.Name, OrderTable.OrderDate
FROM CustomerTable, OrderTable
VB
From Contact In CustomerTable, Shipment In OrderTable _
Select Contact.Name, Shipment.OrderDate
Inner Join
An Inner Join is a one-to-one join, where records in one set of data are matched up with records in another set of data based on certain common fields. In a SQL SELECT statement, the second set of data is specified in an INNER JOIN clause, and the equalities used to join them are specified in an ON clause. Similarly, the second set of data in a VB LINQ Join is specified in a Join clause, and an On clause is used to specify which field to match up with the Equals operator.
SQL
SELECT Contact.Name, Shipment.OrderID
FROM CustomerTable Contact
INNER JOIN OrderTable Shipment
ON Contact.CustomerID = Shipment.CustomerID
AND Contact.Zip = Shipment.ShippingZip
VB
From Contact In CustomerTable Join Shipment In OrderTable _
On Contact.CustomerID Equals Shipment.CustomerID _
And Contact.Zip Equals Shipment.ShippingZip _
Select Contact.Name, Shipment.OrderID
The above example is an Equi-Join, meaning an equality operator is used to join information between the tables. This is the only operator allowed in the On clause, so to emulate any other join operators (such as less-than), you will need to use a Cross Join filtered with a Where clause.
SQL
SELECT Contact.Name, Shipment.OrderID
FROM CustomerTable Contact
INNER JOIN OrderTable Shipment
ON Contact.CustomerID < Shipment.CustomerID
VB
From Contact In CustomerTable, Shipment In OrderTable _
Where Contact.CustomerID < Shipment.CustomerID _
Select Contact.Name, Shipment.OrderID
Natural Join
A Natural Join is a one-to-one join where records in one set of data are matched up with records in another set of data based on all common fields (determined by matching names). In a SQL SELECT statement, the second set of data can be specified in a NATURAL JOIN clause, and the equalities used to join the tables are implicit. There is no direct equivalent to a natural join in VB LINQ expressions, so the best way to emulate it is to create an inner join and specify all common field equalities in the On clause manually. This is more verbose than the SQL version, but should be pretty straightforward.
SQL
SELECT * FROM CustomerTable NATURAL JOIN OrderTable
VB
From Contact In CustomerTable _
Join Shipment In OrderTable _
On Contact.CustomerID Equals Shipment.CustomerID
Outer (Left/Right) Join
An Outer Join (also known as a Left Join or Right Join) is a one-to-many join, where each record in one set of data can be matched up with multiple records in another set of data, based on common fields. In a SQL SELECT statement, the second set of data is specified in a LEFT JOIN or RIGHT JOIN clause, and the equalities used to join them are specified in an ON clause. In a LEFT JOIN, every record in the first (left) set of data is joined with all records in the second set of data that match it based on the join expressions. Every record in this first set of data will appear in the result, whether or not it matches anything in the second set. This is reversed in a RIGHT JOIN, where everything in the second (right) set of data appears and is matched to everything possible in the first set.
The closest VB LINQ construct to an Outer Join would be a Group Join. A Group Join clause specifies a second set of data, and provides the equality expressions in the On clause, much like a Join (Inner Join), described above. There is also an Into clause, which can be used to specify aggregates to calculate over each group, much like in a Group By clause (see Part 5).
Similar to a SQL LEFT JOIN, each member of the first set of data is matched with everything that matches it in the second set of data. Again, if there is no match for an item in the first set of data, it will still appear in the results.
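To make the Group Join description concrete, here is a minimal sketch. The Customer and Order classes and the sample rows are made up purely for illustration (they stand in for CustomerTable and OrderTable), and the query assumes Option Infer On, the default for new VS 2008 projects. Each customer appears once in the result, with a count of matching orders in the Into clause, and a customer with no orders still shows up.

```vb
Imports System
Imports System.Linq

Module GroupJoinSketch
    'Hypothetical row types standing in for CustomerTable and OrderTable.
    Class Customer
        Public CustomerID As Integer
        Public Name As String
    End Class

    Class Order
        Public OrderID As Integer
        Public CustomerID As Integer
    End Class

    Sub Main()
        'Made-up sample data: the second customer has no orders.
        Dim CustomerTable = New Customer() { _
            New Customer With {.CustomerID = 1, .Name = "Ann"}, _
            New Customer With {.CustomerID = 2, .Name = "Bob"}}
        Dim OrderTable = New Order() { _
            New Order With {.OrderID = 10, .CustomerID = 1}, _
            New Order With {.OrderID = 11, .CustomerID = 1}}

        'Group Join: every Contact is kept; the Into clause aggregates
        'over whatever Shipments matched (possibly none).
        Dim query = From Contact In CustomerTable _
                    Group Join Shipment In OrderTable _
                    On Contact.CustomerID Equals Shipment.CustomerID _
                    Into OrderCount = Count() _
                    Select Contact.Name, OrderCount

        For Each row In query
            Console.WriteLine(row.Name & ": " & row.OrderCount)
        Next
    End Sub
End Module
```

If you need the matched records themselves rather than an aggregate, the Into clause can also capture the whole group (for example, Into Orders = Group).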
Just this week someone gave me this feedback:
“I am having a difficult time finding information to help me adapt to LINQ in my webforms. I am hoping that someone from the team can offer a direction. “
You bet! Our learning content pays a lot of attention to LINQ and Forms-over-data in VB, but we’re lacking content on LINQ and Web-based Forms-over-data.
There are a number of great innovations in the Web space that make it easy to leverage LINQ’s querying capabilities and mesh that with the richness and flexibility of Web-based UI. It’s easy.
There are a few concepts you should learn or know about to get started:
The first post will cover the simplest Web form that will demonstrate this end to end span of concepts. I’ll then take input from you to add features and expand on this sample in future posts.
Here’s a sample of the initial desired Web form output – a very simple Employee list report for an HR application – admittedly it’s bare bones and in need of UI design love:
EmployeeID: 2
Andrew Fuller
Andrew received his BTS commercial in 1974 and a Ph.D. in international marketing from the University of Dallas in 1981. He is fluent in French and Italian and reads German. He joined the company as a sales representative, was promoted to sales manager in January 1992 and to vice president of sales in March 1993. Andrew is a member of the Sales Management Roundtable, the Seattle Chamber of Commerce, and the Pacific Rim Importers Association.
EmployeeID: 1
Nancy Davolio
Education includes a BA in psychology from Colorado State University in 1970. She also completed “The Art of the Cold Call.” Nancy is a member of Toastmasters International.
If you are pretty familiar with Visual Studio I expect this will take about 10-15 minutes to walk through on your own machine.
– make sure you have Visual Studio 2008 or Visual Web Developer 2008 Express installed. Express Editions are available for download here.
– make sure you have SQL Server 2005 Express installed and running. It typically gets installed by default with VS or VS Express, however, you can also install it from here.
– if you don’t have Northwind.mdf on your machine, download it from the attached files on this post.
First we’re going to add the database to our project and create necessary classes to use the database in LINQ (using Linq to SQL). Note, you could choose to skip this step and create your own custom LINQ query to any other data source.
Now that we’ve added the database to the Web site and created necessary LINQ to SQL classes, it’s time to create some UI and wire up the data to the UI. I won’t do anything too fancy – just repeat data bound labels in a DataList (you could replace this with GridView, FormView or choose your favorite control).
The LinqDataSource needs to be wired up to the underlying LINQ data. In this case data is provided by any DataContext object or query over an object of the NorthwindDataClasses type (a LINQ to SQL file). The DataList then simply needs to use the LinqDataSource as the data source, and display data from bound fields using traditional Eval(“”) statements. The designer will generate a set of defaults for you if you use the Smart Tags. Let’s give it a shot with defaults, and then we can fine tune it from there.
Here is the resulting mark up for the LinqDataSource as seen in “Source” view. Note how the ContextTypeName=”NorthwindDataClassesDataContext” — this matches the name of the type we created in the designer (the code gen appends “DataContext” to the end — whew that’s long!). The TableName is set to the Table or Property we want to display — Employees in this case. You could use this pattern to bind to any DataContext type and class property within.
<asp:LinqDataSource ID="LinqDataSource1" runat="server" ContextTypeName="NorthwindDataClassesDataContext" TableName="Employees"> </asp:LinqDataSource>
Now let’s wire up the DataList to the LinqDataSource. The nice thing here is the DataList only cares about the field names returned from the LINQ query via the LinqDataSource — this shields you as you continue to refine the fields and row results of your dynamic LINQ queries.
The wizard will create a series of bound labels by default. You can use the designer in edit mode to get the specific fields, look and feel you want. TIP: This is typically where I drop in the html mark up “Source” view. In my example, I’ll simply show the full name, notes, and have a place holder to show an employee photo image. I also format the data in a two column table.
If you look at the DataList mark up, you see the ItemTemplate contains a number of bound labels. Binding to data from the query is simply a matter of typing in <%# Eval(“YOURFIELDNAME”) %> in your server control fields using classic ASP-style & VB data binding eval statements. This gives you a lot of flexibility to display just the data you want, formatted how you want. Here’s my customization:
<div>
    <asp:DataList ID="DataList1" runat="server" DataKeyField="EmployeeID"
        DataSourceID="LinqDataSource1">
        <HeaderTemplate>
            <table>
        </HeaderTemplate>
        <ItemTemplate>
            <tr>
                <td>
                    <img src="PLACEHOLDER.jpg" class="" style="border: 4px solid white"
                        alt='Photo Number XXX' />
                    <br /><br />
                </td>
                <td>
                    EmployeeID:
                    <asp:Label ID="EmployeeIDLabel" runat="server"
                        Text='<%# Eval("EmployeeID") %>' />
                    <br />
                    <asp:Label ID="LastNameLabel" runat="server"
                        Text='<%# Eval("FirstName") & " " & Eval("LastName") %>' />
                    <br />
                    <asp:Label ID="NotesLabel" runat="server"
                        Text='<%# Eval("Notes") %>' />
                </td>
            </tr>
        </ItemTemplate>
        <FooterTemplate>
            </table>
        </FooterTemplate>
    </asp:DataList>
</div>
The technique above provides a zero-code method for binding Web forms to LINQ data. However, you might be asking yourself, where is the LINQ and VB here? It’s true the LinqDataSource hides all the querying code and constrains what you can do via the designer. This is fine for some simple cases, but the real power of LINQ is being able to use your own free form queries and VB logic. The good news is LinqDataSource supports using your own LINQ queries using the LinqDataSource.Selecting event. This is your hook to tell the control exactly what query should be used. It’s easy. Here’s how:
In my example I will query over Northwind employees, define a custom expression column called FullName (= FirstName + LastName), filter by first name starting with A or N, and sort by full name. Note that I’m creating aliases for each property, so the query result is now an anonymous type, and the schema is custom rather than the default NorthwindDataContext one with all (*) columns. TIP: if you’re getting runtime errors when you try this, make sure all the fields expected by your Web form are there; e.g., if the form is expecting “EmployeeID”, make sure your query Selects it.
Protected Sub LinqDataSource1_Selecting(ByVal sender As Object, ByVal e _
    As System.Web.UI.WebControls.LinqDataSourceSelectEventArgs) _
    Handles LinqDataSource1.Selecting

    Dim northwind As New NorthwindDataClassesDataContext

    'custom anonymous LINQ query
    Dim query = From emp In northwind.Employees _
                Select emp.EmployeeID, emp.FirstName, emp.LastName, emp.Notes, _
                       FullName = emp.FirstName & " " & emp.LastName _
                Where FirstName.ToUpper.StartsWith("A") Or FirstName.ToUpper.StartsWith("N") _
                Order By FullName

    'sets LinqDataSource query equal to custom query.
    'use data binding expressions to look up aliased fields above.
    e.Result = query
End Sub
You can run the app again and see the custom query is working — only two rows are returned — Andrew Fuller and Nancy Davolio. We could also tweak the markup to make use of our new aliased expression column — “FullName”. Here’s how that would look in the DataList:
<asp:DataList ID="DataList1" runat="server" DataKeyField="EmployeeID"
    DataSourceID="LinqDataSource1">
    <HeaderTemplate>
        <table>
    </HeaderTemplate>
    <ItemTemplate>
        <tr>
            <td>
                <img src="PLACEHOLDER.jpg" class="" style="border: 4px solid white"
                    alt='Photo Number XXX' />
                <br /><br />
            </td>
            <td>
                EmployeeID:
                <asp:Label ID="EmployeeIDLabel" runat="server"
                    Text='<%# Eval("EmployeeID") %>' />
                <br />
                <asp:Label ID="LastNameLabel" runat="server"
                    Text='<%# Eval("FullName") %>' />
                <br />
                <asp:Label ID="NotesLabel" runat="server"
                    Text='<%# Eval("Notes") %>' />
            </td>
        </tr>
    </ItemTemplate>
    <FooterTemplate>
        </table>
    </FooterTemplate>
</asp:DataList>
To wrap things up, it’s easy to map what you know about Web Forms to what you’re learning about Linq in VB. The LinqDataSource connects your underlying Linq to SQL DataContext, or any generalized Linq query, to the rest of your Web Form. You can create totally custom queries in the LinqDataSource.Selecting event and pass that to the control via e.Result. And then you can get at any field or property in your Linq query using classic Eval(“MYFIELD”) data binding expressions.
What do you think? What additions would you like to see?
If you want more info now, ScottGu put together an awesome series of Linq to SQL posts for the Web, and he included VB sample code. Another great portal for learning and How To content is www.asp.net.
Best,
Paul
———————————-
Paul Yuknewicz
Lead Program Manager
Microsoft Visual Studio
http://msdn.com/vbasic/
You can view the full video and sample code here on Channel9 (Thanks Jeff and C9 team!):
https://channel9.msdn.com/ShowPost.aspx?PostID=367997%20%20
With love,
Microsoft Friends of VB
[...], Principal Developer (currently working with Erik Meijer), where he attempts to teach me higher algebra using Visual Basic, generics, and operator overloading. Brian is a wonderful person and brilliant physicist and we have a lot of fun with vectors and matrices and VB. I actually think I understood some of what Brian showed me ;).
Visual Basic is a great language for mathematics as well as all kinds of other applications. Brian makes the point that he has fun coding in VB because of its intuitive style and how easy it is to be immediately productive. Check out Brian’s blog post on the VB Team blog! And for all you abstract algebra aficionados, here’s the code to play with.
Enjoy,
–Beth Massi, VS Community
Operator overloads with Generics enable some beautiful designs for data types in Higher Algebra, a branch of mathematics, sometimes called Abstract Algebra. Consider fields and vector spaces. I’ll show you operator overloads at THREE levels in a single design. First, background: In this context, a “field” is a collection of unspecified objects, closed under two associative operators, + and *, that obey the distributive law.
Closed means that for any a, b, and c in the field, a+b and a*c are in the field. Associative means
a + (b + c) = (a + b) + c
a * (b * c) = (a * b) * c
Distributive means
a * (b + c) = a * b + a * c
(b + c) * a = b * a + c * a
The field must also have two special members: the additive unit 0 and the multiplicative unit 1, where
a + 0 = 0 + a = a
a * 1 = 1 * a = a
and must have for every a, an additive inverse, -a, such that a + -a = -a + a = 0. There must also be a multiplicative inverse for every element except 0, written 1/a, such that a * 1/a = 1/a * a = 1.
Some authors insist on the commutative laws (a + b = b + a, a * b = b * a), too, but we don’t, here. The most common examples of fields are the Rationals, the Real numbers, the Complex numbers, all of which have commutative addition and multiplication; and the Quaternions, which have non-commutative multiplication.
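As a quick illustration of that last point (these are the standard quaternion identities, not something taken from the code in this post), the basis units i, j and k satisfy:

```latex
i^2 = j^2 = k^2 = ijk = -1, \qquad ij = k, \qquad ji = -k
```

So ij and ji differ by a sign, while quaternion addition stays commutative.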
Don’t confuse the mathematical “field” with a “field” in a record, structure or class.
In each instance of a field, we define + and * to do anything we want so long as they obey the associative and distributive laws. A “vector space over a field” is the set of n-tuples or vectors built up from members of the field and closed under an additional linear combination law, written with no operator symbol, or sometimes with *. If v and w are any two vectors, and f and g are any two members of the underlying field, then f v + g w is a vector, and, furthermore,
f (v + w) = f v + f w
(v + w) g = v g + w g
(f + g) v = f v + g v
w (f + g) = w f + w g
Vector spaces are central in physics and simulation. Imagine 6-vectors of real numbers; such things represent particle states in “phase space” in classical mechanics. Imagine 4-vectors of complex numbers; such things appear in quantum mechanics. Similar structures occur all over Quantum Theory and Gravitation.
Applications for vectors of quaternions are not so easy to come by, but, they are perfectly well defined, and, if we do our software design right, they “just work.” Ditto for vectors over a purely symbolic field or over any other kind. It’s not difficult to support field-like algebraic structures with non-associative multiplication within the same software design. Such things include the Cayley numbers or octonions, but, because of non-associativity, they don’t mesh easily with linear algebra, and that’s where we want to go. Stop with the quaternions.
We extend the design to inner-product spaces, in which every vector has a dual, and to linear algebras: sets of linear transformations of vectors, realized as matrices. With just a little code, we build a general, extensible, optimizable library suitable for physics, engineering, and mathematics in any finite-dimensional vector spaces.
We want three layers:
The underlying types comprise built-ins like Double, and custom types like Rational, Complex, Quaternion, and Symbol. Underlying types should implement the basic field operations, but can have many more operations for convenience, like optimized division routines. Operator overloading is a no-brainer for the underlying types.
At the top level of the linear-algebra types, operator overloads are again a no-brainer for vector + vector, matrix * vector, vector * matrix, matrix * matrix, vector * vector, and so on.
At the middle layer, the field types abstract and narrow the operations down to the bare minimum so that the linear-algebra types know what and only what to expect. In this layer, operator overloading is not quite a no-brainer, and there are a couple of different ways to go with a design.
The first paragraph showed the field axioms, and these are what we must design into our field types. One way to do that is to impose the field operations by interface:
Public Interface IField(Of U) 'U is the "UNDERLYING" type

    'Get the underlying value.
    Property UValue() As U

    'Generic users of IField type shouldn't be able to tell
    'whether the Underlying type U has copy-reference
    'semantics or copy-value semantics, so let's insist that
    'providers of IField implement a Dup operation.
    Function Dup() As IField(Of U)

    'This is what fields do (would be nice to have a contract
    'for the distributive law). Must use a design like this
    'since Interfaces cannot host Shared functions in general,
    'and operator overloads in particular.
    Function Add(ByVal that As IField(Of U)) As IField(Of U)
    Function Mul(ByVal that As IField(Of U)) As IField(Of U)
    Function AdditiveInverse(ByVal that As IField(Of U)) _
        As IField(Of U)
    Function MultiplicativeInverse(ByVal that As IField(Of U)) _
        As IField(Of U)

    'Every field must have these. Implement them as Shared and
    'even Const if possible.
    ReadOnly Property Zero() As IField(Of U)
    ReadOnly Property Unity() As IField(Of U)

    'The following is an auxiliary operation for the Normed
    'Division Algebras to support Inner Product, or
    '"Sesquilinear mappings." This computes the complex
    'conjugate, the quaternion conjugate, and so on. If the
    'underlying field does not have a natural dual field, then
    'it's probably self-dual and just implement "Dual" as "Dup."
    Function Dual() As IField(Of U)

End Interface
An upside of this design is that both Structure types and Class types can implement the interface. Thus, the linear-algebra classes are independent of differences of copy semantics. Programmers may use Structure types for run-time speed at both the underlying-type level and at the field level.
Here’s an example of a Structure implementing a field of Doubles:
Public Structure SFDouble
    Implements IField(Of Double)
    ...
    Private mValue As Double
    Public Property UValue() As Double Implements IField(Of Double).UValue
        Get
            Return mValue
        End Get
        Set(ByVal value As Double)
            mValue = value
        End Set
    End Property

    Public Sub New(ByVal d As Double)
        mValue = d
    End Sub
    ...
    Public Function Add(ByVal that As IField(Of Double)) As IField(Of Double) _
        Implements IField(Of Double).Add
        Return New SFDouble(Me.UValue + that.UValue)
    End Function

    'Doubles are self-dual, so Dual is just a copy.
    Public Function Dual() As IField(Of Double) Implements IField(Of Double).Dual
        Return New SFDouble(Me.UValue)
    End Function

    Public Function Dup() As IField(Of Double) Implements IField(Of Double).Dup
        Return New SFDouble(Me.UValue)
    End Function

    Public Function AdditiveInverse(ByVal that As IField(Of Double)) As IField(Of Double) _
        Implements IField(Of Double).AdditiveInverse
        Return New SFDouble(-Me.UValue)
    End Function

    Public Function MultiplicativeInverse(ByVal that As IField(Of Double)) As IField(Of Double) _
        Implements IField(Of Double).MultiplicativeInverse
        'No zero check here: for Double, 1 / 0 yields Infinity rather than throwing.
        Return New SFDouble(1 / Me.UValue)
    End Function

    Public Function Mul(ByVal that As IField(Of Double)) As IField(Of Double) _
        Implements IField(Of Double).Mul
        Return New SFDouble(Me.UValue * that.UValue)
    End Function

    Private Shared sUnity As IField(Of Double) = New SFDouble(1.0)
    Private Shared sZero As IField(Of Double) = New SFDouble(0.0)

    Public ReadOnly Property Unity() As IField(Of Double) Implements IField(Of Double).Unity
        Get
            Return sUnity
        End Get
    End Property

    Public ReadOnly Property Zero() As IField(Of Double) Implements IField(Of Double).Zero
        Get
            Return sZero
        End Get
    End Property
    ...
End Structure
A Big Downside though, for the purposes of this exercise, is no operator overloads at the field level, because operator overloads are Shared by definition (i.e., static) and interfaces can’t support static methods (the reason is that static virtual methods don’t make sense in .NET, and all methods in an interface are virtual).
But, we want operator overloads at the field level, and we can get them through an alternative base-class design for fields. We lose Structures at the field level, since particular fields must inherit from a base Class, but the underlying types can still be structures.
Public MustInherit Class AField(Of U) 'U is the "UNDERLYING" type
    ...
    Private mUValue As U
    Public Property UValue() As U
        Get
            Return mUValue
        End Get
        Set(ByVal value As U)
            mUValue = value
        End Set
    End Property

    'Now, our operator overloads just call virtual MustOverride Functions.
    'Magically, these are NON-COMMUTATIVE in general, just as we want.
    Public Shared Operator +(ByVal e1 As AField(Of U), _
                             ByVal e2 As AField(Of U)) As AField(Of U)
        Return e1.Add(e2)
    End Operator

    Public Shared Operator -(ByVal e1 As AField(Of U), _
                             ByVal e2 As AField(Of U)) As AField(Of U)
        Return e1.Add(e2.AdditiveInverse())
    End Operator

    Public Shared Operator *(ByVal e1 As AField(Of U), _
                             ByVal e2 As AField(Of U)) As AField(Of U)
        Return e1.Mul(e2)
    End Operator

    Public Shared Operator /(ByVal e1 As AField(Of U), _
                             ByVal e2 As AField(Of U)) As AField(Of U)
        Return e1.Mul(e2.MultiplicativeInverse())
    End Operator

    Public Shared Operator -(ByVal that As AField(Of U)) As AField(Of U)
        Return that.AdditiveInverse()
    End Operator

    Public Shared Operator Not(ByVal that As AField(Of U)) As AField(Of U)
        Return that.Dual()
    End Operator

    'This is what fields do (would be nice to have a contract for the
    'distributive law).
    MustOverride Function Add(ByVal that As AField(Of U)) As AField(Of U)
    MustOverride Function Mul(ByVal that As AField(Of U)) As AField(Of U)
    MustOverride Function AdditiveInverse() As AField(Of U)
    MustOverride Function MultiplicativeInverse() As AField(Of U)

    'Every field must have these. Implement them as Shared and even Const if
    'possible.
    MustOverride ReadOnly Property Zero() As AField(Of U)
    MustOverride ReadOnly Property Unity() As AField(Of U)

    'Generic users of IField type shouldn't be able to tell
    'whether the Underlying type U has copy-reference
    'semantics or copy-value semantics, so insist that
    'subclasses of AField implement a Dup operation.
    MustOverride Function Dup() As AField(Of U)

    'The following is an auxiliary operation for the Normed
    'Division Algebras to support Inner Product, or
    '"Sesquilinear mappings." This computes the complex
    'conjugate, the quaternion conjugate, and so on. If the
    'underlying field does not have a natural dual field, then
    'it's probably self-dual and just implement "Dual" as "Dup."
    MustOverride Function Dual() As AField(Of U)
End Class
Now we have operator overloads at the field level, but they’re not virtual (they can’t be). What they do is statically dispatch to virtual Add and Mul methods, which are just like the ones we had in the interface design for the field level. Nice, eh? Here are the implementations for the field of Complexes, which are implemented in an underlying Structure type:
Public Class AComplex
    Inherits AField(Of Complex)
    ...
    Public Overrides Function Add(ByVal that As AField(Of Complex)) As AField(Of Complex)
        Return New AComplex(Me.UValue + that.UValue)
    End Function

    Public Overrides Function AdditiveInverse() As AField(Of Complex)
        Return New AComplex(-Me.UValue.R, -Me.UValue.I)
    End Function

    Public Overrides Function Dual() As AField(Of Complex)
        Return New AComplex(Not Me.UValue)
    End Function

    Public Overrides Function Dup() As AField(Of Complex)
        Return New AComplex(Me)
    End Function

    Public Overrides Function MultiplicativeInverse() As AField(Of Complex)
        Return New AComplex(1.0 / Me.UValue)
    End Function

    Public Overrides Function Mul(ByVal that As AField(Of Complex)) As AField(Of Complex)
        Return New AComplex(Me.UValue * that.UValue)
    End Function
    ...
    Public Overrides ReadOnly Property Unity() As AField(Of Complex)
        Get
            Return New AComplex(1.0, 0.0)
        End Get
    End Property

    Public Overrides ReadOnly Property Zero() As AField(Of Complex)
        Get
            Return New AComplex(0.0, 0.0)
        End Get
    End Property

    Public Overrides Function ToString() As String
        Return UValue.ToString()
    End Function
End Class
And here is its underlying type:
Imports System.Text

Public Structure Complex

    Private mR As Double
    Public Property R() As Double
        Get
            Return mR
        End Get
        Set(ByVal value As Double)
            mR = value
        End Set
    End Property

    Private mI As Double
    Public Property I() As Double
        Get
            Return mI
        End Get
        Set(ByVal value As Double)
            mI = value
        End Set
    End Property
    ...
    Public Shared Operator -(ByVal that As Complex) As Complex
        Return New Complex(-that.R, -that.I)
    End Operator

    Public Shared Operator +(ByVal c1 As Complex, ByVal c2 As Complex) As Complex
        Dim result = New Complex()
        result.R = c1.R + c2.R
        result.I = c1.I + c2.I
        Return result
    End Operator

    Public Shared Operator *(ByVal c1 As Complex, ByVal c2 As Complex) As Complex
        Dim result = New Complex()
        result.R = (c1.R * c2.R) - (c1.I * c2.I)
        result.I = (c1.R * c2.I) + (c1.I * c2.R)
        Return result
    End Operator

    'Complex conjugate.
    Public Shared Operator Not(ByVal c1 As Complex) As Complex
        Return New Complex(c1.R, -c1.I)
    End Operator
    ...
    Public Shared Operator *(ByVal c As Complex, ByVal scalar As Double) As Complex
        Return New Complex(c.R * scalar, c.I * scalar)
    End Operator

    Public Shared Operator /(ByVal c As Complex, ByVal scalar As Double) As Complex
        Return New Complex(c.R / scalar, c.I / scalar)
    End Operator

    Public Shared Operator /(ByVal c1 As Complex, ByVal c2 As Complex) As Complex
        Return c1 * Not c2 / c2.MagnitudeSquared()
    End Operator
    ...
End Structure
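If you want to try the static-to-virtual dispatch trick in isolation before moving on, here is a pared-down sketch. MiniField and MiniDouble are hypothetical stand-ins invented for this illustration (they are not part of the library above): there are no generics, and only + and * are covered. The point is just that the Shared operators on the base class forward to overridable Add and Mul methods, so expressions written against the base type still reach the subclass arithmetic.

```vb
Imports System

Module DispatchSketch
    'Hypothetical, pared-down stand-in for AField(Of U).
    MustInherit Class MiniField
        'Shared (static) operators simply forward to virtual methods.
        Public Shared Operator +(ByVal a As MiniField, ByVal b As MiniField) As MiniField
            Return a.Add(b)
        End Operator

        Public Shared Operator *(ByVal a As MiniField, ByVal b As MiniField) As MiniField
            Return a.Mul(b)
        End Operator

        MustOverride Function Add(ByVal that As MiniField) As MiniField
        MustOverride Function Mul(ByVal that As MiniField) As MiniField
    End Class

    'Hypothetical concrete field over Double.
    Class MiniDouble
        Inherits MiniField

        Public ReadOnly Value As Double

        Public Sub New(ByVal d As Double)
            Value = d
        End Sub

        Public Overrides Function Add(ByVal that As MiniField) As MiniField
            Return New MiniDouble(Value + DirectCast(that, MiniDouble).Value)
        End Function

        Public Overrides Function Mul(ByVal that As MiniField) As MiniField
            Return New MiniDouble(Value * DirectCast(that, MiniDouble).Value)
        End Function

        Public Overrides Function ToString() As String
            Return Value.ToString()
        End Function
    End Class

    Sub Main()
        'Variables are typed as the base class, as generic code would see them.
        Dim a As MiniField = New MiniDouble(2.0)
        Dim b As MiniField = New MiniDouble(3.0)

        'Evaluates as a + (b * a), i.e. 2 + 3 * 2, through Add and Mul.
        Dim c As MiniField = a + b * a
        Console.WriteLine(c)
    End Sub
End Module
```

The DirectCast calls are a shortcut for this sketch only; the design in the post avoids them by going through the UValue property on the generic base class.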
Now we have an example of a Field generic on an underlying type. How about the Vector space? I wrote one vector space generic on the interface-style Field design, and another vector space generic on the base-class Field design. Here is the latter one; it’s prettier:
Imports System.Text

'The only reason I must mention the underlying type, U, here, is to pass it to the
'Generic AField interface. I don't use U anywhere explicitly inside this. If I had
'"Monads" in the Generic type language, then I could thread U through this via a
'"bind" operator, very pretty, but dreamland.
Public Class ANVector(Of U, T As {New, AField(Of U)})

    Private mDimension As Integer
    Public ReadOnly Property Dimension() As Integer
        Get
            Return mDimension
        End Get
    End Property

    Private mComponents() As T
    Default Property Component(ByVal i As Integer) As T
        Get
            Component = mComponents(i)
        End Get
        Set(ByVal Value As T)
            mComponents(i) = Value
        End Set
    End Property
    ...
    Public Sub New(ByVal that As ANVector(Of U, T))
        mDimension = that.Dimension
        ReDim mComponents(Dimension)
        For i = 1 To Dimension
            'Cannot call New with parameters on Generic types ... Sorry)
            ' --> Me.mComponents(i) = New T(that(i))
            Me.mComponents(i) = that(i).Dup()
        Next
    End Sub

    'Add two vectors. PROPERLY, vectors should themselves implement IModule, etc. Some day...
    Public Shared Operator +(ByVal V1 As ANVector(Of U, T), ByVal V2 As ANVector(Of U, T)) _
        As ANVector(Of U, T)
        If V1.Dimension <> V2.Dimension Then
            Throw New VectorSpaceException("Cannot add vectors of different dimensions")
        Else
            Dim result = New ANVector(Of U, T)(V1) 'clone
            For i = 1 To result.Dimension
                result(i) = result(i) + V2(i) ''' OPERATOR OVERLOADS BEING USED
            Next
            Return result
        End If
    End Operator

    'Scale a vector
    Public Shared Operator *(ByVal that As ANVector(Of U, T), ByVal scalar As T) _
        As ANVector(Of U, T)
        Dim result = New ANVector(Of U, T)(that)
        For i = 1 To result.Dimension
            result(i) = result(i) * scalar ''' OPERATOR OVERLOADS BEING USED
        Next
        Return result
    End Operator

    Public Shared Operator *(ByVal scalar As T, ByVal that As ANVector(Of U, T)) _
        As ANVector(Of U, T)
        Return that * scalar ''' OPERATOR OVERLOADS BEING USED
    End Operator

    'Dual vector: take the dual of each component of Me.
    Public Function Dual() As ANVector(Of U, T)
        Dim result = New ANVector(Of U, T)(Me.Dimension)
        For i = 1 To result.Dimension
            result(i) = Not Me(i) ''' OPERATOR OVERLOADS BEING USED
        Next
        Return result
    End Function

    Public Shared Operator Not(ByVal V As ANVector(Of U, T)) As ANVector(Of U, T)
        Return V.Dual()
    End Operator

    'Inner product
    Public Shared Operator *(ByVal Left As ANVector(Of U, T), ByVal Right As ANVector(Of U, T)) _
        As T
        If Left.Dimension <> Right.Dimension Then
            Throw New VectorSpaceException _
                ("Cannot compute inner product of vectors with different dimensions")
        Else
            Dim result As New T()
            Dim temp As New T()
            For i = 1 To Left.Dimension
                result = result + Left(i) * Not Right(i) ''' OPERATOR OVERLOADS BEING USED
            Next
            Return result
        End If
    End Operator
    ...
End Class
'================================================================
Imports System.Text

Public Class AMNMatrix(Of U, T As {New, AField(Of U)})

    Private mRowCount As Integer
    Public ReadOnly Property Rows() As Integer
        Get
            Return mRowCount
        End Get
    End Property

    Private mColumnCount As Integer
    Public ReadOnly Property Columns() As Integer
        Get
            Return mColumnCount
        End Get
    End Property

    Private mElements As T(,)
    Default Property Element(ByVal i As Integer, ByVal j As Integer) As T
        Get
            Return mElements(i, j)
        End Get
        Set(ByVal Value As T)
            mElements(i, j) = Value
        End Set
    End Property
    ...
    'Scale a matrix
    Public Shared Operator *(ByVal scalar As T, ByVal that As AMNMatrix(Of U, T)) _
        As AMNMatrix(Of U, T)
        Dim result = New AMNMatrix(Of U, T)(that)
        For i = 1 To that.Rows
            For j = 1 To that.Columns
                result(i, j) = result(i, j) * scalar ''' OPERATOR OVERLOADS BEING USED
            Next
        Next
        Return result
    End Operator

    'Scale a matrix
    Public Shared Operator *(ByVal that As AMNMatrix(Of U, T), ByVal scalar As T) _
        As AMNMatrix(Of U, T)
        Return scalar * that ''' OPERATOR OVERLOADS BEING USED
    End Operator

    'Matrix plus matrix
    Public Shared Operator +(ByVal M1 As AMNMatrix(Of U, T), ByVal M2 As AMNMatrix(Of U, T)) _
        As AMNMatrix(Of U, T)
        If M1.Rows <> M2.Rows Or M1.Columns <> M2.Columns Then
            Throw New VectorSpaceException("Cannot add matrices of different dimensions")
        Else
            Dim result = New AMNMatrix(Of U, T)(M1)
            For i = 1 To M1.Rows
                For j = 1 To M1.Columns
                    result(i, j) = result(i, j) + M2(i, j) ''' OPERATOR OVERLOADS BEING USED
                Next
            Next
            Return result
        End If
    End Operator

    'Matrix times a vector
    Public Shared Operator *(ByVal m As AMNMatrix(Of U, T), ByVal v As ANVector(Of U, T)) _
        As ANVector(Of U, T)
        If m.Columns <> v.Dimension Then
            Throw New VectorSpaceException("The number of columns in the matrix must equal " & _
                "the dimension of the vector")
        Else
            Dim result = New ANVector(Of U, T)(m.Rows)
            For i = 1 To m.Rows
                For j = 1 To m.Columns
                    result(i) = result(i) + v(j) * m(i, j) ''' OPERATOR OVERLOADS BEING USED
                Next
            Next
            Return result
        End If
    End Operator

    'Vector times a Matrix
    Public Shared Operator *(ByVal v As ANVector(Of U, T), ByVal m As AMNMatrix(Of U, T)) _
        As ANVector(Of U, T)
        If m.Rows <> v.Dimension Then
            Throw New VectorSpaceException("The dimension of the vector must equal " & _
                "the number of rows in the matrix")
        Else
            Dim result = New ANVector(Of U, T)(m.Columns)
            For j = 1 To m.Columns
                For i = 1 To m.Rows
                    result(j) = result(j) + v(i) * m(i, j) ''' OPERATOR OVERLOADS BEING USED
                Next
            Next
            Return result
        End If
    End Operator

    'Matrix times Matrix
    Public Shared Operator *(ByVal left As AMNMatrix(Of U, T), ByVal right As AMNMatrix(Of U, T)) _
        As AMNMatrix(Of U, T)
        If left.Columns <> right.Rows Then
            Throw New VectorSpaceException("The number of columns in the left matrix must equal " & _
                "the number of rows in the right matrix")
        Else
            Dim result = New AMNMatrix(Of U, T)(left.Rows, right.Columns)
            Dim inner = left.Columns
            For i = 1 To result.Rows
                For j = 1 To result.Columns
                    For k = 1 To inner
                        ''' OPERATOR OVERLOADS BEING USED
                        result(i, j) = result(i, j) + left(i, k) * right(k, j)
                    Next
                Next
            Next
Return result End If End Operator ... End Class
There we go: operator overloading at three levels! I’ve uploaded a zipped Visual Studio 2008 project with all this code to my Public folder.
Play around, let me know what you think.
Converting SQL to LINQ, Part 1: The Basics
Converting SQL to LINQ, Part 2: FROM and SELECT
Converting SQL to LINQ, Part 3: DISTINCT, WHERE, ORDER BY and Operators
Converting SQL to LINQ, Part 4: Functions
This post will discuss the GROUP BY and HAVING clauses.
GROUP BY
A SQL GROUP BY clause allows you to group records by particular fields, so the entire group can be dealt with at once. A LINQ statement can have a Group By clause as well, but with different syntax. An informal (and incomplete) syntax expression could be:
VB Group By |
Group [{optional list of fields}] By {list of fields} _ Into {list of aggregate expressions} |
In this syntax, every listed expression can have an alias, and must have one if an identifier cannot be inferred. Essentially, the clause is “Group By” followed by a list of fields by which to group the records, then “Into” followed by a list of aggregate expressions to calculate. For example, the following query calculates the total and average cost of shipments to each zip code:
VB |
From Shipment In OrderTable _ Group By Shipment.ShippingZip _ Into Total = Sum(Shipment.Cost), Average(Shipment.Cost) |
This query expression returns three values, ShippingZip, Total and Average, for each zip code represented in the data.
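To try this query against in-memory data, here is a minimal console sketch. The ShipmentRecord class and the sample values are made up for illustration (they stand in for whatever OrderTable really is), and the code assumes Option Infer On, which is the default for new Visual Studio 2008 projects.

```vb
Imports System
Imports System.Linq

Module GroupByAggregatesDemo
    ' Hypothetical record type standing in for one row of OrderTable.
    Class ShipmentRecord
        Public OrderID As Integer
        Public ShippingZip As String
        Public OrderDate As Date
        Public Cost As Decimal
    End Class

    Sub Main()
        ' Made-up sample data: two shipments to 98052 and one to 10001.
        Dim OrderTable = New ShipmentRecord() { _
            New ShipmentRecord With {.OrderID = 1, .ShippingZip = "98052", .OrderDate = #1/5/2008#, .Cost = 100D}, _
            New ShipmentRecord With {.OrderID = 2, .ShippingZip = "98052", .OrderDate = #1/6/2008#, .Cost = 50D}, _
            New ShipmentRecord With {.OrderID = 3, .ShippingZip = "10001", .OrderDate = #1/6/2008#, .Cost = 30D} _
        }

        ' Same shape as the query above: one result per zip code.
        Dim totals = From Shipment In OrderTable _
                     Group By Shipment.ShippingZip _
                     Into Total = Sum(Shipment.Cost), Average(Shipment.Cost)

        ' Each result exposes the grouping key plus the two aggregates.
        For Each zipTotals In totals
            Console.WriteLine("{0}: total {1}, average {2}", _
                              zipTotals.ShippingZip, zipTotals.Total, zipTotals.Average)
        Next
    End Sub
End Module
```

Note that the second aggregate has no alias, so its member name is inferred as Average.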
Optionally, a list of fields can be provided between the “Group” and “By” keywords to narrow down the information included to specific fields. This could be thought of as a built-in Select clause prior to the Into clause:
VB |
From Shipment In OrderTable _ Group OrderCost = Shipment.Cost By Shipment.ShippingZip _ Into Total = Sum(OrderCost), Average(OrderCost) |
In addition to aggregate expressions, the Into clause can also contain the keyword “Group”, which causes the individual records to be included for each group, as an array member. For each pairing of zip code and order date represented in the data, the following query expression returns two values, Zip and OrderDate, plus an array representing all the records in the group.
VB |
From Shipment In OrderTable _ Group By Zip = Shipment.ShippingZip, Shipment.OrderDate _ Into Group |
When there are no fields specified between “Group” and “By”, as above, all fields are included in the records in the Group array. Including specific fields narrows the information to the specified fields, as though the records were filtered through a Select clause. For each zip code and order date pairing represented in the data, the following query expression returns Zip and OrderDate values, plus an array of ID and Cost fields for each record in the group.
VB |
From Shipment In OrderTable _ Group ID = Shipment.OrderID, Shipment.Cost By _ Zip = Shipment.ShippingZip, Shipment.OrderDate _ Into Group |
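The Group member is easiest to understand by walking it with a nested loop. The sketch below runs the query above over a small in-memory array. The ShipmentRecord class and its sample values are hypothetical, and Option Infer On is assumed.

```vb
Imports System
Imports System.Linq

Module IntoGroupDemo
    ' Hypothetical record type standing in for one row of OrderTable.
    Class ShipmentRecord
        Public OrderID As Integer
        Public ShippingZip As String
        Public OrderDate As Date
        Public Cost As Decimal
    End Class

    Sub Main()
        ' Made-up sample data: orders 1 and 2 share a zip code and a date.
        Dim OrderTable = New ShipmentRecord() { _
            New ShipmentRecord With {.OrderID = 1, .ShippingZip = "98052", .OrderDate = #1/5/2008#, .Cost = 100D}, _
            New ShipmentRecord With {.OrderID = 2, .ShippingZip = "98052", .OrderDate = #1/5/2008#, .Cost = 50D}, _
            New ShipmentRecord With {.OrderID = 3, .ShippingZip = "10001", .OrderDate = #1/6/2008#, .Cost = 30D} _
        }

        Dim byZipAndDate = From Shipment In OrderTable _
                           Group ID = Shipment.OrderID, Shipment.Cost By _
                           Zip = Shipment.ShippingZip, Shipment.OrderDate _
                           Into Group

        ' Outer loop: one entry per (Zip, OrderDate) pairing.
        For Each pairing In byZipAndDate
            Console.WriteLine("{0} on {1:d}", pairing.Zip, pairing.OrderDate)
            ' Inner loop: the narrowed records (ID and Cost only) in that group.
            For Each item In pairing.Group
                Console.WriteLine("  order {0}: {1}", item.ID, item.Cost)
            Next
        Next
    End Sub
End Module
```

Because only ID and Cost were listed between “Group” and “By”, those are the only members available on each item in the inner loop.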
I’ve given a lot of examples because the syntax is fairly complex, but now we start to see how to convert various SQL statements with GROUP BY clauses to VB. Below is an example:
SQL |
SELECT OrderDate Date_Of_Order, SUM(Cost) Daily_Total FROM OrderTable GROUP BY OrderDate |
VB |
From Shipment In OrderTable _ Group By Date_Of_Order = Shipment.OrderDate _ Into Daily_Total = Sum(Shipment.Cost) |
HAVING
HAVING is another SQL clause, which specifies conditions for a group’s inclusion in the query results. VB has no corresponding clause, so the best way to re-create this with VB LINQ is to use a Where clause after a Group By clause, as shown below.
SQL |
SELECT OrderDate Date_Of_Order, SUM(Cost) Total_Cost FROM OrderTable GROUP BY OrderDate HAVING SUM(Cost) > 1000 |
VB |
From Shipment In OrderTable _ Group By Date_Of_Order = Shipment.OrderDate _ Into Total_Cost = Sum(Shipment.Cost) _ Where Total_Cost > 1000 |
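The following sketch shows the Where-after-Group-By pattern running against in-memory data. As before, ShipmentRecord and the sample values are hypothetical stand-ins for OrderTable, and Option Infer On is assumed. With this data, the shipments on the first date add up to 1,100 and those on the second to 300, so only the first group should pass the filter.

```vb
Imports System
Imports System.Linq

Module HavingDemo
    ' Hypothetical record type standing in for one row of OrderTable.
    Class ShipmentRecord
        Public OrderID As Integer
        Public OrderDate As Date
        Public Cost As Decimal
    End Class

    Sub Main()
        ' Made-up sample data: 700 + 400 on one date, 300 on another.
        Dim OrderTable = New ShipmentRecord() { _
            New ShipmentRecord With {.OrderID = 1, .OrderDate = #1/5/2008#, .Cost = 700D}, _
            New ShipmentRecord With {.OrderID = 2, .OrderDate = #1/5/2008#, .Cost = 400D}, _
            New ShipmentRecord With {.OrderID = 3, .OrderDate = #1/6/2008#, .Cost = 300D} _
        }

        ' The Where clause comes after Group By, so it filters whole groups
        ' by their aggregate value, which is what HAVING does in SQL.
        Dim bigDays = From Shipment In OrderTable _
                      Group By Date_Of_Order = Shipment.OrderDate _
                      Into Total_Cost = Sum(Shipment.Cost) _
                      Where Total_Cost > 1000

        For Each entry In bigDays
            Console.WriteLine("{0:d}: {1}", entry.Date_Of_Order, entry.Total_Cost)
        Next
    End Sub
End Module
```

A Where clause placed before the Group By clause would instead filter individual shipments, which corresponds to a SQL WHERE rather than HAVING.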
In my next post, I plan to cover various kinds of joins.
– Bill Horst, VB IDE Test
· Visual Studio 2008 (Beta2 or Higher)
Categories: LINQ to Objects
Introduction:
LINQ Cookbook, Recipe 11 showed how you can use LINQ queries to perform calculations on sets of data using a set of standard aggregate functions such as Average and Sum. In this recipe, you will learn how to add an extension method so that you can include your own custom aggregate function in a LINQ query.
This recipe adds two extension methods: StDev (standard deviation) and StDevP (standard deviation for the entire population). Because the extension methods are added to the IEnumerable(Of T) type, you can use the custom aggregate functions in the Into clause of an Aggregate, Group By, or Group Join query clause. Each extension method has two overloads: one that takes input values of type IEnumerable(Of Double), and another that takes input values of type IEnumerable(Of T). This lets you call the custom aggregate functions whether your LINQ query returns a collection of type Double or of any other numeric type. The IEnumerable(Of T) overload uses a Func(Of T, Double) lambda expression to project the numeric values as values of type Double before calculating the standard deviation. For values of type Double, you can simply call the StDev() or StDevP() overloads. For values of any other numeric type, pass the value to the StDev(value) or StDevP(value) overloads so that it is projected as type Double.
Instructions:
· Create a Console Application.
· After the End Module statement of the default Module1 module, add the following class, which contains both the StDev and StDevP functions.
Class StatisticalFunctions
    Public Shared Function StDev(ByVal values As Double()) As Double
        Return CalculateStDev(values, False)
    End Function

    Public Shared Function StDevP(ByVal values As Double()) As Double
        Return CalculateStDev(values, True)
    End Function

    Private Shared Function CalculateStDev(ByVal values As Double(), _
                                           ByVal entirePopulation As Boolean) As Double
        Dim count As Integer = 0
        Dim var As Double = 0
        Dim prec As Double = 0
        Dim dSum As Double = 0
        Dim sqrSum As Double = 0
        Dim adjustment As Integer = 1
        If entirePopulation Then adjustment = 0
        For Each val As Double In values
            dSum += val
            sqrSum += val * val
            count += 1
        Next
        If count > 1 Then
            var = count * sqrSum - (dSum * dSum)
            prec = var / (dSum * dSum)
            ' Double is only guaranteed for 15 digits. A difference
            ' with a result less than 0.000000000000001 will be considered zero.
            If prec < 0.000000000000001 OrElse var < 0 Then
                var = 0
            Else
                var = var / (count * (count - adjustment))
            End If
            Return Math.Sqrt(var)
        End If
        ' Fewer than two values: Nothing converts to 0 for a Double.
        Return Nothing
    End Function
End Class
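For reference, CalculateStDev uses the one-pass "sum of squares" form of the standard deviation, where n is count, the two sums are dSum and sqrSum, and a is the adjustment variable (1 for StDev, 0 for StDevP). The worked numbers below use a made-up data set, not values from the recipe.

```latex
\[
  s \;=\; \sqrt{\frac{n\sum_i x_i^2 \;-\; \bigl(\sum_i x_i\bigr)^2}{n\,(n-a)}},
  \qquad
  a = \begin{cases} 1 & \text{StDev (sample)} \\ 0 & \text{StDevP (entire population)} \end{cases}
\]

% Worked example for the values 2, 4, 4, 4, 5, 5, 7, 9:
\[
  n = 8, \qquad \sum_i x_i = 40, \qquad \sum_i x_i^2 = 232, \qquad
  n\sum_i x_i^2 - \Bigl(\sum_i x_i\Bigr)^2 = 1856 - 1600 = 256
\]
\[
  \text{StDevP} = \sqrt{\frac{256}{8 \cdot 8}} = 2,
  \qquad
  \text{StDev} = \sqrt{\frac{256}{8 \cdot 7}} \approx 2.138
\]
```

The precision check in the code guards the subtraction in the numerator, which can lose significant digits when the two terms are nearly equal.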
· After the StatisticalFunctions class, add the following module to add the extension methods to IEnumerable to calculate the standard deviation for both IEnumerable(Of Double) and IEnumerable(Of T).
' <Extension()> requires Imports System.Runtime.CompilerServices at the top of the file.
Module StatisticalAggregates
    ' Calculate the stdev value for a collection of type Double.
    <Extension()> _
    Function StDev(ByVal stDevAggregate As IEnumerable(Of Double)) As Double
        Return StatisticalFunctions.StDev(stDevAggregate.ToArray())
    End Function

    ' Project the collection of generic items as type Double and calculate the stdev value.
    <Extension()> _
    Function StDev(Of T)(ByVal stDevAggregate As IEnumerable(Of T), _
                         ByVal selector As Func(Of T, Double)) As Double
        Dim values = (From element In stDevAggregate Select selector(element)).ToArray()
        Return StatisticalFunctions.StDev(values)
    End Function

    ' The two StDevP overloads described above would follow the same pattern,
    ' calling StatisticalFunctions.StDevP instead.
End Module
Enjoy,
–Beth Massi, VS Community
Get started with LINQ to XML in Visual Basic with these How-to Videos.
Enjoy,
–Beth Massi, VS Community
· Visual Studio 2008 (Beta2 or Higher)
Categories: LINQ to DataSet
Introduction:
You can use aggregate functions in LINQ queries to perform calculations on sets of data. Visual Basic includes a set of standard aggregate functions for LINQ queries: All, Any, Average, Count, LongCount, Max, Min, Sum. These functions are documented in the reference topics for the Aggregate Clause. You can use aggregate functions as part of the Aggregate clause for an entire set of data, or as part of the Into portion of a Group By or Group Join clause, where the aggregate function will be applied to each group of data as shown in the following example.
Dim customersByCountry = From cust In customers _
Order By cust.City _
Group By CountryName = cust.Country _
Into RegionalCustomers = Group, Count() _
Order By CountryName
In this recipe, you will create a Windows Forms application that queries for items indexed using Windows Desktop Search. The application returns the total count and total kilobytes for items that are documents, e-mails, or images. The application uses the OLEDB provider for Windows Desktop Search to retrieve item information and LINQ to DataSet to group the item information and apply the Count and Sum aggregate functions.
Instructions:
· Create a Windows Forms Application.
· From the Toolbox, drag a ListBox control, and a DataGridView control onto the form. Resize the form and controls as needed.
· Select the ListBox control. In the Properties page, locate the Items property and click the ellipsis (…) button to add items to the ListBox. Add the following three items and click OK:
o Document
o E-mail
o Image
· Double-click the ListBox control to edit the SelectedIndexChanged event and add the following code:
' Requires Imports System.Data.OleDb at the top of the file.
' Connect to the Desktop Search OLEDB provider. Return a list of items and the size
' of each item where the item type contains the search string selected from the ListBox.
Dim conn As New OleDbConnection("Provider=Search.CollatorDSO.1;" & _
                                "Persist Security Info=False;" & _
                                "Extended Properties='Application=Windows'")
Dim cmd As New OleDbCommand("SELECT System.ItemTypeText, System.Size " & _
                            "FROM SystemIndex WHERE " & _
                            "CONTAINS(System.ItemTypeText, '""" & ListBox1.SelectedItem & """')", conn)
Dim adapter As New OleDbDataAdapter(cmd)
Dim ds As New DataSet
adapter.Fill(ds, "SearchResults")

' Group the results from the Desktop Search. Count the number of items in each
' group and sum the size of the items in each group.
Dim query = From row In ds.Tables("SearchResults") _
            Let ItemTypeText = row("System.ItemTypeText"), _
                Size = CInt(row("System.Size")) _
            Group By ItemTypeText Into TotalItems = Count(), Bytes = Sum(Size) _
            Select Type = ItemTypeText, Count = TotalItems, _
                   Size = (Bytes / 1000).ToString("#,###") _
            Order By Type

' Display the grouped results.
DataGridView1.DataSource = query.ToList()
Press F5 to see the code run. Click on the different search terms in the ListBox to see the grouped results including the count of each item and the size (in kilobytes) of each item group.
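If you want to try the grouping logic without Windows Desktop Search, the sketch below builds a small DataTable by hand and runs a similar query over it in a console application. The rows are made up, and the column type (Long) is an assumption rather than what the search provider actually returns. It uses AsEnumerable and Field(Of T) from System.Data.DataSetExtensions and assumes Option Infer On.

```vb
Imports System
Imports System.Data
Imports System.Linq

Module DataSetGroupingDemo
    Sub Main()
        ' Hypothetical stand-in for the table filled by the OleDbDataAdapter.
        Dim table As New DataTable("SearchResults")
        table.Columns.Add("System.ItemTypeText", GetType(String))
        table.Columns.Add("System.Size", GetType(Long))

        ' Made-up rows: two text documents and one image.
        table.Rows.Add("Text Document", 2048L)
        table.Rows.Add("Text Document", 1024L)
        table.Rows.Add("JPEG Image", 50000L)

        ' Group by item type, then count the rows and sum the sizes per group.
        Dim query = From row In table.AsEnumerable() _
                    Let ItemTypeText = row.Field(Of String)("System.ItemTypeText"), _
                        Size = row.Field(Of Long)("System.Size") _
                    Group By ItemTypeText Into TotalItems = Count(), Bytes = Sum(Size) _
                    Select Type = ItemTypeText, Count = TotalItems, _
                           Kilobytes = Bytes / 1000 _
                    Order By Type

        For Each item In query
            Console.WriteLine("{0}: {1} item(s), {2} KB", item.Type, item.Count, item.Kilobytes)
        Next
    End Sub
End Module
```

Reading the columns with Field(Of T) gives the Let variables concrete types, so the Sum aggregate works on Long values instead of going through the CInt conversion used in the recipe.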