Dynamic Searching using LINQ – Dataset + Joins


About two years ago I posted some code that shows how to dynamically construct LINQ queries at runtime.  On average there’s a couple questions per month about trying it in some edge case, and usually the support already exists within the API.  Recently though someone posted a question that has to do with Datasets and Joins that raised some interesting issues:

I have a dataset with two datatables (One Loaded From A CSV file) and (the other loaded from a query) what I would like to be able to do is set the criteria (i.e dtCSV.Field1=dtQuery.Field7 AND (dtCSV.Field5 != dtQuery.Field4 or dtCSV.Field5 != dtQuery.Field9)

Returning dtQuery.Field1, dtCSV.Field5

Previously the API supported conditions of the form “object.property=value”, but here we have something of the form “object1.property = object2.property”.  It’s still a completely valid expression, but it means the API will need to generate a different expression tree.

Let’s walk through how you can add this support to the API:

1. At the end of Module DynamicQuery add the following extension method:

‘Extension method that can be called off any type that implements IEnumerable(Of T),

    ‘which constructs a Condition with T as the element type and compares one property (or field) with another

    <Extension(), EditorBrowsable(EditorBrowsableState.Always)> _

    Public Function CreateMutliColumnCondition(Of T)(ByVal dataSource As IEnumerable(Of T), _

                                             ByVal propName As String, _

                                             ByVal condType As condition.Compare, _

                                             ByVal propName2 As String) As Condition(Of T)


        Return Condition.CreateMutliColumn(dataSource, propName, condType, propName2)

    End Function

2. Add this method to the Condition class:

 ‘Constructs a Condition with T as the element type that compares one field/property with another

        Public Shared Function CreateMutliColumn(Of T)(ByVal dataSource As IEnumerable(Of T), _

                                            ByVal propertyName As String, _

                                            ByVal condType As Compare, _

                                            ByVal propertyName2 As String) As Condition(Of T)


            Return New Condition(Of T)(propertyName, condType, propertyName2)

        End Function

3. Add the following code to the Condition(Of T) class:


       ‘Returns a MemberExpression for a property or field access.  Also handles nesting scenarios

        Private Function GetPropertyExpression(ByVal propName As String, ByVal paramExpr As ParameterExpression) As MemberExpression

            ‘Split the string to handle nested property access

            Dim s = propName.Split(“.”c)


            ‘Get the PropertyInfo instance for propName

            Dim pInfo = GetType(T).GetProperty(s(0))

            Dim callExpr = Expression.MakeMemberAccess(paramExpr, pInfo)


            ‘For each member specified, construct the additional MemberAccessExpression

            ‘For example, if the user says “myCustomer.Order.OrderID = 4” we need an

            ‘additional MemberAccessExpression for “Order.OrderID = 4”

            For i = 1 To UBound(s)

                pInfo = pInfo.PropertyType.GetProperty(s(i))

                callExpr = Expression.MakeMemberAccess(callExpr, pInfo)


            Return callExpr

        End Function


        Friend Sub New(ByVal propName As String, ByVal condType As Compare, ByVal propName2 As String)

            Dim paramExpr = GetParamInstance(GetType(T))


            Dim b As Expression = CombineExpression(GetPropertyExpression(propName, paramExpr), _

                                                    condType, _

                                                    GetPropertyExpression(propName2, paramExpr))


            LambdaExpr = Expression.Lambda(Of Func(Of T, Boolean))(b, New ParameterExpression() {paramExpr})


            ‘Compile the lambda expression into a delegate

            del = DirectCast(LambdaExpr.Compile(), Func(Of T, Boolean))

        End Sub

4. (optionally) the old constructor in this method should be refactored to use the helper method we introduced in #3.

        Friend Sub New(ByVal propName As String, ByVal condType As Compare, ByVal value As Object, ByVal valueType As Type)

            Dim paramExpr = GetParamInstance(GetType(T))


            Dim callExpr = GetPropertyExpression(propName, paramExpr)


            ‘ConstantExpression representing the value on the left side of the operator

            Dim valueExpr = Expression.Constant(value, valueType)


            Dim b As Expression = CombineExpression(callExpr, condType, valueExpr)

            LambdaExpr = Expression.Lambda(Of Func(Of T, Boolean))(b, New ParameterExpression() {paramExpr})


            ‘Compile the lambda expression into a delegate

            del = DirectCast(LambdaExpr.Compile(), Func(Of T, Boolean))

        End Sub


Now that the API supports it, we can do joins against Datasets:

        ‘The code in this method is equivalent to the following query, but constructed dynamically

        ‘Dim filteredQuery = From csvRow In dtCSV _

                            Join orderRow In dtOrders On csvRow.CustomerID Equals orderRow.CustomerID


        Dim dtCSV As New myDataset.dtCSVDataTable


        dtCSV.Rows.Add(1, “Redmond”)

        dtCSV.Rows.Add(2, “Toronto”)

        dtCSV.Rows.Add(3, “Seattle”)


        Dim dtOrders As New myDataset.dtOrdersDataTable


        dtOrders.Rows.Add(10001, 1)

        dtOrders.Rows.Add(10002, 3)

        dtOrders.Rows.Add(10003, 4)

        dtOrders.Rows.Add(10004, 2)



        Dim query = From csvRow In dtCSV _

                    From orderRow In dtOrders


        Dim c = query.CreateMutliColumnCondition(“csvRow.CustomerID”, Compare.Equal, “orderRow.CustomerID”)


        Dim filteredQuery = query.Where(c)


        DataGridView1.DataSource = (From row In filteredQuery _

                                    Select row.orderRow.OrderID, row.csvRow.CustomerID, row.csvRow.City).ToList()


The key line here is this one:

        Dim c = query.CreateMutliColumnCondition(“csvRow.CustomerID”, Compare.Equal, “orderRow.CustomerID”)

Instead of comparing a column to a value we can now compare one column to another (either in the same table or in a different table).

<geeky overload resolution details>

Why did I name the method CreateMultiColumnCondition instead of just adding another overload called CreateCondition?  To avoid introducing a breaking change!  If you had code that called CreateCondition with the last parameter as a String, the compiler would infer the type parameter at that position to be String.  But now with a new overload that also has a parameter of type String in that position, the compiler would now choose the new overload because that method is “less generic” (i.e. a closer match).

I don’t like the idea of having to think through advanced overload resolution details to figure out which method will actually get called, so it seemed easier to just rename the method 🙂


A better (and likely more performant) implementation would actually introduce an overload of Join similar to this one for Where:

    ‘Filters an IEnumerable(Of T) according to the specified condition

    <Extension()> _

    Public Function Where(Of T)(ByVal source As IEnumerable(Of T), _

                                ByVal condition As Condition(Of T)) As IEnumerable(Of T)

        Return source.Where(condition.del)

    End Function


This would involve a bit more work in the API though as the Join method takes 3 delegates rather than 1.  (so Condition(Of T) would have to generate 3 smaller trees rather than one big one).  It’s already almost 6 on a Friday though and I’m lazy, so this is left as an exercise to the reader 🙂


Full VS2008 .sln is attached.






Leave a comment

Feedback usabilla icon