N Rocking N Rocking - 1 month ago 31
C# Question

Extension method for IQueryable left outer join using LINQ

I am trying to implement Left outer join extension method with return type

IQueryable
.

The function that I have written is as follows

public static IQueryable<TResult> LeftOuterJoin2<TOuter, TInner, TKey, TResult>(
this IQueryable<TOuter> outer,
IQueryable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)
{
return
from outerItem in outer
join innerItem in inner on outerKeySelector(outerItem)
equals innerKeySelector(innerItem) into joinedData
from r in joinedData.DefaultIfEmpty()
select resultSelector(outerItem, r);
}


It can't generate the query. The reason might be: I have used
Func<>
instead of
Expression<>
. I tried with
Expression<>
as well. It gives me an error on
outerKeySelector(outerItem)
line, which is
outerKeySelector
is a variable which is being used as a method

I found some discussions on SO (such as here) and CodeProjects, but those work for
IEnumerable
types not for
IQueryable
.

Answer

Intro

This question is very interesting. The problem is Funcs are delegates and Expressions are trees, they are completely different structures. When you use your current extension implementation it uses loops and executes your selectors on each step for each element and it works well. But when we talk about entity framework and LINQ we need tree traversal for translation it to SQL query. So it's a "little" harder than Funcs (but I like Expressions anyway) and there are some problems described below.

When you want to do left outer join you can use something like this (taken from here: How to implement left join in JOIN Extension method)

var leftJoin = p.Person.Where(n => n.FirstName.Contains("a"))
                   .GroupJoin(p.PersonInfo, 
                              n => n.PersonId,
                              m => m.PersonId,
                              (n, ms) => new { n, ms = ms.DefaultIfEmpty() })
                   .SelectMany(z => z.ms.Select(m => new { n = z.n, m ));

It is good, but it is not extension method we need. I guess you need something like this:

using (var db = new Database1Entities("..."))
{
     var my = db.A.LeftOuterJoin2(db.B, a => a.Id, b => b.IdA, 
         (a, b) => new { a, b, hello = "Hello World!" });
     // other actions ...
}

There are many hard parts in creating such extensions:

  • Creating complex trees manually, compiler will not help us here
  • Reflection is needed for methods like Where, Select, etc
  • Anonymous types (!! we need codegen here?? I hope no)

Steps

Consider 2 simple tables: A (columns: Id, Text) and B (Columns Id, IdA, Text).

Outer join could be implemented in 3 steps:

// group join as usual + use DefaultIfEmpty
var q1 = Queryable.GroupJoin(db.A, db.B, a => a.Id, b => b.IdA, 
                              (a, b) => new { a, groupB = b.DefaultIfEmpty() });

// regroup data to associated list a -> b, it is usable already, but it's 
// impossible to use resultSelector on this stage, 
// beacuse of type difference (quite deep problem: some anonymous type != TOuter)
var q2 = Queryable.SelectMany(q1, x => x.groupB, (a, b) => new { a.a, b });

// second regroup to get the right types
var q3 = Queryable.SelectMany(db.A, 
                               a => q2.Where(x => x.a == a).Select(x => x.b), 
                               (a, b) => new {a, b});

Code

Ok, I'm not such a good teller, here is he code I have (Sorry I was unable to format it better, but it works!):

public static IQueryable<TResult> LeftOuterJoin2<TOuter, TInner, TKey, TResult>(
        this IQueryable<TOuter> outer,
        IQueryable<TInner> inner,
        Expression<Func<TOuter, TKey>> outerKeySelector,
        Expression<Func<TInner, TKey>> innerKeySelector,
        Expression<Func<TOuter, TInner, TResult>> resultSelector)
    {

        // generic methods
        var selectManies = typeof(Queryable).GetMethods()
            .Where(x => x.Name == "SelectMany" && x.GetParameters().Length == 3)
            .OrderBy(x=>x.ToString().Length)
            .ToList();
        var selectMany = selectManies.First();
        var select = typeof(Queryable).GetMethods().First(x => x.Name == "Select" && x.GetParameters().Length == 2);
        var where = typeof(Queryable).GetMethods().First(x => x.Name == "Where" && x.GetParameters().Length == 2);
        var groupJoin = typeof(Queryable).GetMethods().First(x => x.Name == "GroupJoin" && x.GetParameters().Length == 5);
        var defaultIfEmpty = typeof(Queryable).GetMethods().First(x => x.Name == "DefaultIfEmpty" && x.GetParameters().Length == 1);

        // need anonymous type here or let's use Tuple
        // prepares for:
        // var q2 = Queryable.GroupJoin(db.A, db.B, a => a.Id, b => b.IdA, (a, b) => new { a, groupB = b.DefaultIfEmpty() });
        var tuple = typeof(Tuple<,>).MakeGenericType(
            typeof(TOuter),
            typeof(IQueryable<>).MakeGenericType(
                typeof(TInner)
                )
            );
        var paramOuter = Expression.Parameter(typeof(TOuter));
        var paramInner = Expression.Parameter(typeof(IEnumerable<TInner>));
        var groupJoinExpression = Expression.Call(
            null,
            groupJoin.MakeGenericMethod(typeof (TOuter), typeof (TInner), typeof (TKey), tuple),
            new Expression[]
                {
                    Expression.Constant(outer),
                    Expression.Constant(inner),
                    outerKeySelector,
                    innerKeySelector,
                    Expression.Lambda(
                        Expression.New(
                            tuple.GetConstructor(tuple.GetGenericArguments()),
                            new Expression[]
                                {
                                    paramOuter,
                                    Expression.Call(
                                        null,
                                        defaultIfEmpty.MakeGenericMethod(typeof (TInner)),
                                        new Expression[]
                                            {
                                                Expression.Convert(paramInner, typeof (IQueryable<TInner>))
                                            }
                                )
                                },
                            tuple.GetProperties()
                            ),
                        new[] {paramOuter, paramInner}
                )
                }
            );

        // prepares for:
        // var q3 = Queryable.SelectMany(q2, x => x.groupB, (a, b) => new { a.a, b });
        var tuple2 = typeof (Tuple<,>).MakeGenericType(typeof (TOuter), typeof (TInner));
        var paramTuple2 = Expression.Parameter(tuple);
        var paramInner2 = Expression.Parameter(typeof(TInner));
        var paramGroup = Expression.Parameter(tuple);
        var selectMany1Result = Expression.Call(
            null,
            selectMany.MakeGenericMethod(tuple, typeof (TInner), tuple2),
            new Expression[]
                {
                    groupJoinExpression,
                    Expression.Lambda(
                        Expression.Convert(Expression.MakeMemberAccess(paramGroup, tuple.GetProperty("Item2")),
                                           typeof (IEnumerable<TInner>)),
                        paramGroup
                ),
                    Expression.Lambda(
                        Expression.New(
                            tuple2.GetConstructor(tuple2.GetGenericArguments()),
                            new Expression[]
                                {
                                    Expression.MakeMemberAccess(paramTuple2, paramTuple2.Type.GetProperty("Item1")),
                                    paramInner2
                                },
                            tuple2.GetProperties()
                            ),
                        new[]
                            {
                                paramTuple2,
                                paramInner2
                            }
                )
                }
            );

        // prepares for final step, combine all expressinos together and invoke:
        // var q4 = Queryable.SelectMany(db.A, a => q3.Where(x => x.a == a).Select(x => x.b), (a, b) => new { a, b });
        var paramTuple3 = Expression.Parameter(tuple2);
        var paramTuple4 = Expression.Parameter(tuple2);
        var paramOuter3 = Expression.Parameter(typeof (TOuter));
        var selectManyResult2 = selectMany
            .MakeGenericMethod(
                typeof(TOuter),
                typeof(TInner),
                typeof(TResult)
            )
            .Invoke(
                null,
                new object[]
                    {
                        outer,
                        Expression.Lambda(
                            Expression.Convert(
                                Expression.Call(
                                    null,
                                    select.MakeGenericMethod(tuple2, typeof(TInner)),
                                    new Expression[]
                                        {
                                            Expression.Call(
                                                null,
                                                where.MakeGenericMethod(tuple2),
                                                new Expression[]
                                                    {
                                                        selectMany1Result,
                                                        Expression.Lambda( 
                                                            Expression.Equal(
                                                                paramOuter3,
                                                                Expression.MakeMemberAccess(paramTuple4, paramTuple4.Type.GetProperty("Item1"))
                                                            ),
                                                            paramTuple4
                                                        )
                                                    }
                                            ),
                                            Expression.Lambda(
                                                Expression.MakeMemberAccess(paramTuple3, paramTuple3.Type.GetProperty("Item2")),
                                                paramTuple3
                                            )
                                        }
                                ), 
                                typeof(IEnumerable<TInner>)
                            ),
                            paramOuter3
                        ),
                        resultSelector
                    }
            );

        return (IQueryable<TResult>)selectManyResult2;
    }

Usage

And the usage again:

db.A.LeftOuterJoin2(db.B, a => a.Id, b => b.IdA, 
       (a, b) => new { a, b, hello = "Hello World!" });

Looking at this you can think what is the sql query for all this? It might be huge. Guess what? It's quite small:

SELECT 
1 AS [C1], 
[Extent1].[Id] AS [Id], 
[Extent1].[Text] AS [Text], 
[Join1].[Id1] AS [Id1], 
[Join1].[IdA] AS [IdA], 
[Join1].[Text2] AS [Text2], 
N'Hello World!' AS [C2]
FROM  [A] AS [Extent1]
INNER JOIN  (SELECT [Extent2].[Id] AS [Id2], [Extent2].[Text] AS [Text], [Extent3].[Id]    AS [Id1], [Extent3].[IdA] AS [IdA], [Extent3].[Text2] AS [Text2]
    FROM  [A] AS [Extent2]
    LEFT OUTER JOIN [B] AS [Extent3] ON [Extent2].[Id] = [Extent3].[IdA] ) AS [Join1] ON [Extent1].[Id] = [Join1].[Id2]

Hope it helps.

Comments