This post contains a solution for a problem I often face when writing a specific class of queries in an ORM framework. The simplest example of this class of query is: imagine we store user date, like images, and the users can assign a set of tags to their images. They might use tags like “Summer 2016”, “Outdoors”, “Friends”, etc.
Now we want to implement a search functionality where the user can specify sets of tags they wish to see photos of. For instance a user query might look like:
("Summer 2016" AND "Outdoors") OR ("Friends")
Let’s first consider a simplified version, with the following query:
"Outdoors" OR "Friends"
Iterative Unions (Don’t Work)
In this case we can take, as input to our function, a list of the tags we want to OR
together. Then we could use an iterative application of Union
to produce the desired
result set.
IQueryable<UserImage> AllImages;
public IQueryable<UserImage> TagQuery(IEnumerable<string> tags)
{
var result = Enumerable.Empty<UserImage>().AsQueryable();
foreach (var tag in tags)
{
result = result.Union(AllImages.Where(image => image.Tags.Contains(tag)));
}
return result;
}
This technically works, but at the time of writing this EF Core currently does not support
Union
. This will result in a query to the database for each tag a user supplies rather than
one single query which returns all of the images matching the user’s input.
Expression Form
In an ideal solution, we would have a single Where
expression that contains an
Expression<Func<UserImage, bool>>
that returns true
on images we want, and false
on
images we don’t. An Expression<Func<,>>
can be made in one of two ways. The most common
way is off limits to us, we cannot write a compile time constant lambda expression that
will satisfy the needs of our query because the terms in the expression will vary depending
on the user’s input. The expression we’re aiming for looks something like:
image => image.Tags.Contains("Outdoors") || image.Tags.Contains("Friends")
So the next question might be, can we construct this expression by building it up out of several compile time expressions? Perhaps surprisingly, the answer is yes.
LINQ Expression Solution
I’d like to quickly show what the solution looks like, describe it in a little detail and then finally show the implementation at the very bottom. Here’s what the solution looks like:
var tags = new[] { "Outdoors", "Friends" };
var exp = tags.SelectExp<string, UserImage, bool>(tag => image => image.Tags.Contains(tag))
.BinaryCombinator(Expression.Or);
return AllImages.Where(exp);
The non-standard SelectExp
and BinaryCombinator
create the expression we’re looking for.
In this example we create a subexpression for each tag in tags
and ||
them all together.
This produces exactly the expression we were looking for in the previous section.
SelectExp
This extension method is really just syntactic sugar, it makes writing a Select
that takes
lambda expressions a little more convenient. The full body is simple:
public static IEnumerable<Expression<Func<T, U>>> SelectExp<V, T, U>(this IEnumerable<V> vs, Func<V, Expression<Func<T, U>>> selector)
{
return vs.Select(selector);
}
The import thing to note is that we’re writing a Select
which returns an Expression<Func<T, U>>
, which means the selector is a lambda whose return value is a lambda. It’s a bit
meta, but it then allows us to create an IEnumerable<Expression<Func<T, U>>>
which contains
one expression per tag that we care about. The result of the SelectExp
in our example looks
something like:
[
image => image.Tags.Contains("Outdoors"),
image => image.Tags.Contains("Friends")
]
BinaryCombinator
Now comes the tricky part, we have a list of lambda expressions that are pretty close to what
we want. The last problem is to just ||
all these expressions together. This is kind of
hand wavey, but the steps to do this look roughly as follows.
- Remove the image parameter part of the lambdas
[ image.Tags.Contains("Outdoors"), image.Tags.Contains("Friends") ]
- Or all the terms together
image.Tags.Contains("Outdoors") || image.Tags.Contains("Friends")
- Create a new lambda out of the result
image => image.Tags.Contains("Outdoors") || image.Tags.Contains("Friends")
All these steps are done inside of BinaryCombinator
which also generalizes step #2 to any
aggregation of expression terms, meaning it can work for ||
as well as &&
.
Full Solution Source
I’ve left out a few subtle details in the explanation, but for those who care I think the
source of the solution contains a better explanation than I can write out. The one really
unexplained piece is ConstantReplacer
, which is necessary because of the way that C#
lambdas capture free variables. I would highly encourage anyone who is curious to just
debug into some of these methods, the debug view of expressions is really quite nice.
Disclosure: use in production at your own risk, this likely contains bugs and/or security flaws
public static class LINQExpressionExtensions
{
public static Expression<Func<TElement, bool>> BinaryCombinator<TElement>(
this IEnumerable<Expression<Func<TElement, bool>>> lambdas, Func<Expression, Expression, Expression> combinator)
{
var parameter = Expression.Parameter(typeof(TElement), "element");
var constantReplacer = new ConstantReplacer();
return Expression.Lambda<Func<TElement, bool>>(
lambdas
.Select(lambda => new ParameterReplacerVisitor(lambda.Parameters[0], parameter).VisitAndConvert(lambda.Body, "BinaryCombinator"))
.Select(body => constantReplacer.VisitAndConvert(body, "BinaryCombinator"))
.Aggregate(combinator), parameter);
}
class ConstantReplacer : ExpressionVisitor
{
protected override Expression VisitMember(MemberExpression node)
{
var member = node.Member;
if (node.Expression.NodeType == ExpressionType.Constant && member.MemberType == MemberTypes.Field)
{
var field = member.DeclaringType.GetField(member.Name);
return Expression.Convert(
Expression.Constant(field.GetValue(((ConstantExpression)node.Expression).Value)),
field.FieldType);
}
return base.VisitMember(node);
}
}
class ParameterReplacerVisitor : ExpressionVisitor
{
ParameterExpression ReplaceWith;
ParameterExpression ToReplace;
public ParameterReplacerVisitor(ParameterExpression toReplace, ParameterExpression replaceWith)
{
ToReplace = toReplace;
ReplaceWith = replaceWith;
}
protected override Expression VisitParameter(ParameterExpression node)
{
return node == ToReplace ? ReplaceWith : base.VisitParameter(node);
}
}
public static IEnumerable<Expression<Func<T, U>>> SelectExp<V, T, U>(this IEnumerable<V> vs, Func<V, Expression<Func<T, U>>> selector)
{
return vs.Select(selector);
}
}