.NET Junkie - Meanwhile... on the query side of my architecture

18 December 11

Meanwhile... on the query side of my architecture

Command-query separation is a common concept in the software industry. Although defined for behavior on a method level, the concept can be applied to classes and components as well. Fairly common are architectures where commands are separated from the rest of the system and sent as messages which are processed by ‘handlers’. This same concept of message passing is equally applicable on the query side. Unfortunately architectures like these are very uncommon. This article tries to change this. Two simple interfaces will change the look of your architecture... forever.

In my previous post I described how I design the command side of my architecture. The best part of this design is that it enables a lot of flexibility and lowers the overall complexity of the system, just by adding a simple interface to the system and grouping business logic in a certain manner. The design is founded on the SOLID principles and brought to life with Dependency Injection (although DI is optional). Please read that post if you haven’t yet, because this article refers to its content often.

It’s funny though, that I encountered the command/handler design a few years ago, but never understood why you would use two classes (a message and the behavior) for one single operation. It just didn’t seem very object oriented to me. It was only after I experienced problems with the old design, that the usefulness of the command/handler design became clear to me.

It was this type of discomfort that triggered me to think about the design of a different part of my application architecture. Although the part of the business layer that handles business processes was modeled uniformly and allowed great flexibility, the same didn’t hold for the part of the business layer that was responsible for querying. For the query side I modeled queries as methods with clear names and grouped them together in a class. This led to interfaces like the following:

public interface IUserQueries
{
    User[] FindUsersBySearchText(string searchText, 
        bool includeInactiveUsers);
 
    User[] GetUsersByRoles(string[] roles);
 
    UserInfo[] GetHighUsageUsers(int reqsPerDayThreshold);

    // More methods here
}

There is a variation of this pattern that a lot of developers use today in their applications. They mix this query class with the repository pattern. The repository pattern is used for CRUD operations. The following code might look familiar to you:

// Generic repository class (good)
public interface IRepository<T>
{
    T GetByKey(int key);

    void Save(T instance);

    void Delete(T instance);
}

// Custom entity-specific repository with query methods (awkward)
public interface IUserRepository : IRepository<User>
{
    User[] FindUsersBySearchText(string searchText, 
        bool includeInactiveUsers);
 
    User[] GetUsersByRoles(string[] roles);
 
    UserInfo[] GetHighUsageUsers(int reqsPerDayThreshold);

    // More methods here
}

Besides this IUserQueries interface, my application contained interfaces such as IPatientInfoQueries, ISurgeryQueries, and many, many more, each with its own set of methods, and each method with its own set of parameters and its own return type. Every interface was different, which made it very hard to add cross-cutting concerns such as logging, caching, profiling, security and audit trailing, which I often wanted to apply to all queries. Besides that, I missed the uniformity in the design that I had with my command handlers. Those query classes were just a bunch of random methods, often grouped around one concept or one entity. Or at least I tried. Still, it looked messy, and every time a query method was added, both the interface and the implementation of that interface had to be changed.

In my automated test suite it got even worse. A class under test that depended on such a query interface was often only expected to call one or two of the methods of that interface, while other classes were expected to call other methods on it. This led me to add asserts to my test suite to ensure a class didn’t call unexpected methods. This resulted in the creation of an abstract base class in my test project that implemented that interface. That abstract class looked like this:

public abstract class FakeFailingUserQueries : IUserQueries
{
    public virtual User[] FindUsersBySearchText(
        string searchText, bool includeInactive)
    {
        Assert.Fail("Call to this method was not expected.");
        return null;
    }
 
    public virtual User[] GetUsersByRoles(string[] roles)
    {
        Assert.Fail("Call to this method was not expected.");
        return null;
    }
        
    public virtual UserInfo[] GetHighUsageUsers(
        int requestsPerDayThreshold)
    {
        Assert.Fail("Call to this method was not expected.");
        return null;
    }

    // More methods here
}

For a certain set of tests I would then override this base class and implement one of the methods:

public class FakeUserServicesUserQueries : FakeFailingUserQueries
{
    public User[] UsersToReturn { get; set; }
 
    public string[] CalledRoles { get; private set; }
 
    public override User[] GetUsersByRoles(string[] roles)
    {
        this.CalledRoles = roles;
 
        return this.UsersToReturn;
    }
}

This way I could let all other methods fail, since they were not expected to be called, which saved me from writing too much code and made it harder to make mistakes in my tests. However, this still led to an explosion of test classes in my test projects.

Of course all the described problems can be solved with ‘proper’ tooling. For instance, cross-cutting concerns can be added by using compile-time code weaving (using PostSharp for instance), or by configuring your DI container using convention-based registration, mixed with interception, which uses dynamic proxy generation and lightweight code generation. The testing problems could be fixed by using mocking frameworks, which also generate proxy classes that act like the original class.

Although all these solutions work, they only make things more complicated, and they are patches to hide problems with the initial design. When we validate the design against the five SOLID principles, we can see clearly where the problem lies. The design violates three out of five SOLID principles.

The Single Responsibility Principle is violated, because the methods in such a class are not highly cohesive. The only thing that relates those methods is the fact that they belong to the same concept or entity.

The design violates the Open/Closed Principle, because almost every time a query is added to the system, an existing interface (and its implementations) needs to be changed. Every interface has at least two implementations (one real implementation and one test implementation).

The Interface Segregation Principle is violated, because the interfaces are wide (have many methods) and consumers of those interfaces are forced to depend on methods that they don’t use.

So let us not treat the symptoms; let’s fix the cause.

A better design

Instead of having a separate interface per group of queries, we can define a single interface for all queries in the system, just as we saw with the ICommandHandler<TCommand> interface in my previous article. We need to define the following two interfaces:

public interface IQuery<TResult>
{
}
 
public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}

The IQuery<TResult> interface specifies a message that defines a specific query, together with the type of data it returns (the TResult generic type argument). This interface doesn’t have any members (note that it is not a marker interface, since it contains a generic type argument) and doesn’t look very useful, but bear with me; I will explain later on why having such an interface is crucial.

Commands are (most often) fire-and-forget operations that (usually) don’t return a value. Queries are the opposite: they will (or at least they should) not change state, and they do return a value.

With the previously defined interface we can define a query message like this:

public class FindUsersBySearchTextQuery : IQuery<User[]>
{
    public string SearchText { get; set; }
 
    public bool IncludeInactiveUsers { get; set; }
}

This class defines a query operation with two parameters, which will result in an array of User objects. Just as the command, this class is a Parameter Object. The class that handles this message can be defined as follows:

public class FindUsersBySearchTextQueryHandler
    : IQueryHandler<FindUsersBySearchTextQuery, User[]>
{
    private readonly NorthwindUnitOfWork db;
 
    public FindUsersBySearchTextQueryHandler(
        NorthwindUnitOfWork db)
    {
        this.db = db;
    }
 
    public User[] Handle(FindUsersBySearchTextQuery query)
    {
        return (
            from user in this.db.Users
            where user.Name.Contains(query.SearchText)
            select user)
            .ToArray();
    }
}

Just as we’ve seen with the command handlers, we can now let consumers depend on the generic IQueryHandler interface:

public class UserController : Controller
{
    IQueryHandler<FindUsersBySearchTextQuery, User[]> handler;
 
    public UserController(
        IQueryHandler<FindUsersBySearchTextQuery, User[]> handler)
    {
        this.handler = handler;
    }
 
    public ActionResult SearchUsers(string searchString)
    {
        var query = new FindUsersBySearchTextQuery
        {
            SearchText = searchString,
            IncludeInactiveUsers = false
        };
 
        User[] users = this.handler.Handle(query);

        return this.View(users);
    }
}

Immediately this model gives us a lot of flexibility, because we can now decide what to inject into the UserController. As we’ve seen in the previous article, we can inject a completely different implementation, or one that wraps the real implementation, without having to make changes to the UserController (and all other consumers of that interface).
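To make the "completely different implementation" concrete, here is a minimal sketch of a reusable test fake. It is an illustration rather than code from my own projects: the FakeQueryHandler class and the CountUsersQuery message are hypothetical names, and the two interfaces are repeated only so that the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the article so that this sketch compiles on its own.
public interface IQuery<TResult>
{
}

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}

// Hypothetical test helper: records every query it receives and
// returns a canned result instead of hitting the database.
public sealed class FakeQueryHandler<TQuery, TResult>
    : IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    private readonly TResult resultToReturn;

    public FakeQueryHandler(TResult resultToReturn)
    {
        this.resultToReturn = resultToReturn;
        this.HandledQueries = new List<TQuery>();
    }

    // The queries passed to Handle, in call order.
    public List<TQuery> HandledQueries { get; private set; }

    public TResult Handle(TQuery query)
    {
        this.HandledQueries.Add(query);
        return this.resultToReturn;
    }
}

// Hypothetical query message, used only to demonstrate the fake.
public sealed class CountUsersQuery : IQuery<int>
{
    public string SearchText { get; set; }
}
```

A test can now create a new FakeQueryHandler<CountUsersQuery, int>(3), pass it to the class under test, and afterwards inspect HandledQueries. Because a consumer only sees the one handler it was given, there are no "unexpected" methods left to guard against, so something like the FakeFailingUserQueries base class is no longer needed.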

By the way, here is where the IQuery<TResult> interface comes into play. First of all, it prevents us from having to cast the return type from object to User[] in this case. It therefore gives us compile-time support when working with the handler. But on top of that, it gives us compile-time support when specifying or injecting IQueryHandlers in our code. When we change the FindUsersBySearchTextQuery to return UserInfo[] instead (by implementing IQuery<UserInfo[]>), the UserController will fail to compile, since the generic type constraint on IQueryHandler<TQuery, TResult> won't be able to map FindUsersBySearchTextQuery to User[].

Injecting the IQueryHandler interface into a consumer, however, has a few less obvious problems that need to be addressed. One is the number of dependencies our consumers might get. When a constructor takes too many arguments, this is called constructor over-injection (the rule of thumb is that a constructor should take no more than 5 arguments). This is an anti-pattern and is often a sign that the Single Responsibility Principle (SRP) is violated. Although it is important to adhere to the SRP, it is very likely that consumers execute multiple different queries without really violating the SRP. This is in contrast with injecting many ICommandHandler<TCommand> implementations, which would certainly be a violation of the SRP. In my experience, the number of queries a class executes changes frequently, and every such change also changes the number of constructor arguments.

Another shortcoming of this approach is that the generic structure of the IQueryHandler<TQuery, TResult> leads to a lot of infrastructural code, which makes reading the code harder. Take for instance the following class:

public class Consumer
{
    IQueryHandler<FindUsersBySearchTextQuery, IQueryable<UserInfo>> findUsers;
    IQueryHandler<GetUsersByRolesQuery, IEnumerable<User>> getUsers;
    IQueryHandler<GetHighUsageUsersQuery, IEnumerable<UserInfo>> getHighUsage;
 
    public Consumer(
        IQueryHandler<FindUsersBySearchTextQuery, IQueryable<UserInfo>> findUsers,
        IQueryHandler<GetUsersByRolesQuery, IEnumerable<User>> getUsers,
        IQueryHandler<GetHighUsageUsersQuery, IEnumerable<UserInfo>> getHighUsage)
    {
        this.findUsers = findUsers;
        this.getUsers = getUsers;
        this.getHighUsage = getHighUsage;
    }
}

Wow!! That’s a lot of code. And this class only has three different queries it wishes to execute. This is caused by the verbosity of the C# language. A way around this (besides switching to another language) is to use a T4 template that generates the constructor for you in a new partial class. In the previous example, this would leave you with just the three lines defining the private fields. The generic typing would still be a bit verbose, but with C# there's not much we can do about that.

So how do we fix the problem of having to inject too many IQueryHandlers? As always, with an extra layer of abstraction :-). We can create a mediator that sits in between the consumers and the query handlers:

public interface IQueryProcessor
{
    TResult Process<TResult>(IQuery<TResult> query);
}

The IQueryProcessor is a non-generic interface with one (generic) method. As you can see in the interface definition, the IQueryProcessor depends on the IQuery<TResult> interface. This allows us to have compile-time support in our consumers that depend on the IQueryProcessor. Let’s rewrite the UserController to use the new IQueryProcessor:

public class UserController : Controller
{
    private IQueryProcessor queryProcessor;
 
    public UserController(IQueryProcessor queryProcessor)
    {
        this.queryProcessor = queryProcessor;
    }
 
    public ActionResult SearchUsers(string searchString)
    {
        var query = new FindUsersBySearchTextQuery
        {
            SearchText = searchString
        };
 
        // Note how we omit the generic type argument,
        // but still have type safety.
        User[] users = this.queryProcessor.Process(query);

        return this.View(users);
    }
}

See how the UserController now depends on an IQueryProcessor that can handle all kinds of queries. The UserController’s SearchUsers method now calls the IQueryProcessor.Process method, supplying the query object. Since the FindUsersBySearchTextQuery implements the IQuery<User[]> interface, we can supply it to the generic Process<TResult>(IQuery<TResult> query) method. Because of C# type inference, the compiler is able to determine the generic type, which saves us from having to spell it out. And because of this, the return type of the Process method is also known. Thus, when we let the FindUsersBySearchTextQuery implement a different interface (say IQuery<IQueryable<User>>), the UserController will not compile anymore, instead of failing at runtime.

Now it is the responsibility of the implementation of the IQueryProcessor interface to find out which IQueryHandler it should get to execute. It takes a bit of dynamic typing, and optionally the use of a Dependency Injection framework, and it can be done with just a few lines of code:

sealed class QueryProcessor : IQueryProcessor
{
    private readonly Container container;

    public QueryProcessor(Container container)
    {
        this.container = container;
    }

    [DebuggerStepThrough]
    public TResult Process<TResult>(IQuery<TResult> query)
    {
        var handlerType = typeof(IQueryHandler<,>)
            .MakeGenericType(query.GetType(), typeof(TResult));

        dynamic handler = container.GetInstance(handlerType);

        return handler.Handle((dynamic)query);
    }
}

This QueryProcessor class constructs a specific IQueryHandler<TQuery, TResult> type based on the supplied query instance. This type is used to ask the supplied container class to get a new instance of that handler. Unfortunately we need to call the Handle method using reflection (by using the C# 4.0 dynamic keyword in this case), because at that point it is impossible to cast the handler instance, since the generic TQuery argument is not available at compile time. However, unless the Handle method is renamed or gets other arguments, this call will never fail, and if you insist, it is very easy to write a unit test for that. There’s only a slight drop in performance when doing this, but nothing much to worry about (especially when you're using the Simple Injector as your DI framework, because it is blazingly fast).
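Since the DI framework is optional here, the same dispatch can be sketched without a container. The following is only an illustration under my own assumptions: DictionaryQueryProcessor, its Register method, and the EchoLengthQuery types are hypothetical names, and the handlers are simply kept in a dictionary keyed by the closed handler interface type.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the article so that this sketch compiles on its own.
public interface IQuery<TResult>
{
}

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}

public interface IQueryProcessor
{
    TResult Process<TResult>(IQuery<TResult> query);
}

// Hypothetical container-free processor: handlers are looked up by
// their closed IQueryHandler<TQuery, TResult> interface type.
public sealed class DictionaryQueryProcessor : IQueryProcessor
{
    private readonly Dictionary<Type, object> handlers =
        new Dictionary<Type, object>();

    public void Register<TQuery, TResult>(
        IQueryHandler<TQuery, TResult> handler)
        where TQuery : IQuery<TResult>
    {
        this.handlers[typeof(IQueryHandler<TQuery, TResult>)] = handler;
    }

    public TResult Process<TResult>(IQuery<TResult> query)
    {
        // Build the closed handler type from the runtime query type.
        Type handlerType = typeof(IQueryHandler<,>)
            .MakeGenericType(query.GetType(), typeof(TResult));

        object handler;

        if (!this.handlers.TryGetValue(handlerType, out handler))
        {
            throw new InvalidOperationException(
                "No handler registered for " + query.GetType().Name);
        }

        // Same late-bound call as in the QueryProcessor above.
        return ((dynamic)handler).Handle((dynamic)query);
    }
}

// Hypothetical query and handler, used only for the demonstration.
public sealed class EchoLengthQuery : IQuery<int>
{
    public string Text { get; set; }
}

public sealed class EchoLengthQueryHandler
    : IQueryHandler<EchoLengthQuery, int>
{
    public int Handle(EchoLengthQuery query)
    {
        return query.Text.Length;
    }
}
```

After calling processor.Register<EchoLengthQuery, int>(new EchoLengthQueryHandler()), a consumer can write int length = processor.Process(new EchoLengthQuery { Text = "abc" }) without spelling out any generic type argument. Unlike a container, this sketch holds on to a single handler instance per query type, so it is mainly useful in tests or in very small applications.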

I did consider an alternative design of the IQueryProcessor interface by the way, that looked like this:

public interface IQueryProcessor
{
    TResult Process<TQuery, TResult>(TQuery query)
        where TQuery : IQuery<TResult>;
}

This interface does solve the problem of having to do dynamic typing in the QueryProcessor implementation completely, but unfortunately the C# compiler isn’t ‘smart’ enough to find out which types are needed (damn you Anders!), which forces us to completely write out the call to Process, including both the generic arguments. This gets really ugly in code and is therefore not advisable. I was a bit amazed by this, because I was under the assumption that the C# compiler could infer this. However, the more I think about this, the more it makes sense that the C# compiler doesn't do this.
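To show what this looks like at the call site, here is a small sketch of the alternative design. The names ITwoArgQueryProcessor, TypedQueryProcessor, PingQuery, PingQueryHandler and InferenceDemo are hypothetical and exist only for this illustration; the resolver delegate stands in for whatever container you use.

```csharp
using System;

// Repeated from the article so that this sketch compiles on its own.
public interface IQuery<TResult>
{
}

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}

// The alternative design, renamed here to avoid a clash with
// the IQueryProcessor interface that I ended up using.
public interface ITwoArgQueryProcessor
{
    TResult Process<TQuery, TResult>(TQuery query)
        where TQuery : IQuery<TResult>;
}

public sealed class TypedQueryProcessor : ITwoArgQueryProcessor
{
    private readonly Func<Type, object> resolve;

    public TypedQueryProcessor(Func<Type, object> resolve)
    {
        this.resolve = resolve;
    }

    public TResult Process<TQuery, TResult>(TQuery query)
        where TQuery : IQuery<TResult>
    {
        // Both type arguments are known statically, so a plain cast
        // is enough and no dynamic call is needed.
        var handler = (IQueryHandler<TQuery, TResult>)
            this.resolve(typeof(IQueryHandler<TQuery, TResult>));

        return handler.Handle(query);
    }
}

public sealed class PingQuery : IQuery<string>
{
}

public sealed class PingQueryHandler : IQueryHandler<PingQuery, string>
{
    public string Handle(PingQuery query)
    {
        return "pong";
    }
}

public static class InferenceDemo
{
    public static string Run(ITwoArgQueryProcessor processor)
    {
        var query = new PingQuery();

        // The next line does not compile (error CS0411): TResult only
        // occurs in the constraint, and generic constraints take no
        // part in type inference.
        // string result = processor.Process(query);

        // So both type arguments have to be written out.
        return processor.Process<PingQuery, string>(query);
    }
}
```

The implementation side is clearly nicer here, but every consumer pays for it with a call like Process<PingQuery, string>(query), and with real query names and result types such as IQueryable<UserInfo> those calls get long quickly.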

There’s one important thing you should be aware of when using the IQueryProcessor abstraction. By injecting an IQueryProcessor, we make it unclear which queries a consumer is using. This makes unit testing more fragile, since the constructor doesn’t state what the class depends on. Besides this, we make it harder for our DI framework to verify the object graph that is created, since the creation of IQueryHandler implementations is postponed by the IQueryProcessor. Being able to verify the container's configuration is very important. Using the IQueryProcessor means that you will have to write a test that checks if there is a corresponding query handler for each query in the system, because the DI framework cannot check this for you. I believe I personally can live with this in the applications I worked on, but I wouldn’t use such an abstraction too often. I wouldn’t want to have an ICommandProcessor for executing commands, for instance. The reason for this is that consumers are less likely to take a dependency on many command handlers. And if they do, you are probably violating the SRP anyway.
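Such a test could be sketched roughly as follows. This is a minimal illustration, not the test from my own code base: QueryHandlerCoverage and FindQueriesWithoutHandler are hypothetical names, and the canResolve delegate is an assumption that stands in for asking your own container whether a given handler type is registered.

```csharp
using System;
using System.Collections.Generic;

// Repeated from the article so that this sketch compiles on its own.
public interface IQuery<TResult>
{
}

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Handle(TQuery query);
}

// Hypothetical helper for a configuration test.
public static class QueryHandlerCoverage
{
    // Returns the concrete query types among candidateTypes for which
    // canResolve reports that no matching handler type is available.
    public static List<Type> FindQueriesWithoutHandler(
        IEnumerable<Type> candidateTypes, Func<Type, bool> canResolve)
    {
        var missing = new List<Type>();

        foreach (Type queryType in candidateTypes)
        {
            // Skip interfaces, abstract classes and open generic types.
            if (queryType.IsAbstract || queryType.IsGenericTypeDefinition)
            {
                continue;
            }

            foreach (Type implemented in queryType.GetInterfaces())
            {
                if (!implemented.IsGenericType ||
                    implemented.GetGenericTypeDefinition() != typeof(IQuery<>))
                {
                    continue;
                }

                // Build IQueryHandler<TQuery, TResult> for this query.
                Type resultType = implemented.GetGenericArguments()[0];
                Type handlerType = typeof(IQueryHandler<,>)
                    .MakeGenericType(queryType, resultType);

                if (!canResolve(handlerType))
                {
                    missing.Add(queryType);
                }
            }
        }

        return missing;
    }
}

// Hypothetical queries and a single handler for the demonstration.
public sealed class CoveredQuery : IQuery<int>
{
}

public sealed class UncoveredQuery : IQuery<string>
{
}

public sealed class CoveredQueryHandler : IQueryHandler<CoveredQuery, int>
{
    public int Handle(CoveredQuery query)
    {
        return 42;
    }
}
```

In a real test you would pass something like typeof(IQuery<>).Assembly.GetTypes() (or the types of the assembly that holds your query messages) as candidateTypes, and assert that the returned list is empty. Listing the offending query types in the failure message makes it immediately clear which handler is missing.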

One word of advice: When you start using this design, start out without the IQueryProcessor abstraction because of the reasons I described. It can always be added later on without any problem.

A consequence of the design based on the IQueryHandler interface is that there will be a lot of small classes in the system. Having a lot of small, focused classes (with clear names) is a good thing, but in this scenario it might give some overhead, since every query handler would have a constructor that takes some dependencies and stores them in private fields (like I said, C# is very verbose for doing constructor injection, but it's currently still the best language we've got).

There are ways to remove that overhead (besides using the T4 template I described before), if it bothers you. You can for instance merge multiple query handlers into a single class, as follows:

public class UserQueryHandlers :
    IQueryHandler<FindUsersBySearchTextQuery, User[]>,
    IQueryHandler<GetUsersByRolesQuery, User[]>,
    IQueryHandler<GetHighUsageUsersQuery, UserInfo[]>
{
    private readonly NorthwindUnitOfWork db;
 
    public UserQueryHandlers(NorthwindUnitOfWork db)
    {
        this.db = db;
    }
 
    public User[] Handle(FindUsersBySearchTextQuery query)
    {
        return (
            from user in this.db.Users
            where user.Name.Contains(query.SearchText)
            select user)
            .ToArray();
    }
 
    public User[] Handle(GetUsersByRolesQuery query)
    {
        // Query here
        throw new NotImplementedException();
    }
 
    public UserInfo[] Handle(GetHighUsageUsersQuery query)
    {
        // Query here
        throw new NotImplementedException();
    }

    // More methods here.
}

Although this UserQueryHandlers class really looks like the initial design we tried to prevent, there is one crucial difference: it implements IQueryHandler<TQuery, TResult> multiple times, once per query. This allows us to register this class multiple times, once per implemented interface. Although this class again violates the SRP, it still gives us all the previously described advantages (and doesn't violate the OCP and ISP).

When using a Dependency Injection framework, we can often register all query handlers with a single call (depending on your framework of choice), simply because all handlers implement the IQueryHandler<TQuery, TResult> interface. Your mileage may vary, but with the Simple Injector, the registration looks like this:

container.RegisterManyForOpenGeneric(typeof(IQueryHandler<,>),
    typeof(IQueryHandler<,>).Assembly);

This line of code saves you from having to change the DI configuration when you add new query handlers to the system. They will be picked up automatically.

With this in place we can now add cross-cutting concerns such as logging, audit trailing, and what have you. Or let’s say you want to decorate properties of the query objects with Data Annotations attributes, to be able to do validation. This might look like this:

public class FindUsersBySearchTextQuery : IQuery<User[]>
{
    // Required and StringLength are attributes from the
    // System.ComponentModel.DataAnnotations assembly.
    [Required]
    [StringLength(1)]
    public string SearchText { get; set; }
 
    public bool IncludeInactiveUsers { get; set; }
}

Because we modeled our query handlers around a single IQueryHandler<TQuery, TResult> interface, we can define a simple decorator that allows us to do this:

public class ValidationQueryHandlerDecorator<TQuery, TResult>
    : IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    private readonly IServiceProvider provider;
    private readonly IQueryHandler<TQuery, TResult> decorated;
 
    [DebuggerStepThrough]
    public ValidationQueryHandlerDecorator(
        Container container,
        IQueryHandler<TQuery, TResult> decorated)
    {
        this.provider = container;
        this.decorated = decorated;
    }
 
    [DebuggerStepThrough]
    public TResult Handle(TQuery query)
    {
        var validationContext =
            new ValidationContext(query, this.provider, null);
 
        Validator.ValidateObject(query, validationContext);

        return this.decorated.Handle(query);
    }
}

This decorator allows us to validate query objects before they get sent to their handler, without us having to change a single line of code in the application, except the following single line of code in our start-up path:

container.RegisterDecorator(typeof(IQueryHandler<,>),
    typeof(ValidationQueryHandlerDecorator<,>));

And if you're paranoid about performance and worry that it will add too much overhead to wrap query handlers that don't need validation with a decorator, Simple Injector allows you to easily configure a conditional decorator:

container.RegisterDecorator(typeof(IQueryHandler<,>),
    typeof(ValidationQueryHandlerDecorator<,>),
    context => ShouldQueryHandlerBeValidated(context.ServiceType));

The applied predicate is evaluated just once per closed generic IQueryHandler<TQuery, TResult> type, so there's no performance loss in registering a conditional decorator, or at least, with the Simple Injector there isn't. As I said, your mileage may vary when using other DI frameworks. Of course, you will still ha