Published on May 9th, 2010 at 17:29

The second problem of Visitors – and a solution

In my last post, I explained that one big weakness of the Visitor design pattern is that it can only be used when supported by the visited classes. In LINQ, I explained, we have a good scenario for the Visitor pattern because we have to define a lot of algorithms, the specifics of which depend on the node types present in an expression tree. Since, however, the LINQ expression class designers didn’t provide the necessary hooks, we can’t make use of the pattern (and are left with a bad simulation of Visitor that is built on runtime type checks).

I promised that in this post I’d talk about the second big weakness of the Visitor design pattern, but before I do, I’d like to reiterate why Visitor is a Good Thing despite its weaknesses. I’ll quote the good consequences of using Visitor from Design Patterns by Gamma, Helm, Johnson, and Vlissides, the famous Gang of Four (the explanations are my own):

Visitor makes adding new operations easy. It defines a framework that, when added to a class, allows easy addition of new operations. Classic case of Open/Closed.
A visitor gathers related operations and separates unrelated ones. It provides clean separation of concerns and avoids unrelated operations being entangled in a single class.
Visiting across class hierarchies. It allows processing of object graphs in which not all objects have a single base class.
Accumulating state. The visitor implementation can gather state while operating on the objects in the graph being visited.

Those are good consequences of using the pattern. But Design Patterns also lists that second big disadvantage I want to talk about: Adding new ConcreteElement classes is hard. What does that mean?

To explain, remember that Visitor requires an interface (or base class) that defines the contract of all visitor implementations:

interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
}

This interface is required for the double dispatch mechanism Visitor is based on: the visited object decides what visitor method to call based on its own type.
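
To make the double dispatch concrete, here is a minimal sketch of the visited side. The Fruit base class, the Accept methods, the FruitNamer visitor, and the DoubleDispatchDemo helper are assumptions made up for illustration; only IFruitVisitor, Apple, and Orange come from the example above.

```csharp
using System;

interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
}

// Hypothetical common base type; the example above only names Apple and Orange.
abstract class Fruit
{
  public abstract void Accept (IFruitVisitor visitor);
}

class Apple : Fruit
{
  // First dispatch: the virtual call on Fruit selects Apple.Accept.
  // Second dispatch: Apple selects the visitor method matching its own type.
  public override void Accept (IFruitVisitor visitor) { visitor.VisitApple (this); }
}

class Orange : Fruit
{
  public override void Accept (IFruitVisitor visitor) { visitor.VisitOrange (this); }
}

// Hypothetical visitor that records which Visit method was called last.
class FruitNamer : IFruitVisitor
{
  public string LastName { get; private set; }

  public void VisitApple (Apple apple) { LastName = "apple"; }
  public void VisitOrange (Orange orange) { LastName = "orange"; }
}

static class DoubleDispatchDemo
{
  // The caller only knows the static type Fruit; no runtime type check is needed.
  public static string NameOf (Fruit fruit)
  {
    var namer = new FruitNamer ();
    fruit.Accept (namer);
    return namer.LastName;
  }
}
```

Calling DoubleDispatchDemo.NameOf with an Apple ends up in VisitApple, and with an Orange in VisitOrange, although the caller never inspects the concrete type.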

So, the visitor interface contains one method per visitable class; in our example, those would be Apple and Orange. Now consider what happens if we add a new class, Boysenberry. To be able to write algorithms visiting instances of this new type, we’d need to add a VisitBoysenberry method to the interface. And thus to every existing visitor class based on that interface. Of which there might be many. In a nutshell: Adding new ConcreteElement classes is hard.
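
A small sketch of that ripple effect, assuming a hypothetical pre-existing visitor called FruitCounter (not part of the original example): once the interface gains VisitBoysenberry, every implementer has to be edited and recompiled, even if it has nothing special to do for the new type.

```csharp
using System;

class Apple { }
class Orange { }
class Boysenberry { }   // the newly added element class

interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
  void VisitBoysenberry (Boysenberry boysenberry);   // new member forced by Boysenberry
}

// Hypothetical visitor that existed before Boysenberry was introduced.
class FruitCounter : IFruitVisitor
{
  public int Count { get; private set; }

  public void VisitApple (Apple apple) { Count++; }
  public void VisitOrange (Orange orange) { Count++; }

  // Without this method the class no longer compiles, because it would not
  // implement every member of IFruitVisitor.
  public void VisitBoysenberry (Boysenberry boysenberry) { Count++; }
}
```

The same edit has to be repeated in every other visitor based on the interface, including visitors in assemblies you may not own.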

It might even be impossible.

We had that problem in the new SQL backend we’re currently building for re-linq. re-linq’s frontend defines the expression visitor interface, but the SQL backend defines a number of additional SQL-specific expression types. Adapting the visitor interface to add Visit... methods for those new expression types would require adding a dependency from the frontend to the SQL backend, which we didn’t want at all.

There are alternatives to the Visitor pattern that do not require the class hierarchy to be stable, some of which re-linq uses in places where a lot of extensibility is needed (e.g., a dictionary mapping node types to implementations of a strategy interface). But in the situation of transforming expression trees? Shouldn’t the Visitor pattern be the cleanest, most object-oriented approach to this problem?
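
For comparison, the dictionary-based alternative could look roughly like the following sketch. It is not re-linq’s actual code: INodeHandler, ConstantHandler, AddHandler, and HandlerRegistry are hypothetical names, and keying the dictionary by ExpressionType (the value of Expression.NodeType) is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical strategy interface: one implementation per kind of node.
interface INodeHandler
{
  string Describe (Expression expression);
}

class ConstantHandler : INodeHandler
{
  public string Describe (Expression expression)
  {
    return "constant " + ((ConstantExpression) expression).Value;
  }
}

class AddHandler : INodeHandler
{
  public string Describe (Expression expression) { return "addition"; }
}

// Hypothetical registry: handlers can be added at runtime, so supporting
// a new node type does not require changing any existing interface.
class HandlerRegistry
{
  private readonly Dictionary<ExpressionType, INodeHandler> _handlers =
      new Dictionary<ExpressionType, INodeHandler> ();

  public void Register (ExpressionType nodeType, INodeHandler handler)
  {
    _handlers[nodeType] = handler;
  }

  public string Describe (Expression expression)
  {
    INodeHandler handler;
    if (_handlers.TryGetValue (expression.NodeType, out handler))
      return handler.Describe (expression);

    // No handler registered for this node type.
    return "unhandled " + expression.NodeType;
  }
}
```

The price of this flexibility is that the compiler no longer checks whether every node type has a handler; a missing registration only shows up at runtime.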

So we thought hard about how to overcome this lack of flexibility, and this is our solution:

[Figure: visitors in re-linq]

The .NET framework (System.Core) defines the standard expression types. Normally, it would also define the standard visitor interface (or base class), but as mentioned before, re-linq has to simulate the Visitor implementation in its frontend (Remotion.Data.Linq). In that assembly, there are also several visitor implementations operating on the standard expression types, for example a PartialEvaluator. Those classes implement the standard visitor interface (deriving from ExpressionTreeVisitor) and thus only know how to deal with standard expression types.

The SQL backend (Remotion.Data.Linq.SqlBackend) defines additional expression types, e.g., SqlCaseExpression. It also defines special visitor interfaces for visitors that can handle the additional expressions, such as ISqlSpecificExpressionVisitor. Expression visitors in the backend, e.g., SqlGenerator, implement both the standard and the custom visitor interfaces and are thus able to visit both standard and custom expression nodes.

Here’s what the Accept method of SqlCaseExpression looks like:

public override Expression Accept (ExpressionTreeVisitor visitor)
{
  ArgumentUtility.CheckNotNull ("visitor", visitor);

  var specificVisitor = visitor as ISqlSpecificExpressionVisitor;
  if (specificVisitor != null)
    return specificVisitor.VisitSqlCaseExpression (this);
  else
    return base.Accept (visitor);
}

[Side note: the re-motion team has recently switched to Visual Studio 2010. As you can see, the Copy As HTML add-in can now copy the syntax highlighting performed by ReSharper. Here’s a description of how to adapt Copy As HTML to run in VS 2010.]

As you can see, the concept of custom expressions and dedicated visitors now requires a runtime type check: being of a non-standard expression type, the SqlCaseExpression instance has to determine whether the visitor can deal with it; only visitors implementing the ISqlSpecificExpressionVisitor interface have a VisitSqlCaseExpression method to dispatch to.

But what happens if a standard visitor, such as PartialEvaluator, is used to visit an object graph containing a custom expression, such as SqlCaseExpression?

The answer is in the call to base.Accept(). As you may have noticed, the listing above and the picture don’t actually match. Expression does not have an Accept method in .NET 3.5. So how can SqlCaseExpression override it?

The answer is that SqlCaseExpression is not derived directly from Expression, but from ExtensionExpression. That class contains boilerplate code for the Visitor pattern, and it contains the following base implementation of Accept:

public virtual Expression Accept (ExpressionTreeVisitor visitor)
{
  ArgumentUtility.CheckNotNull ("visitor", visitor);

  return visitor.VisitUnknownExpression (this);
}
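
The interplay of the two Accept methods can be tried out with the following self-contained sketch. The types are simplified stand-ins for the classes named in this post, not the real re-linq API: Accept returns a string instead of an Expression so the outcome is easy to inspect, the ArgumentUtility checks are omitted, and StandardVisitor and SqlAwareVisitor are hypothetical visitors invented for this example.

```csharp
using System;

// Simplified stand-in for the standard visitor base class.
abstract class ExpressionTreeVisitor
{
  // Catch-all for extension expressions the visitor has no dedicated method for.
  public abstract string VisitUnknownExpression (ExtensionExpression expression);
}

// Simplified stand-in for the base class of all custom expressions.
abstract class ExtensionExpression
{
  public virtual string Accept (ExpressionTreeVisitor visitor)
  {
    return visitor.VisitUnknownExpression (this);
  }
}

interface ISqlSpecificExpressionVisitor
{
  string VisitSqlCaseExpression (SqlCaseExpression expression);
}

class SqlCaseExpression : ExtensionExpression
{
  public override string Accept (ExpressionTreeVisitor visitor)
  {
    // Runtime type check: can this visitor handle SQL-specific nodes?
    var specificVisitor = visitor as ISqlSpecificExpressionVisitor;
    if (specificVisitor != null)
      return specificVisitor.VisitSqlCaseExpression (this);

    // Otherwise fall back to the catch-all in the base class.
    return base.Accept (visitor);
  }
}

// Hypothetical visitor that only knows the standard expression types.
class StandardVisitor : ExpressionTreeVisitor
{
  public override string VisitUnknownExpression (ExtensionExpression expression)
  {
    return "unknown";
  }
}

// Hypothetical backend visitor implementing the additional interface as well.
class SqlAwareVisitor : ExpressionTreeVisitor, ISqlSpecificExpressionVisitor
{
  public override string VisitUnknownExpression (ExtensionExpression expression)
  {
    throw new NotSupportedException ("Unsupported extension expression.");
  }

  public string VisitSqlCaseExpression (SqlCaseExpression expression)
  {
    return "CASE";
  }
}
```

In this sketch, accepting a StandardVisitor on a SqlCaseExpression ends up in VisitUnknownExpression, while accepting a SqlAwareVisitor is dispatched to VisitSqlCaseExpression.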

So, if a visitor does not support a custom expression, a catch-all VisitUnknownExpression method is called. Depending on its purpose, a visitor can implement that method to ignore the unknown expression, throw an exception, or … do something completely different.

What that is, I will explain in my next blog post.

- Fabian
