Published on April 10th, 2010 at 17:45

Virtual methods outside of classes: The Visitor Pattern

A great many years ago (i.e., in 2003), I worked at the research institution “CT SE2” at Siemens in Munich, Germany. At that time, a friend and I implemented a Profiling-API-based AOP tool for the .NET 1.1 runtime, and that tool had to be written in C++. Because C++ is a difficult language that can be devastatingly complex, but also extremely elegant, depending on how you use it, I started to read the excellent Conversations column published by Jim Hyslop and Herb Sutter in Dr. Dobb’s Journal. In Conversations, the two authors would present common C++ idioms, pitfalls, and patterns wrapped in short stories about a young programmer who is taught those things by the Guru.

One of the stories, called “By Any Other Name”, was about the Visitor pattern. The story described how Visitor was not a good name for the pattern:

“Consider the parable yet again. What is Personnel::Accept?” – “Uh, um?” I said rather dimly. “This class implements the Poorly Named Pattern, or PNP. It is also known as the Visitor pattern.” “Uh, um!” I said, a little more brightly now. “I’ve read about Visitor. But that’s just for having objects that can iteratively visit each other. Isn’t it?” She sighed. “A common misapprehension. Think that the V means, not so much Visitor, but Virtual,” she explained. “The PNP’s most useful application is to enable the addition of virtual functions to an existing class hierarchy without further changing that hierarchy.”

The Visitor pattern is often seen as a solution for iterating over an inhomogeneous collection of objects. The story was about how it’s really much more than that – it enables “the addition of virtual functions to an existing class hierarchy without further changing that hierarchy”, as the Guru described. But even if you already know the Visitor pattern, it might not be exactly clear at first glance what this actually means.

To explain this idea, let’s try to motivate an implementation of the pattern via a vitamin-packed example. Consider a class hierarchy, such as Fruit, Apple, and Orange, with Apple and Orange each derived from Fruit. Consider further that you have the requirement to be able to eat a piece of fruit, with the algorithm of course depending on what kind of fruit it is. You might realize this operation by adding an abstract method to the Fruit class and implementing it in the derived classes:

abstract class Fruit
{
  public abstract void BeEaten (Person eater);
}

class Apple : Fruit
{
  public override void BeEaten(Person eater)
  {
    while (HasFlesh)
      eater.Bite (this);
  }

  // ...
}

class Orange : Fruit
{
  public override void BeEaten(Person eater)
  {
    eater.Peel (this);
    while (Slices.Count != 0)
    {
      var slice = Slices.RemoveNext ();

      while (slice.HasFlesh)
      {
        eater.Bite (slice);
      }
    }
  }

  // ...
}

And after implementing this piece of code, you’d hopefully think, “BeEaten? What kind of method name is that?”

The awkwardness of the method name is a sign that the code is not in the right location – in reality, it is not the fruit that should know how to be eaten; the person should know how to eat the different kinds of fruit. Time for refactoring – a naive Person.Eat method could look like this:

class Person
{
  public void Eat (Fruit fruit)
  {
    var apple = fruit as Apple;
    if (apple != null)
    {
      while (apple.HasFlesh)
        Bite (apple);

      return;
    }

    var orange = fruit as Orange;
    if (orange != null)
    {
      Peel (orange);
      while (orange.Slices.Count != 0)
      {
        var slice = orange.Slices.RemoveNext ();
        while (slice.HasFlesh)
        {
          Bite (slice);
        }
      }

      return;
    }

    throw new NotSupportedException (  
        "Don't know how to eat a(n) " + fruit.GetType ().Name + ".");
  }

  // ...

}

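Now, the code is in the right place (the Person class), but that kind of type test is not good coding practice: every new Fruit subclass requires another branch in Eat, and the compiler cannot tell us when one is missing, so the omission only surfaces at run time as a NotSupportedException.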

A better solution is to apply the Visitor pattern. Effectively, a Person is just a FruitVisitor:

interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
}

class Person : IFruitVisitor
{
  public void Eat (Fruit fruit)
  {
    fruit.Accept (this);
  }

  void IFruitVisitor.VisitApple(Apple apple)
  {
    while (apple.HasFlesh)
      Bite (apple);
  }

  void IFruitVisitor.VisitOrange(Orange orange)
  {
    Peel (orange);
    while (orange.Slices.Count != 0)
    {
      var slice = orange.Slices.RemoveNext ();
      while (slice.HasFlesh)
      {
        Bite (slice);
      }
    }
  }

  // ...
}

Now, Fruit only needs an abstract method Accept (IFruitVisitor visitor), which Apple implements by calling visitor.VisitApple (this), and Orange implements by calling visitor.VisitOrange (this). The Accept method dispatches the method call to the right Visit... method of the IFruitVisitor, and because this happens after the person.Eat (fruit) method call is dispatched to the Person.Eat method implementation, the concept is also called double dispatch.
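As a minimal sketch of the Accept plumbing just described, the fruit classes might look roughly like this. The hierarchy is reduced to the dispatch code only, and NameVisitor is a purely illustrative visitor that is not part of the original example; it merely records which Visit... method was reached.

```csharp
interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
}

abstract class Fruit
{
  // First dispatch: the virtual call selects the override
  // that matches the runtime type of the fruit.
  public abstract void Accept (IFruitVisitor visitor);
}

class Apple : Fruit
{
  // Second dispatch: inside Apple, 'this' is statically known to be
  // an Apple, so the matching Visit... method can be called directly.
  public override void Accept (IFruitVisitor visitor)
  {
    visitor.VisitApple (this);
  }
}

class Orange : Fruit
{
  public override void Accept (IFruitVisitor visitor)
  {
    visitor.VisitOrange (this);
  }
}

// Hypothetical visitor for illustration: remembers which method ran.
class NameVisitor : IFruitVisitor
{
  public string LastVisited = "";

  public void VisitApple (Apple apple) { LastVisited = "apple"; }
  public void VisitOrange (Orange orange) { LastVisited = "orange"; }
}
```

Given a variable of static type Fruit that holds an Orange, calling Accept with a NameVisitor leaves LastVisited set to "orange", even though the caller never tests the fruit's type.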

My, this is a nice implementation, isn’t it? The code is in the right place, we don’t need awkward type checks – by implementing the interface, the compiler will even tell us if we forget to implement the algorithm for a specific kind of fruit!

Back to the original question: Nice as the Visitor-based implementation might be, how exactly did it allow us to add a new virtual method to the Fruit hierarchy without actually changing that hierarchy?

Well, first of all, it’s not really a virtual method that is added. Actually, it’s just code that differentiates between whether it is executed for an Apple or an Orange, just like a virtual method overridden in Apple and Orange would. For brevity, let’s call that type-dependent code because it is code that depends on the type of the fruit instance.

Second, “without changing that hierarchy” is true only when the Visitor pattern is already in place in the class hierarchy. There must already be an IFruitVisitor interface, and the Fruit classes must define Accept methods.

With these clarifications, the question is now: How exactly did the Visitor pattern allow us to add new type-dependent code to the Fruit hierarchy (which implements the Accept method) without actually changing that hierarchy?

In our example, the type-dependent code is of course the implementation of the algorithms to eat the different kinds of fruit. If you compare the last listing with the first one, you can see that the VisitApple and VisitOrange methods contain more or less the same code we would have written into the overrides of a virtual BeEaten method. By using the Visitor pattern, we were therefore able to add that type-dependent code (which usually requires a virtual method) without actually touching the Fruit hierarchy. And this is what the Guru meant in the story above.
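To make this concrete, here is a hedged sketch of a second operation added purely from the outside. NeedsPeelingVisitor and FruitQueries are hypothetical names invented for illustration. The hierarchy is reduced to the Accept plumbing and is repeated here only so that the listing is self-contained; nothing in it changes for the new operation.

```csharp
interface IFruitVisitor
{
  void VisitApple (Apple apple);
  void VisitOrange (Orange orange);
}

// Existing hierarchy: only the Accept hook is required.
abstract class Fruit
{
  public abstract void Accept (IFruitVisitor visitor);
}

class Apple : Fruit
{
  public override void Accept (IFruitVisitor visitor) { visitor.VisitApple (this); }
}

class Orange : Fruit
{
  public override void Accept (IFruitVisitor visitor) { visitor.VisitOrange (this); }
}

// New type-dependent code, written without touching Fruit, Apple, or Orange.
// Each Visit... method plays the role of one override of a virtual method.
class NeedsPeelingVisitor : IFruitVisitor
{
  public bool Result { get; private set; }

  public void VisitApple (Apple apple) { Result = false; }
  public void VisitOrange (Orange orange) { Result = true; }
}

static class FruitQueries
{
  // Wrapper that makes the visitor look like an ordinary method call.
  public static bool NeedsPeeling (Fruit fruit)
  {
    var visitor = new NeedsPeelingVisitor ();
    fruit.Accept (visitor);
    return visitor.Result;
  }
}
```

From the caller's point of view, FruitQueries.NeedsPeeling (fruit) behaves much like a virtual NeedsPeeling method on Fruit would, yet all of its code lives outside the hierarchy.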

If you think about it, the Accept method defined by the pattern is nothing other than an extensibility hook in your class hierarchy for external type-dependent code: one general-purpose virtual method for others to use at their convenience.

There remains one question, of course: When to use the Visitor pattern? Should we just avoid virtual methods altogether and instead put all type-dependent code into visitor classes? Of course not. Virtual methods are still the way to go for type-dependent code that belongs in the class hierarchy itself. The Visitor pattern is for type-dependent code that belongs somewhere else. It helps you achieve better separation of concerns by allowing you to put your code where it belongs.

And this, as I’m sure the Guru would agree, is something to meditate on.

- Fabian

Comments

uTILLIty - April 11th, 2010 at 17:43

hi Fabian, why do you use the visitor here in the first place? why not use a strategy? I see no need for double-dispatching here. The only reason in my opinion to use a visitor pattern is that the object being visited must apply logic (such as passing me to its child-elements), which I cannot or should not know about as visitor.

regards, uTILLIty

Fabian Schmied - April 12th, 2010 at 08:33

Hi uTILLIty,

The double dispatch is needed here because there is different code to be executed depending on whether a person is eating an apple or an orange. I could put that code into virtual BeEaten methods on the Fruit classes, but I don’t really like that from an SoC perspective.

By your suggestion, I could add a virtual member "EatingStrategy" to the Fruit class, with Apple returning a different strategy than Orange. I assume this is what you meant, right?

The strategy approach would provide a better separation of concerns than the virtual method does, but it still requires me to add the concept of eating to the Fruit class. The double dispatch approach does not require me to do this. Therefore, I consider the double dispatch approach to allow even better SoC.

As an additional argument, consider the Fruit class being defined in another library – then I just can’t add new members to it. That was the point the Guru made in the Conversations article linked above.

The Visitor pattern is not about the visited object applying logic to the visitor; it’s really only about the double dispatch. Think about it as an extension point for type-dependent logic. Once it is in place, it can be used to add new operations to the class hierarchy without touching that hierarchy.

Here are the arguments given by the GoF (Gamma et al., Design Patterns) for when to use the Visitor pattern:

– an object structure contains many classes of objects with differing interfaces, and you want to perform operations on these objects that depend on their concrete classes

=> This is the type-dependent code I talked about.

– many distinct and unrelated operations need to be performed on objects in an object structure, and you want to avoid "polluting" their classes with these operations. […]

=> This is the SoC argument.

– the classes defining the object structure rarely change, but you often want to define new operations over the structure. […]

=> This is the point about visitors being an extensibility point for a stable class hierarchy – once defined, visitors can be used to add new operations very easily even when the class hierarchy can’t be changed.

So, although the sample is – of course – overly simplified, I think the Visitor pattern does make sense here.

Tell me what you think about this.

Cheers, Fabian

Stefan Wenig - April 12th, 2010 at 12:14

A way to create virtual methods, but put them somewhere outside the class because a) they just don’t belong there (SoC) or b) you cannot modify the class.

Cool. Mixins can do that too, as I’m sure you’ve noticed. And I’ll just leave it there :)

uTILLIty - April 12th, 2010 at 14:36

Hi Fabian,

I didn’t mean adding the strategy on the fruits, but on the person, i.e.:

Person.Eat(Fruit fruit)
{
  using (var eatingStrategy = GetEatingStrategyForFruit(fruit)) {
    eatingStrategy.Eat(fruit);
  }
}

Stefan Wenig - April 12th, 2010 at 15:04

uTILLIty,

double dispatching (DD) solves the problem, but that doesn’t mean that there aren’t other ways. Visitor is a pattern that simulates DD, and you get DD’s advantages with it. E.g., you could derive a class from Person and override the method to eat Oranges.

But there is any number of ways to solve the problem of invoking one method for apples and one for oranges. The factory/strategy approach just looks LESS obvious to a patterns-aware reader, and you have to do all the plumbing (one service per method, writing each individual factory or setting up an IoC container). A visitor can solve that a bit more elegantly using polymorphism (just once).

Stefan
