(Coursenotes for CSC 305 Individual Software Design and Development)

Visitor design pattern

Topics for today

  • Project 1 design discussion
  • Design patterns overview and composite design review
  • Visitor design pattern
  • Pattern matching and sealed types

Design patterns

Design patterns are general, re-usable solutions to commonly occurring problems within a given context in software design. They offer templates to solve problems that can be used in multiple contexts.

For example, last week we talked about the Composite design pattern. Recall that the Composite pattern is useful when you need to provide some functionality for a tree-like structure. It turns out that this “pattern” is applicable to a number of different problem contexts. Some examples that we talked about are:

  • Traversing an HTML document
  • Reading files and folders in a file system
  • Performing tree operations in a Bintree
  • Evaluating an expression tree (like you are doing in Project 1)

In 1994, a group of four authors wrote what was to become a famous book about Design Patterns (titled Design Patterns). The book describes three broad classes of Design Patterns:

  • Behavioural Patterns: identifying common communication patterns between objects and realizing these patterns
  • Structural Patterns: organizing different classes and objects to form larger structures and provide new functionality
  • Creational Patterns: provide the capability to create objects based on a required criterion and in a controlled way

The Composite pattern is billed as a “Structural” pattern, because it involves organising your modules into a tree structure (which presumably makes sense for that problem context).

Visitor design pattern

The Visitor design pattern is a “behavioural” pattern. It makes sense when you need to perform some task on all objects in a complex structure (like a graph or a tree). The underlying classes get “visited” by some code which executes on each object in the structure.

At this point, you may wonder about the difference between the Visitor pattern and the Composite pattern. It’s true, they’re similar in focus and intent. Let’s consider an example.

Example use-case of Visitor Design

(Example from Refactoring Guru)

Suppose you’re working on an app that maintains a large graph of geographical information. Each node represents a complex entity in the graph, like a city, sightseeing area, industry, shopping mall, etc. Depending on its type, each node has various details that make up its internal state, but everything is a “node” in the graph.

You’re asked to export the entire graph to some format, like XML. This is a pretty common ask: you often want to transmit data in some language-agnostic format so that different subsystems can operate on the same data.

Each different type of node in the graph will need to write out its salient details, meaning that the “export” operation looks different for each node. Moreover, the export of one node (like a SightseeingArea) might lead to the export of its other component nodes (like a Museum or a Landmark). So, like we did with Composite design, we could make use of polymorphism and recursion to implement an “export to XML” function for each type of node.
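To make this concrete, here is a minimal sketch of the "export from inside the nodes" approach. The GeoNode interface, the name fields, and the XML layout are all assumptions made for illustration; the notes do not specify what the real nodes contain.

```java
import java.util.List;

// Hypothetical node types. The names follow the running example, but the
// fields and the XML layout are assumptions. No XML escaping is done.
interface GeoNode {
    String toXml();
}

record Museum(String name) implements GeoNode {
    public String toXml() {
        return "<museum name=\"" + name + "\"/>";
    }
}

record Landmark(String name) implements GeoNode {
    public String toXml() {
        return "<landmark name=\"" + name + "\"/>";
    }
}

record SightseeingArea(String name, List<GeoNode> parts) implements GeoNode {
    public String toXml() {
        StringBuilder xml = new StringBuilder("<sightseeingArea name=\"" + name + "\">");
        for (GeoNode part : parts) {
            xml.append(part.toXml()); // recursion: each component exports itself
        }
        return xml.append("</sightseeingArea>").toString();
    }
}
```

Note that every node type had to be edited to add toXml(), which is exactly the kind of change the rest of this section tries to avoid.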

What are some drawbacks of adding the XML export behaviour to the existing graph nodes?

  • It requires us to modify an existing, fairly complex data structure that is already in production. Bugs in the new code would impact existing users.
  • The graph’s primary purpose is to model geographic data. An argument can be made that an XML export function would reduce the class’s cohesion.
  • What if somewhere down the line we wanted to export the graph as JSON, another commonly used format for representing structured data? We would need to further modify the nodes in our graph, further exposing existing users to potentially buggy behaviour, or even requiring further changes in clients to support the new change.

The Visitor pattern helps us extend our graph to give it XML export behaviour, without modifying it. It lets us adhere to the Open/closed principle (the “O” in the SOLID principles of software design).

I like to think about the Visitor design pattern as the Composite pattern, but from the outside of the object structure instead of from the inside.

A nice added benefit of not coupling your extension to the entire object structure is that you can use the Visitor pattern when some action makes sense for only some objects in the larger structure, but not all.

We’ll go over the basic structure of the Visitor pattern and then look at a real-world example.

Implementing the visitor design pattern

There are 5 pieces involved in implementing the visitor pattern:

  • Elements. These are the objects that make up the complex structure for which you want to accomplish some task. E.g., nodes in your expression tree, locations in our geographical graph, etc. Ideally, the nodes in the object structure are extensions or implementations of a common Element interface.
  • accept method. The Element interface must have a method to “accept” a visitor, and each subtype of Element must implement this method.
public interface Element {
  void accept(Visitor visitor);
}
public class SightseeingArea implements Element {
  // location-specific stuff...

  public void accept(Visitor visitor) {
    visitor.visit(this);
  }
} 

If we have default methods in Java, why can’t we fully implement the accept method in the Element interface itself? Why do we need to implement it in each concrete class?

  • A Visitor interface. The Visitor has abstract (unimplemented) methods to visit each possible type of node. That is, in the geographical graph example, the Visitor might look something like this:
public interface Visitor {
    void visit(SightseeingArea node);
    void visit(Museum node);
    void visit(Landmark node);
    // ... overloaded for all types of nodes 
}
  • Concrete visitor. Now you have the machinery in place to perform some arbitrary operation on all or some nodes in an object structure. In our running example, that “arbitrary” operation is to export the node to an XML string. We can write a concrete visitor class to do this:
public class XMLExportVisitor implements Visitor {
    public void visit(SightseeingArea node) {
        // export the SightseeingArea 
    }

    public void visit(Museum node) {
        // export the Museum 
    }

    public void visit(Landmark node) {
        // export the Landmark 
    }
}

What if, in a particular Visitor, I only care about visiting some types of nodes and not others? Currently, I would need to implement a bunch of “no-op” methods because I’m forced to implement them by the Visitor interface.

  • Client. With the above machinery in place, the client can kick off a visitor to perform some operation on the object structure.
Visitor visitor = new XMLExportVisitor();
for (Element current : this.locations) {
    current.accept(visitor);
}
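Putting the five pieces together, a complete version might look like the sketch below. The name fields, the parts list, the result() method, and the VisitorDemo class are assumptions added so the example can run; the XML layout is likewise illustrative.

```java
import java.util.List;

interface Visitor {
    void visit(SightseeingArea node);
    void visit(Museum node);
    void visit(Landmark node);
}

interface Element {
    void accept(Visitor visitor);
}

// Fields are hypothetical; the notes leave node details unspecified.
record Museum(String name) implements Element {
    public void accept(Visitor visitor) { visitor.visit(this); }
}

record Landmark(String name) implements Element {
    public void accept(Visitor visitor) { visitor.visit(this); }
}

record SightseeingArea(String name, List<Element> parts) implements Element {
    public void accept(Visitor visitor) { visitor.visit(this); }
}

class XMLExportVisitor implements Visitor {
    // The visitor accumulates its output across visits.
    private final StringBuilder xml = new StringBuilder();

    public void visit(SightseeingArea node) {
        xml.append("<sightseeingArea name=\"").append(node.name()).append("\">");
        for (Element part : node.parts()) {
            part.accept(this); // in this sketch, the visitor drives the recursion
        }
        xml.append("</sightseeingArea>");
    }

    public void visit(Museum node) {
        xml.append("<museum name=\"").append(node.name()).append("\"/>");
    }

    public void visit(Landmark node) {
        xml.append("<landmark name=\"").append(node.name()).append("\"/>");
    }

    public String result() { return xml.toString(); }
}

class VisitorDemo {
    // The client: kicks off one visitor over every top-level element.
    static String export(List<Element> locations) {
        XMLExportVisitor visitor = new XMLExportVisitor();
        for (Element current : locations) {
            current.accept(visitor);
        }
        return visitor.result();
    }

    public static void main(String[] args) {
        List<Element> locations = List.of(
            new SightseeingArea("Old Town", List.of(new Museum("City Museum"))),
            new Landmark("Clock Tower"));
        System.out.println(export(locations));
    }
}
```

Adding a JSON export under this design would mean writing one more Visitor implementation, with no changes to the Element classes.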

A real-world example

A few years ago, I was conducting some analysis on the source code of a number of Java projects written by students. I wrote code to read in the code of hundreds of projects and emit some data for analysis.

Using the Eclipse Java Development Tools (JDT) API, I parsed students’ code into Abstract Syntax Trees (ASTs), and then “visited” certain nodes of interest in these resulting trees.

Because I used the JDT API to create the object structure (the AST), I only had to write the Visiting code. See this file as an example of a visitor.

In summary, the code visits MethodDeclarations and MethodInvocations, i.e., all the places methods are defined or called in the codebase, because that’s what I was interested in for that particular analysis.

Some things to note in this example are:

  • The MethodASTVisitor has state of its own. It’s a compound object in its own right that persists some state across visits (i.e., it accumulates some data about when methods were defined and when they were invoked in test cases).
  • I am not overriding the visit method for all possible types of nodes. (A pretty large list.) That’s because, unlike the examples we’ve talked about thus far, the ASTVisitor I’m extending defines empty visit methods for all the different types of nodes available, and I only need to override the ones I want to use.
  • The visit methods return booleans; they are not void methods. An ASTNode can be broken down into further ASTNodes (much like a composite tree). The return value tells the traversal whether to “go further” with this visitor into the node’s children or to end the path at this node. If every visit method returns true, then the entire AST gets visited, which may be wasteful.
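The three points above can be illustrated with a toy tree. This is a sketch only: the classes below (Node, ClassDecl, MethodDecl, MethodCall, BaseVisitor, DeclaredMethodCollector) are hypothetical stand-ins and are not the JDT API, though they imitate the same ideas of a stateful visitor, overridable "do nothing" defaults, and boolean return values that control descent.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for AST nodes. These are NOT the JDT classes.
abstract class Node {
    final List<Node> children = new ArrayList<>();

    Node add(Node child) { children.add(child); return this; }

    // Each subclass calls the visit overload that matches its own type.
    abstract boolean dispatch(BaseVisitor visitor);

    final void accept(BaseVisitor visitor) {
        // Only descend into children when visit(...) returned true.
        if (dispatch(visitor)) {
            for (Node child : children) {
                child.accept(visitor);
            }
        }
    }
}

class ClassDecl extends Node {
    boolean dispatch(BaseVisitor v) { return v.visit(this); }
}

class MethodDecl extends Node {
    final String name;
    MethodDecl(String name) { this.name = name; }
    boolean dispatch(BaseVisitor v) { return v.visit(this); }
}

class MethodCall extends Node {
    final String name;
    MethodCall(String name) { this.name = name; }
    boolean dispatch(BaseVisitor v) { return v.visit(this); }
}

// Base visitor with "keep going" defaults, so subclasses override only what they need.
class BaseVisitor {
    boolean visit(ClassDecl node) { return true; }
    boolean visit(MethodDecl node) { return true; }
    boolean visit(MethodCall node) { return true; }
}

// Stateful visitor: records declared method names and stops at each declaration,
// so calls nested inside method bodies are never reached.
class DeclaredMethodCollector extends BaseVisitor {
    final List<String> declared = new ArrayList<>();
    int callsSeen = 0;

    @Override
    boolean visit(MethodDecl node) {
        declared.add(node.name);
        return false; // prune: do not descend into this method's children
    }

    @Override
    boolean visit(MethodCall node) {
        callsSeen++;
        return true;
    }
}
```

Because visit(MethodDecl) returns false, a tree whose calls all sit inside method declarations leaves callsSeen at zero; changing that return value to true would visit the calls as well.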

Visitor pattern using pattern matching

Note: Pattern matching for switch is a standard feature as of Java 21.

In my personal opinion, the Visitor pattern is one of the more clunky design patterns to implement in Java. This is mostly due to a lack of expressive language constructs.

Thankfully, pattern matching, a feature common in functional languages like OCaml, is now available in Java as well.

Pattern matching

In the Visitor example above, we’ve written a Visitor interface that contains several visit method overloads, one for each type of Element we might want to visit. Then in our XMLExportVisitor, we include implementations for each visit method. This is…pretty unwieldy.

However, with pattern matching, we can write a “visitor” much more concisely, as just a function instead of an interface and a class. All we need to ensure is that our visitor knows how to visit all possible types of Elements.

The code below is using pattern matching to match the current variable with the appropriate case, depending on its type.

public static void exportXML(List<Element> elements) {
    for (Element current : elements) {
        switch (current) {
            case SightseeingArea s -> System.out.println("Visiting sightseeing area");
            case Landmark l -> System.out.println("Visiting landmark");
            case Museum m -> System.out.println("Visiting museum");
            default -> throw new IllegalStateException("Element is something I didn't expect");
            // This code won't compile unless all possible types are accounted for
        }
    }
}

The compiler requires that a pattern-matching switch has exhaustive type coverage. That is, there shouldn’t be a possible value for which our visitor doesn’t know what to do. However, the compiler does not know all possible implementing subclasses of Element, so without a catch-all case it will err on the side of caution and refuse to compile this switch.

To satisfy the compiler, we’ve stuck a default at the end and thrown an exception. The default is kind of like the else of that switch. That is, we are saying “for anything that’s not a SightseeingArea, Landmark, or Museum, throw an error”.

What do you think of this strategy?

By using a default at the end, we would simply be taking that potential type error (i.e., we tried to export a thing for which no export logic is implemented), and moving it from compile time to run time. It’s always better to face type errors at compile time rather than run time, because facing them at run time involves doing nothing, crashing the program, or doing something unexpected.
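A small sketch of how this plays out, assuming Java 21 or later. ShoppingMall is a hypothetical type added after the switch was written, and the name fields and DefaultCaseDemo class are likewise assumptions for illustration.

```java
interface Element {}
record SightseeingArea(String name) implements Element {}
record Landmark(String name) implements Element {}
record Museum(String name) implements Element {}
// Hypothetical type added later; nothing forces anyone to update describe().
record ShoppingMall(String name) implements Element {}

class DefaultCaseDemo {
    static String describe(Element current) {
        return switch (current) {
            case SightseeingArea s -> "sightseeing area " + s.name();
            case Landmark l -> "landmark " + l.name();
            case Museum m -> "museum " + m.name();
            // The default keeps the compiler quiet, even though ShoppingMall is unhandled.
            default -> throw new IllegalStateException("Unexpected element: " + current);
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Museum("City Museum")));
        try {
            describe(new ShoppingMall("Galleria"));
        } catch (IllegalStateException e) {
            System.out.println("Failed at run time: " + e.getMessage());
        }
    }
}
```

The code compiles cleanly, and the missing case only shows up when a ShoppingMall actually reaches the switch.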

So, how do we achieve type coverage without using default?

Sealed types

We can achieve this using sealed types. The idea behind sealed types is simple. We can mark a class or interface as sealed if we want to limit which classes or interfaces can extend it.

In the code below, we are sealing the Element interface, saying that it can only be implemented by the named classes.

Remember that we want compiler hints about type errors. The sealed declaration tells the compiler that “these are the only things that will ever implement this interface”. Essentially, we are “helping the compiler help us”.

public sealed interface Element permits Landmark, SightseeingArea, Museum {
  ...
}

The compiler can now be satisfied without the default case in the switch, because it knows that we have achieved type coverage.
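Here is a minimal runnable sketch of the sealed version, assuming Java 21 or later. The name fields, the SealedExportDemo class, and the XML layout are assumptions; the notes elide the body of Element.

```java
import java.util.List;

sealed interface Element permits Landmark, SightseeingArea, Museum {}

// Records are implicitly final, so they are valid permitted subtypes.
record SightseeingArea(String name) implements Element {}
record Landmark(String name) implements Element {}
record Museum(String name) implements Element {}

class SealedExportDemo {
    static String exportXML(List<Element> elements) {
        StringBuilder xml = new StringBuilder();
        for (Element current : elements) {
            // No default branch: the compiler checks the cases against the permits list.
            String piece = switch (current) {
                case SightseeingArea s -> "<sightseeingArea name=\"" + s.name() + "\"/>";
                case Landmark l -> "<landmark name=\"" + l.name() + "\"/>";
                case Museum m -> "<museum name=\"" + m.name() + "\"/>";
            };
            xml.append(piece);
        }
        return xml.toString();
    }

    public static void main(String[] args) {
        System.out.println(exportXML(List.of(new Museum("City Museum"), new Landmark("Clock Tower"))));
    }
}
```

If a fourth type were added to the permits list without a matching case, this switch would stop compiling, which is the compile-time feedback we were after.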

This use of pattern matching is well-described in Visitor Pattern Considered Pointless — Use Pattern Switches Instead, though I disagree with the title.

I don’t think pattern matching renders the Visitor pattern “pointless”: it just changes what it looks like. As we talk about more design patterns, remember that they can exist both within and without Java-specific features like interfaces and abstract classes. Design patterns are ideas, templates to help solve or think about software problems, and they can exist without any particular programming language in mind.