Pattern Matching

Introduction

Haxe 3 comes with improved options for pattern matching. Here we explore the syntax of the different patterns, using this data structure as a running example:

enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

Some pattern matcher basics include:

  • patterns will always be matched from top to bottom
  • the topmost pattern that matches the input value has its expression executed
  • a _ pattern matches anything, so case _: is equivalent to default:
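
The basics above can be sketched with two overlapping patterns. This is a minimal illustration, not part of the original page: the class Main and the function describe are hypothetical names. A Node with two Leaf children fits both of the first two cases, and the topmost one is selected.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Hypothetical helper: reports which case was selected.
    public static function describe(t:Tree<String>):String {
        return switch (t) {
            // tried first
            case Node(Leaf(_), _): "left leaf";
            // only reached if the case above did not match
            case Node(_, Leaf(_)): "right leaf";
            // equivalent to writing case _:
            default: "other";
        }
    }

    static function main() {
        // both Node cases fit this value, the topmost one wins
        trace(describe(Node(Leaf("a"), Leaf("b")))); // left leaf
        trace(describe(Node(Node(Leaf("a"), Leaf("b")), Leaf("c")))); // right leaf
        trace(describe(Leaf("a"))); // other
    }
}
```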

Enum matching

As in Haxe 2, enums can be matched by their constructors in a natural way. With Haxe 3 pattern matching, this match can now be "deep", meaning that patterns can be nested inside constructor arguments:

var myTree = Node(Leaf("foo"), Node(Leaf("bar"), Leaf("foobar")));
var match = switch(myTree) {
    // matches any Leaf
    case Leaf(_): "0";        
    // matches any Node that has r = Leaf
    case Node(_, Leaf(_)): "1";
    // matches any Node that has r = another Node, which has l = Leaf("bar")
    case Node(_, Node(Leaf("bar"), _)): "2";
    // matches anything
    case _: "3";
}
trace(match); // 2

The pattern matcher will check each case from top to bottom and pick the first one that matches the input value. If you are not too familiar with pattern matching, the following manual interpretation of each case rule might help:

  • case Leaf(_): matching fails because myTree is a Node
  • case Node(_, Leaf(_)): matching fails because the right sub-tree of myTree is not a Leaf, but another Node
  • case Node(_, Node(Leaf("bar"), _)): matching succeeds
  • case _: this is not checked here because the previous line matched
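
To see each of the four rules being selected, the same switch can be wrapped in a function and applied to different trees. This is only a sketch: the class Main and the function which are hypothetical names introduced for illustration.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Hypothetical wrapper around the switch from the example above.
    public static function which(t:Tree<String>):String {
        return switch (t) {
            case Leaf(_): "0";
            case Node(_, Leaf(_)): "1";
            case Node(_, Node(Leaf("bar"), _)): "2";
            case _: "3";
        }
    }

    static function main() {
        trace(which(Leaf("foo"))); // 0
        trace(which(Node(Leaf("foo"), Leaf("bar")))); // 1
        trace(which(Node(Leaf("foo"), Node(Leaf("bar"), Leaf("foobar"))))); // 2
        // the nested left leaf is "baz", so only the catch-all case matches
        trace(which(Node(Leaf("foo"), Node(Leaf("baz"), Leaf("foobar"))))); // 3
    }
}
```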

Variable capture

The value matched by any sub-pattern can be captured by putting an identifier in its place:

var myTree = Node(Leaf("foo"), Node(Leaf("bar"), Leaf("foobar")));
var name = switch(myTree) {
    case Leaf(s): s;
    case Node(Leaf(s), _): s;
    case _: "none";
}
trace(name); // foo

This returns one of the following:

  • if myTree is a Leaf, its value is returned
  • if myTree is a Node whose left sub-tree is a Leaf, the value of that Leaf is returned (this applies here, returning "foo")
  • otherwise "none" is returned

It is also possible to use = to capture values which are further matched:

var node = switch(myTree) {
    case Node(leafNode = Leaf("foo"), _): leafNode;
    case x: x;
}
trace(node); // Leaf(foo)

Here, leafNode is bound to Leaf("foo") if the input matches that. In all other cases, myTree itself is returned: case x works similarly to case _ in that it matches anything, but because it uses an identifier such as x, it also binds the matched value to that variable.
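
The two kinds of capture can be combined in a recursive function. The following is a minimal sketch; Main and leftmostLeaf are hypothetical names, and Type.enumEq from the standard library is used only to compare the result.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Hypothetical helper: returns the leftmost Leaf of a tree.
    public static function leftmostLeaf<T>(t:Tree<T>):Tree<T> {
        return switch (t) {
            // "=" binds the whole value, but only if it matches Leaf(_)
            case leaf = Leaf(_): leaf;
            // a plain identifier binds the left sub-tree, whatever it is
            case Node(l, _): leftmostLeaf(l);
        }
    }

    static function main() {
        var t = Node(Node(Leaf("a"), Leaf("b")), Leaf("c"));
        trace(Type.enumEq(leftmostLeaf(t), Leaf("a"))); // true
    }
}
```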

Structure matching

It is now also possible to match against the fields of anonymous structures and instances:

var myStructure = { name: "haxe", rating: "awesome" };
var value = switch(myStructure) {
    case { name: "haxe", rating: "poor" } : throw false;
    case { rating: "awesome", name: n } : n;
    case _: "no awesome language found";
}
trace(value); // haxe

Note that in the second case, we bind the matched name field to identifier n if rating matches "awesome". Of course you could also put this structure into the Tree from the previous example and combine structure and enum matching.

A limitation with regard to class instances is that you cannot match against fields of their parent class.
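
As suggested above, structure patterns can be nested inside enum patterns. This is a sketch under assumptions: the typedef Lang, the class Main and the function firstAwesome are hypothetical names introduced for illustration.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

// Hypothetical structure type, same fields as myStructure above.
typedef Lang = { name:String, rating:String };

class Main {
    // Hypothetical helper: finds an "awesome" language at the root or in the left leaf.
    public static function firstAwesome(t:Tree<Lang>):String {
        return switch (t) {
            // structure pattern nested inside an enum pattern
            case Leaf({ rating: "awesome", name: n }): n;
            case Node(Leaf({ rating: "awesome", name: n }), _): n;
            case _: "no awesome language found";
        }
    }

    static function main() {
        var t:Tree<Lang> = Node(
            Leaf({ name: "haxe", rating: "awesome" }),
            Leaf({ name: "other", rating: "poor" })
        );
        trace(firstAwesome(t)); // haxe
        trace(firstAwesome(Leaf({ name: "other", rating: "poor" }))); // no awesome language found
    }
}
```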

Array matching

Arrays can be matched on fixed length:

var myArray = [1, 6];
var match = switch(myArray) {
    case [2, _]: "0";
    case [_, 6]: "1";
    case []: "2";
    case [_, _, _]: "3";
    case _: "4";
}
trace(match); // 1

This will trace 1: the first case fails because myArray[0] is 1 rather than 2, while the second case succeeds because myArray[1] matches 6 and myArray[0] is allowed to be anything.
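
Array patterns can also capture their elements. The following sketch selects a case by length and binds the elements to identifiers; Main and describe are hypothetical names.

```haxe
class Main {
    // Hypothetical helper: each pattern only matches arrays of exactly its own length.
    public static function describe(a:Array<Int>):String {
        return switch (a) {
            case []: "empty";
            case [x]: "one element: " + x;
            case [x, y]: "pair with sum " + (x + y);
            case _: "three or more elements";
        }
    }

    static function main() {
        trace(describe([])); // empty
        trace(describe([5])); // one element: 5
        trace(describe([1, 6])); // pair with sum 7
        trace(describe([1, 2, 3])); // three or more elements
    }
}
```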

Or patterns

The | operator can be used anywhere within patterns to describe multiple accepted patterns:

var match = switch(7) {
    case 4 | 1: "0";
    case 6 | 7: "1";
    case _: "2";
}
trace(match); // 1

If a variable is captured in an or-pattern, it must appear in every one of its sub-patterns.
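
A short sketch of an or-pattern with a captured variable, assuming the Tree enum from the introduction; Main and name are hypothetical names. The identifier s appears on both sides of |, so it is bound whichever alternative matches.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Hypothetical helper: value of the tree itself or of its left child, if that is a Leaf.
    public static function name(t:Tree<String>):String {
        return switch (t) {
            // s is captured in both sub-patterns
            case Leaf(s) | Node(Leaf(s), _): s;
            case _: "none";
        }
    }

    static function main() {
        trace(name(Leaf("foo"))); // foo
        trace(name(Node(Leaf("bar"), Leaf("baz")))); // bar
        trace(name(Node(Node(Leaf("a"), Leaf("b")), Leaf("c")))); // none
    }
}
```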

Guards

It is also possible to further restrict patterns with the case ... if(condition): syntax:

var myArray = [7, 6];
var s = switch(myArray) {
    case [a, b] if (b > a):
        b + ">" + a;
    case [a, b]:
        b + "<=" + a;
    case _: "found something else";
}
trace(s); // 6<=7

Note how the first case has an additional guard condition if (b > a). It will only be selected if that condition holds, otherwise matching continues with the next case.
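
Guards work with any pattern, not only arrays. As a sketch, here is a guard on a captured enum argument; Main and classify are hypothetical names.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Hypothetical helper: the guard sees the variable captured by its pattern.
    public static function classify(t:Tree<Int>):String {
        return switch (t) {
            case Leaf(n) if (n < 0): "negative leaf";
            // reached by every Leaf whose guard above did not hold
            case Leaf(_): "non-negative leaf";
            case Node(_, _): "node";
        }
    }

    static function main() {
        trace(classify(Leaf(-3))); // negative leaf
        trace(classify(Leaf(4))); // non-negative leaf
        trace(classify(Node(Leaf(1), Leaf(2)))); // node
    }
}
```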

Match on multiple values

Array syntax can also be used to match on multiple values:

var s = switch [1, false, "foo"] {
    case [1, false, "bar"]: "0";
    case [_, true, _]: "1";
    case [_, false, _]: "2";
}
trace(s); // 2

This is quite similar to usual array matching, but there are differences:

  • the number of elements is fixed, so patterns of different array length will not be accepted
  • it is not possible to capture the switch value in a variable, i.e. case x is not allowed (case _ still is)
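
A common use of matching on multiple values is to branch on several computed results at once. The sketch below does this for two remainders; Main and fizzBuzz are hypothetical names.

```haxe
class Main {
    // Hypothetical helper: matches on two remainders at the same time.
    public static function fizzBuzz(i:Int):String {
        return switch [i % 3, i % 5] {
            case [0, 0]: "FizzBuzz";
            case [0, _]: "Fizz";
            case [_, 0]: "Buzz";
            // case _ is allowed here, a capturing case x would not be
            case _: Std.string(i);
        }
    }

    static function main() {
        trace(fizzBuzz(15)); // FizzBuzz
        trace(fizzBuzz(9)); // Fizz
        trace(fizzBuzz(10)); // Buzz
        trace(fizzBuzz(7)); // 7
    }
}
```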

Exhaustiveness checks

The compiler ensures that you do not forget a possible case for non value-only switches:

switch(true) {
    case false:
} // This match is not exhaustive, these patterns are not matched: true
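
For contrast, here is a sketch of switches that cover every possible case and therefore need no default. Main, toInt and isLeaf are hypothetical names; removing any one case from either switch should produce an error like the one quoted above.

```haxe
enum Tree<T> {
    Leaf(v:T);
    Node(l:Tree<T>, r:Tree<T>);
}

class Main {
    // Both Bool values are covered, so no default is required.
    public static function toInt(b:Bool):Int {
        return switch (b) {
            case true: 1;
            case false: 0;
        }
    }

    // Both Tree constructors are covered, so no default is required.
    public static function isLeaf<T>(t:Tree<T>):Bool {
        return switch (t) {
            case Leaf(_): true;
            case Node(_, _): false;
        }
    }

    static function main() {
        trace(toInt(true)); // 1
        trace(isLeaf(Leaf("foo"))); // true
        trace(isLeaf(Node(Leaf("a"), Leaf("b")))); // false
    }
}
```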

Useless pattern checks

In a similar fashion, the compiler detects patterns which will never match the input value:

switch(Leaf("foo")) {
    case Leaf(_)
       | Leaf("foo"): // This pattern is unused
    case Node(l,r):
}
version #19874, modified 2014-01-04 14:03:14 by AndyLi