[Bro-Dev] Pattern matching for the Bro language

Matthias Vallentin vallentin at icir.org
Wed Aug 19 08:43:11 PDT 2015


TL;DR:

    function f() : any;

    local result = "";
    local x = f();
    switch( x )
      {
      case addr:
        if ( x in 10.0.0.0/8 )
          result = "got it!";
      case string:
        result = "f() failed: " + x;
      }

I want to propose introducing pattern matching for the Bro language.
Pattern matching is a powerful concept found mostly in functional
languages such as Haskell, ML, Erlang, and Rust. It enables type-safe
dispatching based on the type of a value. Many of these languages go
beyond type-based dispatching and also support dispatching on the value
itself. We *kinda* have this with the when statement in an asynchronous
form: it monitors a given expression, and whenever one of its operands
changes, the expression is re-evaluated.

But, let's get back to type-based dispatch and "any". The "any" type is
really just a bolt-on fix for the lack of a more sophisticated type
system. We use (and abuse) it wherever we need polymorphism and
want to bypass the type system. Today, Bro doesn't have generic
programming facilities besides "any". I hope this will change in the
future; introducing pattern matching is the first step in this
direction.

I believe that Bro will see more and more asynchronous operations in
the future, in particular with the proliferation of Broker. This
requires better language support, for example when users store data
remotely and need to wait for an answer. Asynchrony often introduces sum
types: either the result comes back or an error occurs. The above
example returns such a sum type: either an addr or a string. If "x" has
neither type, Bro would raise an error, but only at runtime. Here's
another example:

    function lookup(key: string) : any;

    when ( local x = lookup("key") )
      {
      local result = "";
      switch( x )
        {
        case addr:
          if ( x in 10.0.0.0/8 )
            result = "contained";
        case string:
          result = "error: lookup() failed: " + x;
        }
      }

When we ask a store for data, the runtime doesn't know the type until it
gets a result back. Because there can be multiple return types, "switch"
provides a means to extract the value in a type-safe manner.
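
For comparison, here is a minimal sketch of the same addr-or-string
dispatch in Rust, one of the languages mentioned above. It is not Bro
code and not part of the proposal. LookupResult, in_ten_slash_eight(),
and describe() are hypothetical names made up for illustration. Unlike
"any", the enum lists every possible alternative, so the compiler
rejects a match that forgets one of them instead of failing at runtime.

```rust
use std::net::Ipv4Addr;

// Hypothetical stand-in for what lookup() hands back: either an address
// or an error message. Illustrative only; nothing here is Bro API.
enum LookupResult {
    Addr(Ipv4Addr),
    Error(String),
}

// True if `a` lies inside 10.0.0.0/8 (first octet is 10).
fn in_ten_slash_eight(a: Ipv4Addr) -> bool {
    a.octets()[0] == 10
}

// Mirrors the switch above: dispatch on the alternative that is present
// and bind its payload to a variable of the right type.
fn describe(x: LookupResult) -> String {
    let mut result = String::new();
    match x {
        LookupResult::Addr(a) => {
            if in_ten_slash_eight(a) {
                result = "contained".to_string();
            }
        }
        LookupResult::Error(msg) => {
            result = format!("error: lookup() failed: {}", msg);
        }
    }
    result
}
```

Calling describe() with LookupResult::Error("timeout".to_string())
takes the second arm, where msg is known to be a String.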

Some languages (Ruby comes to mind) design switch as an expression,
which would allow constructs like:

      local result = switch( x )
        {
        case T:
        case U:
        };

Personally, I like this functional treatment, but C-seasoned folks may
have a harder time with it.
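
To make the expression form concrete, here is a small Rust sketch,
since Rust's match is itself an expression. Again, this is only an
illustration of the idea, not proposed Bro syntax. Value and describe()
are hypothetical names. Every arm yields a value of the same type, and
that value is bound directly to result.

```rust
use std::net::Ipv4Addr;

// Hypothetical two-alternative value, standing in for an "any" that
// holds either an addr or a string.
enum Value {
    Addr(Ipv4Addr),
    Text(String),
}

fn describe(x: &Value) -> String {
    // The match as a whole evaluates to a String; no mutable
    // accumulator is needed.
    let result = match x {
        // Guard: only addresses in 10.0.0.0/8 take this arm.
        Value::Addr(a) if a.octets()[0] == 10 => "contained".to_string(),
        Value::Addr(_) => String::new(),
        Value::Text(msg) => format!("error: lookup() failed: {}", msg),
    };
    result
}
```

Because the match must produce a value, each arm is forced to say what
the result is, including the case of an address outside 10.0.0.0/8.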

If you have any thoughts on this, please chime in.

    Matthias

