[Bro-Dev] case-insensitive patterns

Johanna Amann johanna at icir.org
Fri Jun 29 12:14:21 PDT 2018

On Fri, Jun 29, 2018 at 12:00:30PM -0700, Vern Paxson wrote:
> Once I wound up monkeying around with the internals of the pattern-matching
> code (to fix leaks, because Johanna [correctly] pushed back on adding the
> &/| operators for general use if they leaked, which an old ticket indicated
> they would) ... I thought what-the-heck, it's time for supporting
> case-insensitive patterns.

Thanks a lot for tracking down the memory leaks - I know that has been a pain.

> This turned out to be tricky to implement, as I gleaned from talking with
> Seth about an approach he had tried a while back but abandoned.  But I now
> have it working.

This is great - case-insensitive patterns have been something that I have
wanted for a long time.

>   You can achieve the same functionality for a subpattern enclosed in
>   parentheses by adding "+i" to the open parenthesis, optionally followed
>   by whitespace.  So for example "/foo|(+i bar)/" will match "BaR", but
>   not "FoO".

Hum. Is there a reason why we are coming up with our own syntax for this?
Other implementations already support this with a slightly different syntax.

To do the same in Perl, you would use "/foo|(?i:bar)/". It also supports
turning off case insensitivity for part of a pattern by doing
"/foo|(?-i:bar)/". Furthermore, you can switch it on for the rest of
the pattern by writing (?i) - everything after that is case-insensitive.
https://perldoc.perl.org/perlre.html#Extended-Patterns has more details.

Python supports the exact same syntax. To make things easier for users,
I think it would be much nicer if we simply did the same.
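
For illustration, here is a minimal sketch of the scoped-flag syntax using
Python's standard re module (Python 3.6 or later is assumed for the
(?i:...) and (?-i:...) groups). The helper name full_match and the test
strings are made up for this example; nothing here reflects how Bro would
implement the feature. One caveat: recent Python versions accept the bare
(?i) form only at the very start of a pattern, so the sketch places it
there.

```python
import re

# Only the "bar" alternative is case-insensitive, as in /foo|(?i:bar)/.
scoped = re.compile(r"foo|(?i:bar)")

# Case-insensitive for the whole pattern, switched off again for "bar".
# The global (?i) sits at the start of the pattern.
mixed = re.compile(r"(?i)foo|(?-i:bar)")


def full_match(pattern, text):
    """Hypothetical helper: True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


scoped_results = {s: full_match(scoped, s) for s in ("foo", "FoO", "bar", "BaR")}
mixed_results = {s: full_match(mixed, s) for s in ("foo", "FoO", "bar", "BaR")}

print(scoped_results)
print(mixed_results)
```

With the first pattern "BaR" matches and "FoO" does not, which is the same
behaviour Vern describes for "/foo|(+i bar)/".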

> The funky (+i ...) syntax isn't meant for general user consumption (though
> it's okay if a user wants to use it directly), but rather is how I implemented
> /pattern/i functionality.

And this is fine - but if we support it, I would actually prefer just
making it explicit and doing it like everyone else :)

