[Bro] Patterns and Word Boundaries
Lloyd Brown
lloyd_brown at byu.edu
Thu Oct 22 08:05:18 PDT 2015
Hopefully this isn't too simple a question, but I'm just getting
started with Bro.
In the text pattern syntax for Bro [1], is there an easy way to define
word boundaries, similar to how some of the RegEx dialects use '\b',
'\<', '\>', etc.? [2]
I'm trying to match specific strings in a data stream, for example
the word "nmap". I've tried several approaches based on past RegEx
knowledge, but I'm having trouble coming up with a single pattern that
handles every case. An example Bro test script is attached; hopefully
it's clear.
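To make the expected behavior concrete, here is a minimal sketch in Python's `re` module (not Bro), a dialect that does support `\b`. It runs the same test strings as the attached script; the names `WORD_NMAP`, `TESTCASES`, and `matches` are only illustrative. It shows the result I'd like to reproduce with a single Bro pattern and says nothing about what Bro's pattern syntax actually accepts.

```python
import re

# Word-boundary pattern in Python's re dialect (NOT Bro pattern syntax).
WORD_NMAP = re.compile(r"\bnmap\b")

# Same strings as the test cases in the attached Bro script.
TESTCASES = [
    "nmap",
    "test nmap",
    "nmap test",
    "test nmap test",
    "unmapped_entries",   # the only one that should NOT match
    "test\tnmap",
    "nmap\ttest",
    "test\tnmap\ttest",
]


def matches(text):
    """Return True if 'nmap' occurs in text as a whole word."""
    return WORD_NMAP.search(text) is not None


for case in TESTCASES:
    verdict = "Matched" if matches(case) else "Did NOT match"
    print("Testcase: %r - %s" % (case, verdict))
```

Here `\b` covers the start of the string, the end of the string, and any whitespace character, so there is no need to anticipate which separators will appear.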
Fundamentally, is there a syntax reference for pattern matching, or does
it conform to a commonly known dialect (e.g., POSIX-style RegEx or PCRE
RegEx)?
[1] https://www.bro.org/sphinx/scripting/index.html#pattern
[2] http://www.regular-expressions.info/wordboundaries.html
--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
-------------- next part --------------
event bro_init() {
    local testcases = set(
        "nmap",              # Should match something
        "test nmap",         # Should match something
        "nmap test",         # Should match something
        "test nmap test",    # Should match something
        "unmapped_entries",  # Should NOT match any of the patterns
        "test\tnmap",        # Should match something
        "nmap\ttest",        # Should match something
        "test\tnmap\ttest"   # Should match something
    );

    local nmap_patterns = vector(
        / nmap /,          # Works, but what if it's non-space whitespace, e.g. '\t'?
        /^nmap /,
        / nmap$/,
        /^nmap$/,
        /\bnmap\b/,        # Doesn't seem to match word boundaries as expected
        /\<nmap\>/,        # Doesn't seem to match word boundaries as expected
        /[ \t]nmap$/,      # This works, but I have to anticipate which whitespace chars will be used
        /^nmap[ \t]/,      # This works, but I have to anticipate which whitespace chars will be used
        /[ \t]nmap[ \t]/   # This works, but I have to anticipate which whitespace chars will be used
        # I wanted to try this one, using negative lookbehind and negative lookahead, but it won't even compile:
        #/(?<!\s)nmap(?!\s)/  # Probably won't work; not sure if \s means what I think, and negative lookarounds are hard to get right...
    );

    # Try every pattern against every test case and report the result.
    for (testcase in testcases) {
        print fmt("Testcase: \"%s\"", testcase);
        for (pi in nmap_patterns) {
            # "pattern in string" tests whether the pattern matches anywhere in the string.
            if ( nmap_patterns[pi] in testcase ) {
                print fmt("    Pattern: %s - Matched", nmap_patterns[pi]);
            } else {
                print fmt("    Pattern: %s - Did NOT match", nmap_patterns[pi]);
            }
        }
    }
}