[Bro] Bro's escaping of non-printable characters behaves unexpectedly

Johanna Amann johanna at icir.org
Tue Feb 17 18:12:22 PST 2015


Hello Paul,

I think the reason the ASCII writer of Bro's logging framework does not
support arbitrary binary data is that it was conceived as a framework for
writing human-readable log files, not arbitrary binary data.

If you want to write binary data to log files, I would recommend just
base64-encoding it beforehand, using the encode_base64 bif.
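
On the consuming side, getting the original bytes back is then a single
decode step. A minimal Python sketch, assuming the field was written with the
standard base64 alphabet; the helper name decode_log_field and the sample
value are made up for illustration:

```python
import base64


def decode_log_field(field: str) -> bytes:
    """Recover the original bytes from a base64-encoded log field.

    Hypothetical helper: assumes the standard base64 alphabet was used.
    """
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently skipping them.
    return base64.b64decode(field, validate=True)


# Stand-in for a value that was base64-encoded before being logged.
original = b"foo\x00bar\\0baz"
logged = base64.b64encode(original).decode("ascii")

# The NUL byte and the literal backslash-zero both survive the round trip.
assert decode_log_field(logged) == original
```

Because base64 output contains only printable ASCII, the ascii writer has
nothing left to escape in such a field.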

If you are OK with using the standard methods for writing to files
outside of the logging framework, you can put those files into binary
mode, as you are probably aware.

Johanna

On Tue, Feb 17, 2015 at 05:15:31PM -0800, Paul Pearce wrote:
> Hello,
> 
> That was a poor example, as it used \0, which is special-cased by
> Bro's escape functionality.
> 
> This problem also extends beyond non-printable characters to non-ASCII
> (Unicode) characters. Here's another example with the Unicode character
> for the registered sign ® (\xc2\xae in UTF-8).
> 
> ----
> $ bro -e 'event bro_init() { print "foo \xc2\xae bar \\xc2\\xae baz"; }'
> 
>  foo \xc2\xae bar \xc2\xae baz
> ----
> 
> If you decide to revisit Bro's escape functionality, I'd also point
> out that Bro special-casing NUL, DEL, and ord(char) <= 26 creates
> difficulty when decoding Bro output in other languages (e.g. Python).
> 
> Besides having different encodings based on the character range, the
> ^[A-Z] format causes the same ambiguous output issue as above, but with ^
> instead of \. Example:
> 
> ----
> $ bro -e 'event bro_init() { print "foo \16 bar ^N baz"; }'
> ----
> 
> If this is desired behavior, I might suggest a configuration option
> that allows ASCII log generation using a standard representation for
> non-ASCII/non-printable characters?
> 
> On Tue, Feb 17, 2015 at 4:20 PM, Paul Pearce <pearce at cs.berkeley.edu> wrote:
> > Hello everyone,
> >
> > I'm encountering a problem where I am unable to reconstruct original
> > inputs from bro log files. This example summarizes the problem:
> >
> > ----
> > $ bro -e 'event bro_init() { print "foo\x00bar\\0baz"; }'
> >
> > foo\0bar\0baz
> > ----
> >
> > This makes recovering the original input impossible, as you can't
> > differentiate between the escaped null and the ASCII characters '\'
> > and '0'.
> >
> > If Bro were going to implicitly escape the string, I would have
> > expected the following output:
> >
> > ----
> > $ bro -e 'event bro_init() { print "foo\x00bar\\0baz"; }'
> >
> > foo\0bar\\0baz
> > ----
> >
> > A workaround would be to output files in raw mode; however, I am
> > encountering this problem with logs generated via the logging
> > framework, which supports no such option (AFAIK).
> >
> > Another workaround would be to substitute '\' for '\\' in all such
> > outputs before handing them to the logging framework, but that
> > solution seems... sub par.
> >
> > My read here is that bro's auto-escaping functionality should be
> > changed to allow reconstruction of inputs in all cases.
> >
> > Thanks.
> > -Paul
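
The ambiguity described in this thread can be reproduced with a small Python
sketch. This is a simplified model, not Bro's actual escaping code: it
assumes a uniform \xNN notation for every non-printable byte, and the
function names are hypothetical. It shows that leaving the backslash
unescaped makes two different inputs produce the same output, while also
escaping the backslash makes the encoding reversible.

```python
def escape_lossy(data: bytes) -> str:
    """Escape non-printable bytes as \\xNN but leave backslashes untouched."""
    out = []
    for b in data:
        if 0x20 <= b < 0x7F:
            out.append(chr(b))
        else:
            out.append("\\x%02x" % b)
    return "".join(out)


def escape_reversible(data: bytes) -> str:
    """Same as escape_lossy, but a literal backslash is doubled."""
    out = []
    for b in data:
        if b == 0x5C:  # backslash
            out.append("\\\\")
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))
        else:
            out.append("\\x%02x" % b)
    return "".join(out)


def unescape(text: str) -> bytes:
    """Invert escape_reversible."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif i + 1 < len(text) and text[i + 1] == "\\":
            out.append(0x5C)
            i += 2
        elif i + 3 < len(text) and text[i + 1] == "x":
            out.append(int(text[i + 2:i + 4], 16))
            i += 4
        else:
            raise ValueError("malformed escape at offset %d" % i)
    return bytes(out)


raw = b"foo \xc2\xae bar"          # real UTF-8 bytes for the registered sign
literal = b"foo \\xc2\\xae bar"    # the ASCII characters backslash, x, c, 2, ...

# Without backslash escaping, both inputs collapse to the same text.
assert escape_lossy(raw) == escape_lossy(literal)

# With backslash escaping, the outputs differ and both round-trip.
assert escape_reversible(raw) != escape_reversible(literal)
assert unescape(escape_reversible(raw)) == raw
assert unescape(escape_reversible(literal)) == literal
```

The same reasoning applies to the ^[A-Z] notation: it is only unambiguous if
a literal ^ in the input is itself escaped.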
> 