[Bro-Dev] New logging architecture

Fri Jul 1 16:38:40 PDT 2011

Hey folks:

I'm working on building threading into the logging framework in parallel 
with the work I'm doing on DataSeries.  I believe I have a plan, and was 
hoping to get some input from other folks on the list.

At the moment, a log message takes the following path:

*) It is generated deep within Bro, eventually finding its way to LogMgr::Write
*) LogMgr::Write then:
    > Checks that an appropriate LogMgr::Stream exists for the writer
    > Checks that any relevant LogWriter has been properly initialized
    > Applies any necessary filters
*) The log message is turned into a LogVal **
    > In the case of a remote filter, the LogVal ** is spirited away to
      serializations unknown
    > In the case of a local filter, the LogVal ** is passed along to
      the appropriate LogWriter::Write for processing.

So, to change this to support threading, I was planning to turn 
LogMgr::Stream into a self-contained object with two 0mq message-passing 
sockets attached:

*) One write-only (PUSH) 0mq socket created by the parent LogMgr, that 
could be used to send messages to the Stream object (for example, 
something like LOG_WRITE)
*) One read-only (PULL) 0mq socket created by the child Stream object, 
which would be used to receive the messages the LogMgr sent.

After the LogMgr::Stream was created, the PUSH 0mq socket would be the
only means of communicating with it (in order to avoid needing evil
things like semaphores / condition variables).  This means that the
LogMgr would need to generate messages and pass them to the
LogMgr::Stream object.

Thus, the LogWriter initialization bit and the LogWriter logging bit 
would both happen within the context of the Stream thread (and as a 
result of writing an appropriate message to the LogMgr::Stream's PUSH 
socket).
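
A minimal sketch of that arrangement, in Python for brevity rather than
Bro's C++.  It assumes a standard-library queue.Queue as a stand-in for
the PUSH/PULL 0mq socket pair (queue.Queue uses locks internally, so it
only illustrates the one-way message flow, not how 0mq does it).  The
names StreamThread, inbox and written are hypothetical:

```python
import queue
import threading


class StreamThread(threading.Thread):
    """Hypothetical stand-in for a threaded LogMgr::Stream."""

    def __init__(self):
        super().__init__(daemon=True)
        # Stand-in for the socket pair: the manager only ever put()s
        # (the PUSH side), the stream thread only ever get()s (PULL).
        self.inbox = queue.Queue()
        self.writer_ready = False
        self.written = []  # what a real LogWriter would put on disk

    def run(self):
        while True:
            kind, payload = self.inbox.get()
            if kind == "StreamInit":
                # LogWriter initialization happens in the stream thread.
                self.writer_ready = True
            elif kind == "LogMessage" and self.writer_ready:
                # So does the actual write.
                self.written.append(payload)
            elif kind == "StreamFinish":
                break


# The manager side: everything it does to the stream is a message.
stream = StreamThread()
stream.start()
stream.inbox.put(("StreamInit", None))
stream.inbox.put(("LogMessage", {"id": "conn", "fields": ["1.2.3.4", 80]}))
stream.inbox.put(("StreamFinish", None))
stream.join()
```

The manager never touches the writer state directly; it only reads
stream.written here after join(), once the thread has exited.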

I figure the LogMgr would need to be able to generate (at a minimum) the 
following types of messages:

*) EnableStream
*) DisableStream
*) StreamInit
*) StreamFinish
*) RotateLog
*) LogMessage
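
One way the receiving side might dispatch on these, sketched in Python.
The message names come from the list above; what each one does (for
example, that a disabled stream drops LogMessage, or that RotateLog
starts a fresh file) is an assumption made for illustration, as are the
names MsgType and StreamState:

```python
from enum import Enum, auto


class MsgType(Enum):
    ENABLE_STREAM = auto()
    DISABLE_STREAM = auto()
    STREAM_INIT = auto()
    STREAM_FINISH = auto()
    ROTATE_LOG = auto()
    LOG_MESSAGE = auto()


class StreamState:
    """Hypothetical per-stream state, driven only by messages."""

    def __init__(self):
        self.initialized = False
        self.enabled = False
        self.files = [[]]  # one list of records per pretend log file

    def handle(self, msg_type, payload=None):
        """Apply one message; return False once the stream should stop."""
        if msg_type is MsgType.STREAM_INIT:
            self.initialized = True
        elif msg_type is MsgType.ENABLE_STREAM:
            self.enabled = True
        elif msg_type is MsgType.DISABLE_STREAM:
            self.enabled = False
        elif msg_type is MsgType.ROTATE_LOG:
            self.files.append([])  # assumed: start a new output file
        elif msg_type is MsgType.LOG_MESSAGE:
            # Assumed: records are dropped unless initialized and enabled.
            if self.initialized and self.enabled:
                self.files[-1].append(payload)
        elif msg_type is MsgType.STREAM_FINISH:
            return False
        return True
```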

(Note: as a shortcut, we could probably build a fast-track 
LogMessageInProc type that passed a pointer to the data to log, rather 
than encapsulating everything when passing within a single process... 
but I figure that's an optimization, and could probably be dealt with 
later if it proves to be necessary).
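
To make the trade-off concrete, a small Python sketch.  It assumes JSON
as the encapsulation and queue.Queue as the transport, neither of which
is part of the proposal; the point is only that the encapsulated message
arrives as an independent copy, while the fast-track one arrives as the
very same object:

```python
import json
import queue

inbox = queue.Queue()
record = {"stream": "conn", "fields": ["10.0.0.1", 443]}

# LogMessage: encapsulate everything, so the receiver gets its own copy.
inbox.put(("LogMessage", json.dumps(record).encode("utf-8")))

# LogMessageInProc: hand over a reference only; no encode/decode cost.
inbox.put(("LogMessageInProc", record))

kind1, body1 = inbox.get()
kind2, body2 = inbox.get()

decoded = json.loads(body1.decode("utf-8"))  # equal to record, but a copy
shared = body2                               # the sender's object itself
```

The catch with the fast-track variant is ownership: once the pointer has
been sent, the sender must not modify or free the data.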

Anyway, I figure that, after this point, we'd be close to having an
entirely self-contained logging infrastructure.  So long as the message
format was standardized, anything that spoke that format could act as a
logger for Bro.  (I do have a rough draft of a message format, which I'd
be happy to send out, assuming the above isn't too confusing and seems
technically sound.)
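
The draft format itself isn't shown in this message, so the following is
NOT it.  It is a purely hypothetical framing (1-byte message type,
4-byte payload length in network byte order, then the payload), sketched
in Python only to show how little an external logger would need to
implement in order to speak a fixed format:

```python
import struct

# Hypothetical header: 1-byte message type, 4-byte payload length,
# network byte order, no padding.
HEADER = struct.Struct("!BI")


def encode(msg_type, payload):
    """Build one frame from a numeric type and a bytes payload."""
    return HEADER.pack(msg_type, len(payload)) + payload


def decode(frame):
    """Split one frame back into (msg_type, payload)."""
    msg_type, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, payload
```

For example, decode(encode(6, b"hello")) gives back (6, b"hello").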

So. . . thoughts?  Does that make sense?  What's bad / broken about 
doing things this way?

Thanks,
Gilbert Clark