[Bro-Dev] 0MQ security considerations

Sat Jul 30 22:30:27 PDT 2011

On 7/30/2011 6:56 PM, Matthias Vallentin wrote:
> (I read Robin's follow up and have no objections with keeping the
> hardware thread model instead of a task-based worker thread model.
> My comments below are just for the record.)
>
>> [..] this brings up a more general problem: what if we use a library
>> that doesn't offer us the ability to do this?
> That's a big issue. Some research at the Berkeley ParLab tried to
> address this by providing a broker library; sort of a meta scheduler
> that seeks to make sure cross-library calls do not hurt concurrency.
> Alas, I cannot remember the name of the project but if you are
> interested I can find it out.
>

Sure, if it wouldn't be too much trouble :)  I'd love to see how they 
manage to do that.

> Cool, that's a nice diagram.

Thanks :)

> Having similar ones for the big parts of
> Bro is a valuable resource for developers. Some quick
> questions/comments:
>
>      - In the LogMgr description, what's a log stream object?

Oh, yeah.  That probably needs to go in there.

To answer the question: a Bro log stream is defined by three things -- a 
path (where do I write), a record type (what do I write), and a writer 
type (how do I write).  The stream object holds a little bit of state 
("am I pushing my logs through a remote serializer?"  "is this log 
stream enabled or disabled?"  etc).  Also, since the types used by the 
log writers differ somewhat from the types used by Bro internally, the 
stream holds some information used to help with that mapping.
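
As a rough illustration of that description (not the actual LogMgr code: 
every name below is hypothetical, and the type-mapping information is left 
out), a stream object might look something like:

```cpp
#include <cassert>
#include <string>

// Hypothetical names throughout; this is not Bro's actual logging code.
enum class WriterType { Ascii, Binary };

struct RecordType {
    std::string name;    // stand-in for a real record type description
};

struct LogStream {
    std::string path;    // where do I write
    RecordType  record;  // what do I write
    WriterType  writer;  // how do I write
    bool enabled;        // "is this log stream enabled or disabled?"
    bool remote;         // "am I pushing my logs through a remote serializer?"
};

// Assumption for this sketch: a disabled stream simply drops writes.
inline bool accepts_writes(const LogStream& s) {
    return s.enabled;
}
```

The three defining fields come first; the two flags are the "little bit 
of state" mentioned above.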

That said, stream objects could (and probably eventually should) be 
rolled into the LogEmissary type.

>      - I'm not sure if I understand the crossed-out CONTEXT BOUNDARY
>        text. Does this essentially mean that the queue interface can (in
>        the future) adapt based on how the distributed system is
>        configured? I.e., if reader and writer are part of the same
>        program, it is some sort of IN_PROC queue, and for IPC and
>        networking it will transparently serialize the messages?

Yeah, that was the intent.  Wasn't really sure quite how to illustrate 
that; any thoughts on a better way?
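
One way to picture the intent, as a sketch only: a single queue interface 
with one implementation per context.  The class names and the 
length-prefixed wire format are assumptions made up for this example, and 
neither class is thread-safe; the point is just that the caller's code 
does not change when the transport does.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical interface: callers put and get messages without knowing
// whether the other end lives in the same process.
class MessageQueue {
public:
    virtual ~MessageQueue() = default;
    virtual void put(const std::string& msg) = 0;
    virtual bool get(std::string& out) = 0;  // false if nothing is queued
};

// Same-process case: messages are handed over as-is.
class InProcQueue : public MessageQueue {
public:
    void put(const std::string& msg) override { items_.push_back(msg); }
    bool get(std::string& out) override {
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop_front();
        return true;
    }
private:
    std::deque<std::string> items_;
};

// IPC / network case: messages are serialized as "<length>:<bytes>" into
// a buffer that stands in for a pipe or socket.
class SerializingQueue : public MessageQueue {
public:
    void put(const std::string& msg) override {
        wire_ += std::to_string(msg.size()) + ":" + msg;
    }
    bool get(std::string& out) override {
        std::size_t colon = wire_.find(':');
        if (colon == std::string::npos) return false;
        std::size_t len = std::stoul(wire_.substr(0, colon));
        out = wire_.substr(colon + 1, len);
        wire_.erase(0, colon + 1 + len);
        return true;
    }
private:
    std::string wire_;
};
```

Code written against MessageQueue behaves the same with either 
implementation, which is the "transparent" part of the question above.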

>      - What is the Pull FIFO for? Does the LogEmissary receive
>        feedback from the LogWriter?

Yes.  Currently, that feedback is limited to error messages, but there 
are other messages planned as well.

>      - Just out of curiosity, what type of thread-safe queue is it?
>        Single-writer/single-reader? Single-writer/multiple-reader? There
>        are zillions of thread-safe queue implementations out there, each
>        optimized for some different use case!

I don't know if I'd really call this one "optimized" :)  The queue was 
thrown together pretty quickly; it's largely targeted for single 
producer / single consumer, but I believe multiple producer / single 
consumer should work as well.

Trying to use multiple consumers with this queue would likely result in 
some kind of universe-ending quantum event involving the LHC, the 
deflector dish on the USS Enterprise, and a relatively cute kitten with 
gray fur and black tiger stripes.

Alternatively, the program could just crash and / or deadlock, but 
that's not nearly as much fun to contemplate.

Regardless, verifying the above and documenting the properties of said 
queue would probably be an excellent idea.  Thanks :)
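
For reference, the general shape of a mutex-and-condition-variable queue 
is sketched below.  This is not the queue in the tree: it assumes a 
C++11 standard library (std::mutex, std::condition_variable), and the 
names are invented for the example.  It only demonstrates the multiple 
producer / single consumer pattern and makes no claim about how the real 
queue behaves.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical blocking queue: put() never blocks, get() waits for data.
template <typename T>
class BlockingQueue {
public:
    void put(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();  // wake the consumer, if it is waiting
    }

    T get() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate form of wait() guards against spurious wakeups.
        ready_.wait(lock, [this] { return !items_.empty(); });
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
};

// Usage: two producers feed one consumer; returns the sum the consumer saw.
inline long sum_from_two_producers(int per_producer) {
    BlockingQueue<int> q;
    auto produce = [&q, per_producer] {
        for (int i = 1; i <= per_producer; ++i) q.put(i);
    };
    std::thread a(produce), b(produce);
    long sum = 0;
    for (int i = 0; i < 2 * per_producer; ++i) sum += q.get();
    a.join();
    b.join();
    return sum;
}
```

Every access to the underlying std::queue happens under the same mutex, 
which is what makes concurrent producers safe in this sketch.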

> While we're at it, is the
>        LogEmissary to LogWriter multiplicity 1:1 or 1:n?
>

1:1; it wouldn't be difficult to make it otherwise, but I haven't been 
able to think of a good reason to do so.

>      - About BasicThread: it seems there is one channel for both control
>        and data messages, i.e., the queue can include a special terminate
>        event to shutdown the BasicThread. Are these control messages
>        filtered out in the base class (BasicThread) code before the
>        message is passed to the child (E.g. LogWriter) code?
>

More or less.  An additional piece of the message (not shown on the 
diagram; I'm not quite sure how to illustrate this yet) involves a 
reference to the LogWriter which is used to process the message.  For 
example:

class FlushMessage : public EventMessage {
public:
     FlushMessage(LogWriter &ref) : _ref(ref) { }
     // Forward the flush request to the writer this message refers to.
     void process() { _ref.flush(); }
private:
     LogWriter &_ref;
};

For the purposes of serialization, the reference can be expressed as a 
<LogWriter type, LogWriter path> pair.

Then, when BasicThread pulls the event off of the queue and calls 
process(), the appropriate LogWriter function is called.

For a terminate message, then:

class TerminateThread : public MessageEvent
{
public:
     TerminateThread(ThreadInterface &ref)
     : ref(ref) { }

     bool process()
         {
         ref.stop();
         return true;
         }

private:
     ThreadInterface &ref;
};

where BasicThread inherits from ThreadInterface, and thus the 
BasicThread is stopped when the event is processed.
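
Putting the two message types together, the dispatch loop can be 
sketched roughly as follows.  To stay self-contained this runs on the 
calling thread with a plain std::queue rather than a real thread and a 
thread-safe queue, CountingWriter is a made-up stand-in for a LogWriter, 
and post() / run() are hypothetical names:

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <utility>

class MessageEvent {
public:
    virtual ~MessageEvent() = default;
    virtual bool process() = 0;
};

class ThreadInterface {
public:
    virtual ~ThreadInterface() = default;
    virtual void stop() = 0;
};

// Stand-in for a LogWriter; counts flushes so the effect is observable.
class CountingWriter {
public:
    void flush() { ++flushes; }
    int flushes = 0;
};

class FlushMessage : public MessageEvent {
public:
    explicit FlushMessage(CountingWriter& ref) : ref_(ref) {}
    bool process() override { ref_.flush(); return true; }
private:
    CountingWriter& ref_;
};

class TerminateThread : public MessageEvent {
public:
    explicit TerminateThread(ThreadInterface& ref) : ref_(ref) {}
    bool process() override { ref_.stop(); return true; }
private:
    ThreadInterface& ref_;
};

class BasicThread : public ThreadInterface {
public:
    void post(std::unique_ptr<MessageEvent> msg) { inbox_.push(std::move(msg)); }
    void stop() override { running_ = false; }

    // Processes messages until the inbox is empty or a message stops the
    // loop; returns how many messages were handled.
    int run() {
        int handled = 0;
        running_ = true;
        while (running_ && !inbox_.empty()) {
            std::unique_ptr<MessageEvent> msg = std::move(inbox_.front());
            inbox_.pop();
            msg->process();  // control and data messages look the same here
            ++handled;
        }
        return handled;
    }

private:
    std::queue<std::unique_ptr<MessageEvent>> inbox_;
    bool running_ = false;
};
```

The loop never inspects the message type: a terminate message stops the 
thread purely through the reference it carries, so anything queued 
behind it is left unprocessed.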

I'm always open to better ideas :)

>> Feedback / corrections / "THAT'S A STUPID MODEL!" are always
>> appreciated.
> I like the architecture (+1 with your words ;-) and am looking forward
> to see it in action. What I am really interested in are profiling
> benchmarks (very easy with Google perftools) to see where the non-I/O
> bottlenecks occur.

Probably going to start working on that after a bit; there's a crossover 
cable hooked up between two development machines (thanks Robin!) and a 
few GB of traces ready to go, but I took a break to address testing and 
tool-related stuff.

If you'd happen to know of a good way to measure contention on a Linux 
system, I'd love to hear about it; I'm planning on writing a few stap 
scripts to help out here, but it'd save me a lot of time if there were 
something that existed already and seemed to work pretty well.

> Sorry if I have missed it, but do we use now 0mq as messaging
> middle-layer or is all message passing based on custom code?

All custom code.  It didn't seem to make sense to require 0mq as a 
dependency when the logging infrastructure was the only thing that would 
use it.

--Gilbert