[Bro-Dev] 0MQ security considerations
Gilbert Clark
gc355804 at ohio.edu
Thu Jul 28 10:50:42 PDT 2011
Hi:
Inline:
>> Originally, we were discussing using 0mq, which uses a message-based
>> architecture. This struck me as a very clean way to segment a program
>> into threads, and would logically extend rather well to cover other
>> things (e.g. IPC). As such, I borrowed that model.
> I like the message-passing model as well. How do you use the term
> "thread?" Do you mean a hardware thread (managed by the OS) or a
> virtual/logical thread (a user-space abstraction)? I am asking because,
> as a general rule, there should be a (close to) 1:1 ratio between
> available cores and number of user-level threads, mainly to avoid
> thrashing and increase cache performance. With I/O-bound applications
> this is of course less of an issue, but nonetheless a prudent software
> engineering practice in the manycore era.
>
Whichever model pthread happens to use :) I think the implementation
might be slightly platform-dependent.
That said, keep in mind that some libraries (e.g. DataSeries) actually
spawn additional worker threads. This makes it very difficult to place
a hard limit on the number of threads that exist within a bro process.
If this does turn out to be an issue, it may be time to look at
fork()'ing the loggers and / or using some kind of remote logging scheme.
>> Because there's a large degree of complexity involved with ensuring
>> any individual event can be processed on any thread, especially given
>> that log / flush / rotate messages have possibly complex ordering
>> dependencies to deal with, and further given that a log writer (or, in
>> bro's case, most of the logwriter-related events) should spend the
>> majority of its time blocking for IO, I don't necessarily agree that
>> logging stuff would be a good candidate for task-oriented execution.
> You bring up a good point, namely blocking I/O, which I haven't thought
> of. Just out of curiosity, could all blocking operations be replaced
> with their asynchronous counterparts?
For the ASCII and the DataSeries loggers, yes; the latter seems to
already do this (one of the worker threads is an output thread).
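For reference, a minimal sketch of that pattern — a dedicated output thread that absorbs the blocking writes, with producers only touching an in-memory queue — could look like the following. The names are illustrative, not Bro's or DataSeries' actual API, and the "write" is simulated in memory rather than hitting the disk:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Sketch: producers enqueue log records and return immediately; a
// single worker thread drains the queue and performs the (blocking)
// writes. Class and method names are hypothetical.
class OutputThread {
public:
    OutputThread() : worker_([this] { Run(); }) {}

    ~OutputThread() { Stop(); }

    // Called by any producer thread; never blocks on I/O.
    void Enqueue(std::string record) {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            queue_.push(std::move(record));
        }
        cv_.notify_one();
    }

    // Drain remaining records and join the worker.
    void Stop() {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            done_ = true;
        }
        cv_.notify_one();
        if (worker_.joinable())
            worker_.join();
    }

    // Stand-in for the log file, so the sketch is self-contained.
    std::vector<std::string> written;

private:
    void Run() {
        std::unique_lock<std::mutex> lk(mtx_);
        for (;;) {
            cv_.wait(lk, [this] { return done_ || !queue_.empty(); });
            while (!queue_.empty()) {
                std::string rec = std::move(queue_.front());
                queue_.pop();
                lk.unlock();             // do the "blocking write" unlocked
                written.push_back(rec);  // stand-in for a blocking fwrite()
                lk.lock();
            }
            if (done_)
                return;
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<std::string> queue_;
    bool done_ = false;
    std::thread worker_;  // declared last so everything else exists first
};
```

With a single consumer, records come out in enqueue order, which is part of why a per-writer output thread sidesteps some of the ordering concerns below.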
> I am just asking because I am using
> Boost Asio in a completely asynchronous fashion. This lends itself well
> to a task-based architecture with asynchronous "components," each of
> which has a task queue that accumulates non-interfering function calls.
> Imagine that each component has some dynamic number of threads,
> depending on the available number of cores.
Okay.
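To make sure I follow: a toy version of a "component" with a task queue drained by a core-count-sized pool of threads might look like the sketch below. All names are mine, and this is a rough approximation rather than how Boost Asio actually structures things; non-interfering tasks may run concurrently and complete in any order.

```cpp
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Toy "asynchronous component": a task queue drained by a pool of
// worker threads sized from the available core count.
class Component {
public:
    explicit Component(unsigned nthreads =
                           std::max(1u, std::thread::hardware_concurrency())) {
        for (unsigned i = 0; i < nthreads; ++i)
            workers_.emplace_back([this] { Run(); });
    }

    // Destructor drains any remaining tasks, then joins the pool.
    ~Component() {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            done_ = true;
        }
        cv_.notify_all();
        for (auto& t : workers_)
            t.join();
    }

    // Accumulate a (non-interfering) function call; returns immediately.
    void Post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lk(mtx_);
        for (;;) {
            cv_.wait(lk, [this] { return done_ || !tasks_.empty(); });
            if (tasks_.empty())
                return;  // done_ was set and the queue is drained
            auto task = std::move(tasks_.front());
            tasks_.pop();
            lk.unlock();
            task();      // run the task without holding the lock
            lk.lock();
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool done_ = false;
    std::vector<std::thread> workers_;
};
```

Usage would be one Component per log backend, with events posted into it as tasks:

```cpp
std::atomic<int> count{0};
{
    Component logger;  // hypothetical: one component per log backend
    for (int i = 0; i < 100; ++i)
        logger.Post([&count] { count.fetch_add(1); });
}  // destructor drains the queue and joins the workers
```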
> Let's assume there exists an asynchronous component for each log
> backend. (Sorry if I am misusing terminology, I'm not completely
> up-to-date regarding the logging architecture.)
I think I follow you.
> If the log/flush/rotate
> messages are encapsulated as a single task, then the ordering issues
> would go away, but you still get a "natural" scale-up at event
> granularity (assuming events can arrive out-of-order) by assigning more
> threads to a component.
I don't understand this bit. Say we have 10 things to log: {A, B, C,
... J}, and we batch up messages before we flush and rotate (for the
sake of argument):

Log+Flush+Rotate task = {A, B, C, D} ---> Thread 1
Log+Flush task        = {E, F, G, H} ---> Thread 2
Log+Flush+Finish task = {I, J}       ---> Thread 3
In this case, it seems possible that thread 3 could complete before
threads 1 and 2, unless we forced tasks to lock the log file upon
execution... but if we did that, I think the locking order would become
less predictable as the system's load increased, leading to oddly
placed chunks of log data and/or dropped messages if the file were
closed before a previous rotate message made it to the logger.
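For what it's worth, one conventional alternative to locking the file would be to tag each task with a sequence number at submission time and let a single writer hold back batches until all earlier ones have arrived (a small reorder buffer). A toy sketch, with hypothetical names and the file write simulated in memory:

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Sketch: worker threads deliver completed batches in any order; the
// sink writes a batch only once every earlier-numbered batch has been
// written, so the "file" stays in submission order.
class OrderedSink {
public:
    // Called from any worker thread, in any completion order.
    void Deliver(std::size_t seq, std::vector<std::string> batch) {
        std::lock_guard<std::mutex> lk(mtx_);
        pending_[seq] = std::move(batch);
        // Flush every batch now contiguous with what we've written.
        while (!pending_.empty() && pending_.begin()->first == next_) {
            for (auto& rec : pending_.begin()->second)
                file.push_back(rec);  // stand-in for the real write
            pending_.erase(pending_.begin());
            ++next_;
        }
    }

    std::vector<std::string> file;  // stand-in for the log file

private:
    std::mutex mtx_;
    std::map<std::size_t, std::vector<std::string>> pending_;
    std::size_t next_ = 0;
};
```

In the example above, even if the {I, J} task finishes first, its records are buffered until the {A..D} and {E..H} batches have been written; rotate/finish would be handled the same way, as the last action of the appropriately numbered batch.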
> Does that make sense? Maybe you use this sort of
> architecture already?!
Kind of. Let me throw together a diagram and send it out; it's
something that should probably end up in the documentation anyway :)
> My point is essentially that there is a
> difference between tasks and events that have different notions of
> concurrency.
>
I need to do some more reading before I'd be prepared to offer a
coherent argument here.
>> Re: task-oriented execution for bro in general: this seems like it is
>> already accomplished to a large degree by e.g. hardware that splits
>> packets by the connection they belong to and routes them to the
>> appropriate processing node in the bro cluster.
> Yeah, one can think about it this way. The only thing that gives me
> pause is that the term "task" has a very specific, local meaning in
> parallel computation lingo.
>
I don't claim to be a parallel computation expert, so this does not
surprise me :)
I was more attempting to illustrate the parallel between bro's existing
cluster architecture and a more traditional task-oriented model (as I
understand it) than I was trying to argue that packets should
necessarily be classifiable as "tasks" in the strictest sense of the word.
>> If we wanted to see aggregate performance gains, I guess we could
>> write ubro: a micro scripting language whose rules are entirely
>> resident in a piece of specialized hardware (CUDA?) and which
>> processes only certain types of packet streams (thus freeing the
>> general-purpose bro instances to handle other stuff).
> Nice thought, that's something for HILTI where available hardware is
> transparently used by the execution environment. For example, if a
> hardware regex matcher is available, the execution environment offloads
> the relevant instructions to the special card but would otherwise use
> its own implementation.
>
Ah, right then.
Thanks,
Gilbert