[Bro-Dev] Broker raw throughput

Dominik Charousset dominik.charousset at haw-hamburg.de
Wed Mar 9 02:23:28 PST 2016


>> You could additionally try to tweak
>> caf#middleman.max_consecutive_reads, which configures how many
>> new_data_msg messages a broker receives from the backend in a single
>> shot. It makes sense to have the two separated, because one configures
>> fairness in the scheduling and the other fairness of connection
>> multiplexing.
> 
> Good to know about this tuning knob. I played with a few values, from 1
> to 1K, but could not find an improvement by tweaking this value alone.
> Have you already performed some measurements to find the optimal
> combination of parameters?

I don't think there is an optimal combination for all use cases. You are always trading fairness against throughput. The question is whether your application needs to stay responsive to multiple clients or whether your workload is some form of non-interactive batch processing.

Any default value is arbitrary at the end of the day. As long as messages are distributed more-or-less evenly among actors and no actor receives hundreds of messages between scheduling cycles, the parameters don't matter anyway.
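
To make the trade-off concrete, here is a toy model of connection multiplexing. It is not CAF code: service_order and its parameters are hypothetical names, and the round-robin loop is only an assumption about how a cap on consecutive reads behaves.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model (not CAF's implementation): visit connections round-robin and
// read at most `max_consecutive_reads` pending chunks per visit.
// Returns the connection index of every read, in the order it happens.
std::vector<std::size_t> service_order(std::vector<std::size_t> pending,
                                       std::size_t max_consecutive_reads) {
  std::vector<std::size_t> order;
  if (max_consecutive_reads == 0)
    return order; // a cap of zero would never make progress
  bool progress = true;
  while (progress) {
    progress = false;
    for (std::size_t conn = 0; conn < pending.size(); ++conn) {
      // Read up to the cap, then move on to the next connection.
      auto n = std::min(pending[conn], max_consecutive_reads);
      for (std::size_t i = 0; i < n; ++i)
        order.push_back(conn);
      pending[conn] -= n;
      if (n > 0)
        progress = true;
    }
  }
  return order;
}
```

With pending = {5, 1}, a cap of 1 serves the second connection on the second read, while a cap of 5 makes it wait until the first connection has been drained. A larger cap means fewer switches between connections, at the price of longer waits for everyone else.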

>> Tackling the "many small messages problem" isn't going to be easy. CAF
>> could try to wrap multiple messages from the network into a single
>> heap-allocated storage that is then shipped to an actor as a whole,
>> but this optimization would have a high complexity.
> 
> A common strategy to reduce high heap pressure involves custom
> allocators, and memory pools in particular. Assuming that a single actor
> produces a fixed number of message types (e.g., <= 10), one could create
> one memory pool for each message type. What do you think about such a
> strategy?

This is exactly what CAF does. A few years ago, this was absolutely necessary to get decent performance. Recently, however, standard heap allocators have gotten much better (at least on Linux). You can build CAF with --no-memory-management to see if it makes a difference on BSD.
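
For readers unfamiliar with the technique, a minimal sketch of a per-type pool follows. It is not CAF's actual implementation: type_pool and its member names are made up for illustration, and the sketch assumes single-threaded use and types without extended alignment requirements.

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Hypothetical per-type pool: keeps freed storage blocks of sizeof(T) on a
// free list and reuses them instead of returning them to the heap.
// Assumptions: single-threaded use, T is not over-aligned.
template <class T>
class type_pool {
public:
  type_pool() = default;
  type_pool(const type_pool&) = delete;
  type_pool& operator=(const type_pool&) = delete;

  ~type_pool() {
    // Release the cached blocks; live objects must be destroyed beforehand.
    for (void* p : free_)
      ::operator delete(p);
  }

  template <class... Ts>
  T* create(Ts&&... xs) {
    void* mem = nullptr;
    if (free_.empty()) {
      mem = ::operator new(sizeof(T)); // pool is empty: fall back to the heap
    } else {
      mem = free_.back();              // reuse a previously freed block
      free_.pop_back();
    }
    try {
      return new (mem) T(std::forward<Ts>(xs)...);
    } catch (...) {
      ::operator delete(mem);
      throw;
    }
  }

  void destroy(T* ptr) {
    ptr->~T();
    free_.push_back(ptr); // keep the storage around for the next create()
  }

  std::size_t cached() const { return free_.size(); }

private:
  std::vector<void*> free_;
};
```

One such pool per message type matches the strategy proposed above. Whether it beats a modern general-purpose allocator is something only a measurement on the target platform can answer.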

The optimization I meant is to not wrap each integer in its own message object, but rather to create one message that contains X integers, which the receiver then transparently interprets as X messages. But this requires some form of "output queue" or lookahead mechanism on the sending side.
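
A rough sketch of that idea, outside of CAF and with made-up names (int_batch, batching_sender, deliver), could look as follows. The fixed batch capacity and the explicit flush() are assumptions that stand in for the "output queue"; nothing like this exists in CAF today.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical batch: one heap-allocated buffer that carries many integers.
struct int_batch {
  std::vector<std::int32_t> values;
};

// Hypothetical sender-side output queue: collects values and ships them as a
// single batch once `capacity` values are pending or flush() is called.
class batching_sender {
public:
  using sink = std::function<void(int_batch)>;

  batching_sender(std::size_t capacity, sink out)
    : capacity_(capacity), out_(std::move(out)) {
    pending_.values.reserve(capacity_);
  }

  void send(std::int32_t x) {
    pending_.values.push_back(x);
    if (pending_.values.size() >= capacity_)
      flush();
  }

  // Ships whatever is pending, e.g. when the sender goes idle.
  void flush() {
    if (pending_.values.empty())
      return;
    int_batch full;
    full.values.swap(pending_.values);
    pending_.values.reserve(capacity_);
    out_(std::move(full));
  }

private:
  std::size_t capacity_;
  sink out_;
  int_batch pending_;
};

// Receiver side: unpack the batch so the handler still sees one call per
// integer, as if X individual messages had arrived.
template <class Handler>
void deliver(const int_batch& batch, Handler&& on_message) {
  for (auto x : batch.values)
    on_message(x);
}
```

The hard part is hidden in flush(): something has to decide when a partially filled batch goes out, otherwise latency suffers whenever the sender produces fewer than `capacity` values.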

    Dominik

