[Bro-Dev] Broker data store use case and questions

Mon May 14 07:12:43 PDT 2018

On 5/11/18 1:38 PM, Azoff, Justin S wrote:
> 
>> On May 11, 2018, at 10:13 AM, Jon Siwek <jsiwek at corelight.com> wrote:
>>
>>
>> There's no check against the local cache to first see if the key exists
>> as going down that path leads to race conditions.
> 
> What sort of race conditions?

By "local cache", I mean the data store "clone" here.  And one race with 
checking for existence in the local clone could look like:

(1) master: delete an expired key, "foo", send notification to clones
(2) clone: check for existence of key "foo", find it still exists 
locally, and suppress further logic based on that
(3) clone: receive expiry notification for key "foo"

In that way, you can miss a (re)insertion that would have taken place 
had the query and insertion been done together, in sequence, directly 
on the master data store.
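
To make the sequence above concrete, here is a minimal Python sketch of 
the race.  It is a toy model, not Broker's API: Master, Clone, 
racy_observe, safe_observe and run_scenario are hypothetical names made 
up for illustration, and "in flight" notifications are modeled as a 
plain list that the clone drains later.

```python
class Master:
    """Toy stand-in for the master data store (hypothetical, not Broker's API)."""

    def __init__(self):
        self.data = {}
        self.in_flight = []  # notifications sent but not yet seen by the clone

    def put_unique(self, key, value):
        """Insert only if the key is absent; return True if it was inserted."""
        if key in self.data:
            return False
        self.data[key] = value
        self.in_flight.append(("insert", key, value))
        return True

    def expire(self, key):
        """Step (1): delete the key and queue a notification for the clone."""
        del self.data[key]
        self.in_flight.append(("expire", key, None))


class Clone:
    """Toy clone that only learns of changes when notifications arrive."""

    def __init__(self, master):
        self.master = master
        self.data = {}

    def receive_all(self):
        """Apply every pending notification from the master, in order."""
        for kind, key, value in self.master.in_flight:
            if kind == "insert":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.master.in_flight.clear()


def racy_observe(clone, key, value):
    """Step (2): consult the clone first and skip the master on a local hit."""
    if key in clone.data:
        return False  # suppressed based on possibly stale local state
    return clone.master.put_unique(key, value)


def safe_observe(clone, key, value):
    """Let the master do the existence check and the insertion together."""
    return clone.master.put_unique(key, value)


def run_scenario(observe):
    """Replay steps (1)-(3); return (inserted, key present on master)."""
    master = Master()
    clone = Clone(master)
    master.put_unique("foo", 1)
    clone.receive_all()                  # clone now holds "foo"
    master.expire("foo")                 # (1) expiry notification is in flight
    inserted = observe(clone, "foo", 2)  # (2) "foo" is observed again
    clone.receive_all()                  # (3) clone finally sees the expiry
    return inserted, "foo" in master.data
```

With racy_observe, the second observation of "foo" is suppressed and 
the key ends up absent from the master; with safe_observe, the master 
sees that the key is gone and re-inserts it.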

> Things are a bit better off now in that we can use a short lived cache, since the cache doesn't need to be the actual data store anymore like the old known hosts set was.

A short-lived cache, separate from the data store, still has problems 
like the above: there can be times when the local cache contains the 
key and the master store does not, so you may miss some (re)insertions.
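
As a rough illustration of that window, here is a small Python sketch 
with an explicit clock instead of real timers.  TtlCache, observe and 
run_timeline are hypothetical names, the master store is reduced to a 
plain set, and the 10-second TTL and the timestamps are arbitrary 
values picked for the example.

```python
class TtlCache:
    """Hypothetical local cache whose entries lapse ttl seconds after being added."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.expires_at = {}

    def add(self, key, now):
        self.expires_at[key] = now + self.ttl

    def contains(self, key, now):
        # A missing key behaves like an entry that lapsed long ago.
        return self.expires_at.get(key, float("-inf")) > now


def observe(master, cache, key, now):
    """Return True if the key was (re)inserted into the master store."""
    if cache.contains(key, now):
        return False  # local hit: the master is never consulted
    cache.add(key, now)
    if key in master:
        return False
    master.add(key)
    return True


def run_timeline():
    master = set()  # toy master store, assumed to expire keys on its own
    cache = TtlCache(ttl=10)
    first = observe(master, cache, "foo", now=0)   # inserted, cached until t=10
    master.discard("foo")                          # master expires "foo" at t=5
    missed = observe(master, cache, "foo", now=7)  # cache still holds "foo"
    later = observe(master, cache, "foo", now=11)  # cache entry has lapsed
    return first, missed, later
```

In this model the observation at t=7 is dropped even though the master 
no longer has the key, and only an observation after the cache entry 
lapses gets through.  A shorter TTL narrows that window but does not 
remove it.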

The main goal I had when re-writing these was correctness: I can't know 
what network they will run on, and so don't want to assume it will be ok 
to miss an event here or there because "typically those should be seen 
frequently enough that it will get picked up soon after the miss".

If we can optimize the scripts that ship w/ Bro while still maintaining 
correctness, that would be great, else I'd rather sites decide for 
themselves what trade-offs are acceptable and write their own scripts to 
optimize for those.

- Jon