From robin at icir.org  Wed Nov  1 14:23:33 2017
From: robin at icir.org (Robin Sommer)
Date: Wed, 1 Nov 2017 14:23:33 -0700
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org>
Message-ID: <20171101212333.GA29502@icir.org>

On Tue, Oct 31, 2017 at 22:35 +0000, you wrote:

> My thought was they can conceptually still be used for the same type
> of stuff: data sharing and offloading other misc.
> analysis/calculation.

Yeah, agree that we want such nodes, however I would like to switch
away from the proxy name. "proxy" had a very specific meaning with the
old communication system and calling the new nodes the same would be
confusing I think.

> I'm worried I missed a previous discussion on what people expect the
> new cluster layout to look like, or maybe just no one has put forth a
> coherent plan/design for that yet?

Justin, correct me if I'm wrong, but I don't think this has ever been
fully fleshed out. If anybody wants to propose something specific, we
can discuss, otherwise I would suggest we stay with the minimum for
now that replicates the old system as much as possible and then expand
on that going forward.

> Yeah, could do that, but also don't really see the problem with
> exporting things individually. At least that way, the topic strings
> are guaranteed to be correct in the generated docs.

Yeah, that's true; I was mostly thinking from the perspective of
having a concise API in the export section. But either way seems fine.

> the broadcast. At least I don't think there's another way to send
> directed messages (e.g. based on node ID) in Bro's current API, maybe
> I missed it?

Ah, I misunderstood the purpose of these messages. If I remember
right, we can send direct messages at the C++ level and could expose
that to Bro; or we could have nodes subscribe to a topic that
corresponds to their node ID. But I'm not sure either would make it
much different, so never mind.

Robin

-- 
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin


From jazoff at illinois.edu  Wed Nov  1 16:11:11 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Wed, 1 Nov 2017 23:11:11 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <20171101212333.GA29502@icir.org>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org>
Message-ID: <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>

> On Nov 1, 2017, at 5:23 PM, Robin Sommer wrote:
>
> Justin, correct me if I'm wrong, but I don't think this has ever been
> fully fleshed out. If anybody wants to propose something specific, we
> can discuss, otherwise I would suggest we stay with the minimum for
> now that replicates the old system as much as possible and then expand
> on that going forward.

My design for a new cluster layout is multiple data nodes and multiple
logger nodes using the new RR and HRW pools Jon added.
It's not too much different from what we have now; just instead of
doing things like statically configuring that worker-1,3,5,7 connect
to proxy-1 and worker-2,4,6,8 connect to proxy-2, workers would
connect to all data nodes and loggers and use round robin/hashing for
distributing messages.

We have preliminary support for multiple loggers in broctl now, but it
uses the static configuration method, so if you are running two and
one process dies, half the workers have no functioning logger.

The node.cfgs would look something like:

## Multiple node cluster with redundant data/logger nodes

# manager - 1
[manager-1-logger]
host = manager1
type = logger

[manager-1-data]
host = manager1
type = data
lb_procs = 2

# manager - 2
[manager-2-logger]
host = manager2
type = logger

[manager-2-data]
host = manager2
type = data
lb_procs = 2

# worker 1
[worker-1]
host = worker1
type = worker
lb_procs = 16

...

# worker 4
[worker-4]
host = worker4
type = worker
lb_procs = 16

## 2 (or more) node cluster with no SPOF:

# node - 1
[node-1-logger]
host = node1
type = logger

[node-1-data]
host = node1
type = data
lb_procs = 2

[node-1-workers]
host = worker1
type = worker
lb_procs = 16

# node - 2
[node-2-logger]
host = node2
type = logger

[node-2-data]
host = node2
type = data
lb_procs = 2

[node-2-workers]
host = worker2
type = worker
lb_procs = 16

Replicating the old system initially sounds good to me, just as long
as that doesn't make it harder to expand things later. The logger
stuff should be the easier thing to change later, since scripts don't
deal with logger nodes directly and the distribution would be handled
in one place inside the logging framework. Multiple data nodes are a
little harder to add in later, since that requires script language
support and script changes for routing events across nodes.

I think for the most part the support for multiple data nodes comes
down to 2 functions being required:

- a bif/function for sending an event to a data node based on the hash
  of a key.
  - This looks doable now with the HRW code, it's just not wrapped in
    a single function.
- a bif/function for efficiently broadcasting an event to all other
  workers (or data nodes)
  - If the current node is a data node, just send it to all workers
  - otherwise, round robin the event to a data node and have it send
    it to all workers minus the current node.

If &synchronized is going away, script writers should be able to
broadcast an event to all workers by doing something like

    Cluster::Broadcast(Cluster::WORKERS, event Foo(42));

This would replace a ton of code that currently uses things like
worker2manager_events + manager2worker_events +
@if ( Cluster::local_node_type() == Cluster::MANAGER )

-- 
Justin Azoff


From seth at corelight.com  Thu Nov  2 06:59:49 2017
From: seth at corelight.com (Seth Hall)
Date: Thu, 02 Nov 2017 09:59:49 -0400
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <20171101212333.GA29502@icir.org>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org>
Message-ID: <0BAC98C3-B8B4-47FA-9B6D-628A99DB591D@corelight.com>

On 1 Nov 2017, at 17:23, Robin Sommer wrote:

> Yeah, agree that we want such nodes, however I would like to switch
> away from the proxy name. "proxy" had a very specific meaning with the
> old communication system and calling the new nodes the same would be
> confusing I think.

Agreed.
There has been so much confusion over the "proxy" name that it's best
to just get rid of it, especially considering that the *exact* tasks
those processes will be taking on will be slightly different.

> Justin, correct me if I'm wrong, but I don't think this has ever been
> fully fleshed out. If anybody wants to propose something specific, we
> can discuss, otherwise I would suggest we stay with the minimum for
> now that replicates the old system as much as possible and then expand
> on that going forward.

Agreed on this too. Some of these changes sound like they could take a
while to prototype and figure out how they would be effectively used.

  .Seth

-- 
Seth Hall * Corelight, Inc * www.corelight.com


From jsiwek at illinois.edu  Thu Nov  2 10:22:31 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Thu, 2 Nov 2017 17:22:31 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
Message-ID: 

> On Nov 1, 2017, at 6:11 PM, Azoff, Justin S wrote:
>
> - a bif/function for efficiently broadcasting an event to all other
>   workers (or data nodes)
>   - If the current node is a data node, just send it to all workers
>   - otherwise, round robin the event to a data node and have it send
>     it to all workers minus the current node.

In the case of broadcasting from a worker to all other workers, the
reason why you relay via another node is only because workers are not
connected to each other?
Do we know that a fully-connected cluster is a bad idea? i.e. why not
have a worker able to broadcast directly to all other workers, if
that's what is needed?

> If &synchronized is going away, script writers should be able to
> broadcast an event to all workers by doing something like
>
>     Cluster::Broadcast(Cluster::WORKERS, event Foo(42));
>
> This would replace a ton of code that currently uses things like
> worker2manager_events + manager2worker_events +
> @if ( Cluster::local_node_type() == Cluster::MANAGER )

The successor to &synchronized was primarily intended to be the new
data store stuff, so is there a way to map what you need onto that
functionality? Or can you elaborate on an example where you think this
new broadcast pattern is a better way to replace &synchronized than
using a data store?

- Jon


From jazoff at illinois.edu  Thu Nov  2 10:58:31 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Thu, 2 Nov 2017 17:58:31 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
Message-ID: 

> On Nov 2, 2017, at 1:22 PM, Siwek, Jon wrote:
>
>> On Nov 1, 2017, at 6:11 PM, Azoff, Justin S wrote:
>>
>> - a bif/function for efficiently broadcasting an event to all other
>>   workers (or data nodes)
>>   - If the current node is a data node, just send it to all workers
>>   - otherwise, round robin the event to a data node and have it send
>>     it to all workers minus the current node.
>
> In the case of broadcasting from a worker to all other workers, the
> reason why you relay via another node is only because workers are not
> connected to each other? Do we know that a fully-connected cluster is
> a bad idea? i.e. why not have a worker able to broadcast directly to
> all other workers, if that's what is needed?

Mostly so that workers don't end up spending all their time sending
out messages when they should be analyzing packets.

>> If &synchronized is going away, script writers should be able to
>> broadcast an event to all workers by doing something like
>>
>>     Cluster::Broadcast(Cluster::WORKERS, event Foo(42));
>>
>> This would replace a ton of code that currently uses things like
>> worker2manager_events + manager2worker_events +
>> @if ( Cluster::local_node_type() == Cluster::MANAGER )
>
> The successor to &synchronized was primarily intended to be the new
> data store stuff, so is there a way to map what you need onto that
> functionality? Or can you elaborate on an example where you think this
> new broadcast pattern is a better way to replace &synchronized than
> using a data store?
>
> - Jon

I think a shared data store would work for most of the use cases where
people are messing with worker2manager_events.

If all the cases of people using worker2manager_events +
manager2worker_events to mimic broadcast functionality are really just
doing so to update data, then it does make sense to just replace all
of that with a new data store.

How would something like policy/protocols/ssl/validate-certs.bro look
with intermediate_cache as a data store?

-- 
Justin Azoff


From asharma at lbl.gov  Thu Nov  2 11:37:46 2017
From: asharma at lbl.gov (Aashish Sharma)
Date: Thu, 2 Nov 2017 11:37:46 -0700
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
Message-ID: <20171102183744.GE33335@MacPro-2331.local>

My view:

I have again and again encountered three types of cases while doing
script/pkg work:

1) manager2worker: The input framework reads external data and all
   workers need to see it. Example: the intel framework.

2) worker2manager: Workers see something and report it to the manager;
   the manager keeps aggregated counts to make decisions. Example:
   scan detection.

3) worker2manager2all-workers: Workers see something, send it to the
   manager, and the manager distributes it to all workers. Example:
   tracking clicked URLs extracted from email.

Basically, Bro has two kinds of heuristic needs:

a) Cooked data analysis and correlations - cooked data is the data
   which ends up in logs, basically the entire 'protocol record'
   (example: c$http or c$smtp) - these are the majority. Cooked data
   processing functionality can also be interpreted (for simplicity)
   as: tail -f blah.log | ./python-script, but inside bro.

b) Raw or derived data - which you need to extract from traffic with a
   defined policy of your own (example: extracting URLs from email by
   tapping into the mime_data_all event, or extracting MAC addresses
   from router advertisement/solicitation events), or something which
   is not yet in an ::Info record, or a new 'thing' - these should be
   rare and few use cases over time.

So in short, give me reliable events which are simply tail -f log
functionality on a data/processing node. It will reduce the number of
synchronization needs by order(s) of magnitude.

For (b) - raw or derived data - we can keep the complexities of broker
stores and syncs etc., but I have hopes that refined raw data could
become its own log easily and be processed as cooked data.
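To make (a) concrete, here is roughly what I mean. Bro already raises
a per-record event when a log stream is written, e.g. HTTP::log_http;
what I am asking for is effectively that same event, delivered
reliably on a data/processing node for the whole cluster. (A sketch
only - the blacklist set here is hypothetical, e.g. something fed by
the input framework:)

    global blacklist: set[string];  # hypothetical, populated via the input framework

    # Today this runs on whichever node writes the log; I want the
    # cluster-wide equivalent of it running on a data node.
    event HTTP::log_http(rec: HTTP::Info)
        {
        if ( rec?$host && rec$host in blacklist )
            print fmt("blacklisted host contacted: %s", rec$host);
        }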
So a lot of data centrality issues related to the cluster can go away
with a data node which can handle a lot of the cooked-data-related
stuff for (1), (2), and in some cases (3).

Now, while Justin's multiple-data-nodes idea has spectacular merits, I
am not much of a fan of it. Reason being, having multiple data nodes
results in the same sets of problems - synchronization, latencies, a
mess of data2worker and worker2data events, etc.

I'd love to keep things rather simple: cooked data goes to one (or
more) data nodes (datastores). Just replicate for reliability rather
than pick and choose what goes where.

Just picking up some things:

>> In the case of broadcasting from a worker to all other workers, the
>> reason why you relay via another node is only because workers are
>> not connected to each other? Do we know that a fully-connected
>> cluster is a bad idea? i.e. why not have a worker able to broadcast
>> directly to all other workers, if that's what is needed?
>
> Mostly so that workers don't end up spending all their time sending
> out messages when they should be analyzing packets.

Yes. Also, I have seen this cause broadcast storms. That's why I have
always used the manager as a central judge on what goes. See, often
the same data is seen by all workers, so if the manager is smart, it
can just send the first instance to workers and all other workers can
stop announcing further.

Let me explain:

- I block a scanner on 3 connections.
- 3 workers see a connection each - they each report to the manager
- manager says "yep, scanner" and sends a note to all workers saying
  traffic from this IP is now uninteresting, stop reporting.
- let's say 50 workers
- total communication events = 53

If all workers send data to all workers, a scanner hitting 65,000
hosts will be a mess inside the cluster, esp. when scanners are
hitting in ms and not seconds.

Similar to this is another case. Let's say:

- I read 1 million blacklisted IPs from a file on the manager.
- manager sends 1 million X 50 events (to 50 workers)
- each worker needs to report if a blacklisted IP has touched the
  network
- now imagine we want to keep a count of how many unique local IPs
  each of these blacklisted IPs has touched - and at what rate, and
  when first and last contact were.
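A sketch of the state I have in mind for that last point (hypothetical
record type, mirroring the fields in the sample log below):

    type BLStat: record {
        first_seen:  time;
        last_seen:   time;
        hosts:       set[addr];         # unique local IPs contacted
        total_conns: count &default=0;
    };

    global bl_stats: table[addr] of BLStat;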
(btw, I have a working script for this - so whatever new broker does,
it needs to be able to give me this functionality)

Here is a sample log:

#fields ts  ipaddr  ls  days_seen  first_seen  last_seen  active_for  last_active  hosts  total_conns  source
1509606970.541130  185.87.185.45  Blacklist::ONGOING  3  1508782518.636892  1509462618.466469  07-20:55:00  01-16:05:52  20  24   TOR
1509606980.542115  46.166.162.53  Blacklist::ONGOING  3  1508472908.494320  1509165782.304233  08-00:27:54  05-02:33:18  7   9    TOR
1509607040.546524  77.161.34.157  Blacklist::ONGOING  3  1508750181.852639  1509481945.439893  08-11:16:04  01-10:44:55  7   9    TOR
1509607050.546742  45.79.167.181  Blacklist::ONGOING  4  1508440578.524377  1508902636.365934  05-08:20:58  08-03:40:14  66  818  TOR
1509607070.547143  192.36.27.7    Blacklist::ONGOING  6  1508545003.176139  1509498930.174750  11-00:58:47  01-06:02:20  30  33   TOR
1509607070.547143  79.137.80.94   Blacklist::ONGOING  6  1508606207.881810  1509423624.519253  09-11:03:37  02-02:57:26  15  16   TOR

Aashish
From aaron.eppert at packetsled.com  Thu Nov  2 12:34:24 2017
From: aaron.eppert at packetsled.com (Aaron Eppert)
Date: Thu, 2 Nov 2017 20:34:24 +0100
Subject: [Bro-Dev] File Analysis Inconsistencies
In-Reply-To: <7A9DB74F-11EC-41FD-8952-24F4843191E6@illinois.edu>
References: <42A432BE-2325-4305-885F-1B3197BD7EBF@illinois.edu> <7A9DB74F-11EC-41FD-8952-24F4843191E6@illinois.edu>
Message-ID: 

Justin,

Thank you. I peeled the egg off my face and updated the github code
accordingly. However, I have run into an additional interesting
tidbit: if I use the file_sniff event to attach an analyzer, or
Files::register_for_mime_types, neither will generate a files.log
entry when I am not running a PCAP from the command line. So any kind
of normal network processing, and/or playing a pcap over a listening
interface via tcpreplay, will cause the analyzer to fire properly.
However, if I attach the EXTRACT analyzer, all processing goes as
expected.

What nuance could I be missing here? The plugin effectively
initializes like the existing file analysis analyzers, save that it's
a plugin. Is there a hard and fast requirement that in order for the
file analysis framework to work properly the file has to be explicitly
extracted? For most additional analysis, I would assume one would not
use disk resources to extract files and would, instead, observe what's
needed and move on.

Any insight from anyone would be greatly appreciated.

Thank you,

Aaron

On October 13, 2017 at 11:32:46 AM, Azoff, Justin S (jazoff at illinois.edu) wrote:

> On Oct 13, 2017, at 11:01 AM, Aaron Eppert wrote:
>
> Justin,
>
> Indeed, cutting new territory is always interesting. As for the code,
>
> https://github.com/aeppert/test_file_analyzer
>
> File I am using for this case:
> https://www.bro.org/static/exchange-2013/faf-exercise.pcap
>
> `bro -C -r faf-exercise.pcap` after building and installing the plugin.
>
> My suspicion is it's either unbelievably trivial and I keep missing it
> because I am the only one staring at it, or it's a rather deep rabbit
> hole.
>
> Aaron

Thanks for putting that together.. now I see what you mean. Building
the plugin with ASAN confirms it is trying to access uninitialized
memory:

    $ /usr/local/bro/bin/bro -C -r faf-exercise.pcap
    TEST::Finalize total_len = 65960
    BUFFER 00 ea 09 00 50 61 00 00 80 eb 09 00 50 61 00 00
    =================================================================
    ==93650==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000bf5d08 at pc 0x00010b19d39b bp 0x7fff57829e10 sp 0x7fff57829e08
    READ of size 1 at 0x603000bf5d08 thread T0
        #0 0x10b19d39a in print_bytes(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned char const*, unsigned long, bool) TEST.cc:21
        #1 0x10b19e43b in file_analysis::TEST::Finalize() TEST.cc:87
    ...

The problem is this line:

    bufv->push_back(data);

That's only pushing the first char of the buffer onto the vector, not
the entire buffer. If you print out bufv->size() you'll see that it is
not what it should be.
If you apply this change it will run without crashing and, I believe,
give the expected output:

diff --git a/src/TEST.cc b/src/TEST.cc
index 8d78ef2..56d0a83 100644
--- a/src/TEST.cc
+++ b/src/TEST.cc
@@ -56,7 +56,7 @@ bool TEST::DeliverStream(const u_char* data, uint64 len)
 	}

 	if ( total_len < TEST_MAX_BUFFER) {
-		bufv->push_back(data);
+		print_bytes(std::cout, "BUFFER", data, len);
 		total_len += len;
 	}
@@ -84,7 +84,7 @@ void TEST::Finalize()
 	//auto pos = std::find(bufv->begin(), bufv->end(), (unsigned char *)"Exif");
 	//std::cout << "Offset = " << std::distance( bufv->begin(), pos ) << std::endl;
-	print_bytes(std::cout, "BUFFER", (const u_char *)&bufv[0], total_len);
+	//print_bytes(std::cout, "BUFFER", (const u_char *)&bufv[0], total_len);

 	val_list* vl = new val_list();
 	vl->append(GetFile()->GetVal()->Ref());

I don't know off the top of my head the right way to extend a c++
vector by a c buffer, but doing so should fix things.

-- 
Justin Azoff


From jazoff at illinois.edu  Thu Nov  2 12:35:41 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Thu, 2 Nov 2017 19:35:41 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <20171102183744.GE33335@MacPro-2331.local>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local>
Message-ID: 

> On Nov 2, 2017, at 2:37 PM, Aashish Sharma wrote:
>
> Now, while Justin's multiple-data-nodes idea has spectacular merits, I
> am not much of a fan of it. Reason being, having multiple data nodes
> results in the same sets of problems

It does not have the same problems.. It may have different problems
that I haven't thought of yet, but it doesn't have the same problems.

> synchronization,

What synchronization problems?

> latencies

Adding multiple data nodes will reduce the load on each node and lower
overall latencies.

> a mess of data2worker and worker2data events, etc.

You're projecting the current mess of worker2manager_events and
manager2worker_events onto what I am trying to replace them with.
Having worker2manager_events and

    @if ( Cluster::is_enabled() && Cluster::local_node_type() != Cluster::MANAGER )

all over the place exists because bro doesn't have higher level
methods for distributing data and events across the cluster. I am not
proposing replacing that with worker2datanode_events and

    @if ( Cluster::is_enabled() && Cluster::local_node_type() != Cluster::DATANODE )

I'm proposing getting rid of that sort of thing entirely. No
'@if cluster'. No 'redef worker2manager_events'. All gone.

> I'd love to keep things rather simple: cooked data goes to one (or
> more) data nodes (datastores). Just replicate for reliability rather
> than pick and choose what goes where.

Then clusters will just change from having one overloaded manager
process that is failing under the load to 2 data nodes that are both
failing. This is just renaming the current bottlenecks and is not a
solution.

I implemented a multi data node cluster back in March on top of
topic/mfischer/broker-integration. Porting my scan.bro from the
manager2worker_events stuff to sending events directly to one of N
data nodes was:

Remove:

    redef Cluster::worker2manager_events ...
    @if (Cluster ...
    event Scan::scan_attempt(scanner, attempt);

Add:

    local args = Broker::event_args(Scan::scan_attempt, scanner, attempt);
    Cluster::send_event_hashed(scanner, args);

Other than having that wrapped in a single function, it doesn't get
any easier than that.

-- 
Justin Azoff


From jsiwek at illinois.edu  Thu Nov  2 14:21:23 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Thu, 2 Nov 2017 21:21:23 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
Message-ID: 

> On Nov 2, 2017, at 12:58 PM, Azoff, Justin S wrote:
>
>> On Nov 2, 2017, at 1:22 PM, Siwek, Jon wrote:
>>
>> In the case of broadcasting from a worker to all other workers, the
>> reason why you relay via another node is only because workers are not
>> connected to each other? Do we know that a fully-connected cluster is
>> a bad idea? i.e. why not have a worker able to broadcast directly to
>> all other workers, if that's what is needed?
>
> Mostly so that workers don't end up spending all their time sending
> out messages when they should be analyzing packets.

Ok, I get what you want to avoid, though it could be interesting to
actually have a fully-connected cluster in order to collect
performance data on each comm. pattern and see how significant the
difference is for a variety of use-cases.

Do you have a particular example you can give where you'd use this
BIF/function to relay a broadcast from a worker to all other workers
via a proxy?

> How would something like policy/protocols/ssl/validate-certs.bro look
> with intermediate_cache as a data store?

    global intermediate_store: Cluster::StoreInfo;

    event bro_init()
        {
        intermediate_store = Cluster::create_store("ssl/validate-certs/intermediate_store");
        }

And then port the rest of that script to use the broker data store api
(get/put/exists calls) to access that store.

- Jon


From jsiwek at illinois.edu  Thu Nov  2 14:54:17 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Thu, 2 Nov 2017 21:54:17 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <20171102183744.GE33335@MacPro-2331.local>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local>
Message-ID: <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu>

> On Nov 2, 2017, at 1:37 PM, Aashish Sharma wrote:
>
>>> In the case of broadcasting from a worker to all other workers, the
>>> reason why you relay via another node is only because workers are
>>> not connected to each other? Do we know that a fully-connected
>>> cluster is a bad idea? i.e. why not have a worker able to broadcast
>>> directly to all other workers, if that's what is needed?
>>
>> Mostly so that workers don't end up spending all their time sending
>> out messages when they should be analyzing packets.
> Yes. Also, I have seen this cause broadcast storms. That's why I have
> always used the manager as a central judge on what goes. See, often
> the same data is seen by all workers, so if the manager is smart, it
> can just send the first instance to workers and all other workers can
> stop announcing further.
>
> Let me explain:
>
> - I block a scanner on 3 connections.
> - 3 workers see a connection each - they each report to the manager
> - manager says "yep, scanner" and sends a note to all workers saying
>   traffic from this IP is now uninteresting, stop reporting.
> - let's say 50 workers
> - total communication events = 53
>
> If all workers send data to all workers, a scanner hitting 65,000
> hosts will be a mess inside the cluster, esp. when scanners are
> hitting in ms and not seconds.

Thanks, though I'm not sure this scenario maps well to this particular
point. E.g. my impression is Justin wants a single BIF/function that
can send one event from a worker to a proxy and have the proxy purely
relay it to all other workers without doing anything else. So it's
solely taking the cost of sending N messages from a worker and
offloading that burden to a different node.

I think your example differs because there is actually an additional
task/logic being performed on the middleman that ends up reducing
comm/processing requirements. i.e. it's not just a pure relay of
messages.

Or maybe I'm just not understanding what anybody wants :)

- Jon


From jazoff at illinois.edu  Thu Nov  2 15:33:46 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Thu, 2 Nov 2017 22:33:46 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu>
Message-ID: <37F7F2E4-DDE0-46EA-BE4E-395A92ADB4F4@illinois.edu>

> On Nov 2, 2017, at 5:21 PM, Siwek, Jon wrote:
>
>> Mostly so that workers don't end up spending all their time sending
>> out messages when they should be analyzing packets.
>
> Ok, I get what you want to avoid, though it could be interesting to
> actually have a fully-connected cluster in order to collect
> performance data on each comm. pattern and see how significant the
> difference is for a variety of use-cases.
>
> Do you have a particular example you can give where you'd use this
> BIF/function to relay a broadcast from a worker to all other workers
> via a proxy?

Scripts like what validate-certs does to broadcast the presence of a
new intermediate cert to the other nodes in the cluster. That script
does have the added optimization on the manager side of only
broadcasting the value once if a new cert is seen by two workers at
the same time, so maybe it's not the best example for a broadcast.

With explicit event destinations and load balancing across data nodes,
that script could look something like

function add_to_cache(key: string, value: vector of opaque of x509)
    {
    # could still do @if ( Cluster::is_enabled() ), but in a standalone
    # setup we are the data node, so this would just short circuit to
    # raise the event locally. I'd rather broker do the @if internally
    # than have every script have to have two implementations.
    Broker::publish_hrw(Cluster::data_pool, key, SSL::new_intermediate, key, value);
    }

event SSL::new_intermediate(key: string, value: vector of opaque of x509)
    {
    if ( key in intermediate_cache )
        return;

    intermediate_cache[key] = value;

    # in a standalone setup this would just be a NOOP
    Broker::broadcast(Cluster::worker_pool, SSL::intermediate_add, key, value);
    }

event SSL::intermediate_add(key: string, value: vector of opaque of x509)
    {
    intermediate_cache[key] = value;
    }

Without the added optimization you'd just have

function add_to_cache(key: string, value: vector of opaque of x509)
    {
    intermediate_cache[key] = value;

    # in a standalone setup this would just be a NOOP
    Broker::broadcast(Cluster::worker_pool, SSL::intermediate_add, key, value);
    }

event SSL::intermediate_add(key: string, value: vector of opaque of x509)
    {
    intermediate_cache[key] = value;
    }

The optimization could be built into broker though, something like

    Broker::broadcast_magic_once_whatever(Cluster::worker_pool, key, SSL::intermediate_add, key, value);

That would hash the key, send it to a data node, then have the data
node broadcast the event while adding the key to a 'recently
broadcasted keys' table that only needs to buffer for 10s or so.

This would enable you to efficiently broadcast an event (once) across
all workers with a single line of code.

In either case, all of the

    manager2worker_events
    worker2manager_events
    @if ( Cluster::is_enabled() && Cluster::local_node_type() != Cluster::MANAGER )

is no longer needed.

I guess my whole point with all of this is that if the intent of a
script is that an event should be seen on all workers, the script
should look something like

    Broker::broadcast(Cluster::worker_pool, SSL::intermediate_add, key, value);

and not have a bunch of redefs and @ifs so that the script can
eventually have

    event SSL::new_intermediate(key, value);

>>> How would something like policy/protocols/ssl/validate-certs.bro
>>> look with intermediate_cache as a data store?
>>
>>     global intermediate_store: Cluster::StoreInfo;
>>
>>     event bro_init()
>>         {
>>         intermediate_store = Cluster::create_store("ssl/validate-certs/intermediate_store");
>>         }
>>
>> And then port the rest of that script to use the broker data store
>> api (get/put/exists calls) to access that store.
>>
>> - Jon

Does that have the same performance profile as the current method?

    if (issuer in intermediate_cache)

vs

    Broker::get(intermediate_cache, issuer)

-- 
Justin Azoff


From jazoff at illinois.edu  Thu Nov  2 15:58:56 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Thu, 2 Nov 2017 22:58:56 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu>
Message-ID: 

> On Nov 2, 2017, at 5:54 PM, Siwek, Jon wrote:
>
> Thanks, though I'm not sure this scenario maps well to this particular
> point. E.g. my impression is Justin wants a single BIF/function that
> can send one event from a worker to a proxy and have the proxy purely
> relay it to all other workers without doing anything else. So it's
> solely taking the cost of sending N messages from a worker and
> offloading that burden to a different node.
>
> I think your example differs because there is actually an additional
> task/logic being performed on the middleman that ends up reducing
> comm/processing requirements. i.e. it's not just a pure relay of
> messages.
>
> Or maybe I'm just not understanding what anybody wants :)
>
> - Jon

I think you're understanding it perfectly :-)

You're right that it's often not a pure relay, but it's often the same
messaging pattern of "send to all other workers, but deduplicate it
first to avoid sending the same message twice back to back".

For an example of a purely broadcast use case, see

    scripts/base/frameworks/intel/cluster.bro

You can see the crazy amount of complexity around the
Intel::cluster_new_item event.

Ultimately what I'm wanting is for the messaging patterns to be
implemented in something bro ships, so that script writers don't have
to implement message broadcasting and relaying themselves - and end up
doing it less efficiently than bro can do it internally. Requiring
script writers to write cluster and non-cluster versions peppered with
@if (Cluster means that bro is not abstracting things at the right
level anymore.

Maybe with broker data stores there won't be much use for things like
event broadcasting, but I feel like anything that does need to be
broadcasted should be using an explicit Broadcast function, and not
things like manager2worker_events.

-- 
Justin Azoff


From jsiwek at illinois.edu  Thu Nov  2 19:15:38 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Fri, 3 Nov 2017 02:15:38 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <37F7F2E4-DDE0-46EA-BE4E-395A92ADB4F4@illinois.edu>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <37F7F2E4-DDE0-46EA-BE4E-395A92ADB4F4@illinois.edu>
Message-ID: <6C470E4C-1035-420B-B92C-B9F225E10B26@illinois.edu>

> On Nov 2, 2017, at 5:33 PM, Azoff, Justin S wrote:
>
> The optimization could be built into broker though, something like
>
>     Broker::broadcast_magic_once_whatever(Cluster::worker_pool, key, SSL::intermediate_add, key, value);
>
> That would hash the key, send it to a data node, then have the data
> node broadcast the event

The first thing I'd suggest is a new internal broker message type that
can publish a message via a topic and then on any receiving nodes
re-publish via a second topic. In this way, we could distribute to all
workers via a single proxy/data node as:

    local one_proxy_topic = Cluster::rr_topic(Cluster::proxy_pool, "ssl/intermediate_add");
    local e = Broker::make_event(SSL::intermediate_add, key, value);
    Broker::relay(one_proxy_topic, Cluster::worker_topic, e);

Or potentially compressed into one call:

    Cluster::relay(first_topic, second_topic, rr_key, event, varargs...);

> while adding the key to a 'recently broadcasted keys' table that only
> needs to buffer for 10s or so.
>
> This would enable you to efficiently broadcast an event (once) across
> all workers with a single line of code.

I'm not sure about this part. Is that just for throttling potential
duplicates? Is that something you want generally, or just for this
particular example? I'm thinking maybe it can wait until we know we
have several places where it's actually needed/used before compressing
that pattern into a single BIF/function.
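(For what it's worth, the throttling piece on its own looks tiny in
script-land - a sketch of just the dedup logic, with hypothetical
names and a fixed 10s window:

    global recently_relayed: set[string] &create_expire=10secs;

    # Returns T only the first time a key is seen within the window,
    # so the caller can skip re-broadcasting duplicates.
    function should_relay(key: string): bool
        {
        if ( key in recently_relayed )
            return F;

        add recently_relayed[key];
        return T;
        }

So the question is mostly whether that belongs inside broker or in a
script-level helper.)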
> I guess my whole point with all of this is that if the intent of a
> script is that an event should be seen on all workers, the script
> should look something like
>
>     Broker::broadcast(Cluster::worker_pool, SSL::intermediate_add, key, value);
>
> and not have a bunch of redefs and @ifs so that the script can
> eventually have
>
>     event SSL::new_intermediate(key, value);

Yeah, minimizing the need for people to have to constantly implement
various code paths that depend on cluster/node-type is a good goal and
what I'm also aiming at.

>>> How would something like policy/protocols/ssl/validate-certs.bro
>>> look with intermediate_cache as a data store?
>>
>>     global intermediate_store: Cluster::StoreInfo;
>>
>>     event bro_init()
>>         {
>>         intermediate_store = Cluster::create_store("ssl/validate-certs/intermediate_store");
>>         }
>>
>> And then port the rest of that script to use the broker data store
>> api (get/put/exists calls) to access that store.
>>
>> - Jon
>
> Does that have the same performance profile as the current method?
>
>     if (issuer in intermediate_cache)
>
> vs
>
>     Broker::get(intermediate_cache, issuer)

Theoretically, they're both in-memory hash-table lookups, though the
implementations are obviously very different and I don't know how they
compare in reality.

I think it's true that many scripts could be ported to use either data
stores or just explicitly exchange events. Probably the preference to
use a data store would be for cases where it may be useful to keep
persistent data across crashes/restarts.

- Jon


From jan.grashoefer at gmail.com  Fri Nov  3 03:51:55 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Fri, 3 Nov 2017 11:51:55 +0100
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu>
Message-ID: <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>

On 02/11/17 23:58, Azoff, Justin S wrote:
> For an example of a purely broadcast use case, see
>
>     scripts/base/frameworks/intel/cluster.bro
>
> You can see the crazy amount of complexity around the
> Intel::cluster_new_item event.

That's right! It took me some time to figure out how data should be
distributed. So, I am following this thread trying to keep up with the
development. Right now I don't have new ideas to contribute, but as
the intel framework was mentioned multiple times as an example, I
thought I might sketch its communication behavior, so that we have a
more complete view of that use case.

Let's assume in the beginning there is only the manager. The manager
reads in the intel file and creates his "in-memory database", a quite
complex table (DataStore), as well as a data structure that contains
only the indicators for matching on the workers (MinDataStore).

Now, when a worker connects, he receives the current MinDataStore,
sent using send_id. (Side note: I am planning to replace the sets used
in the MinDataStore with Cuckoo Filters. Not sure how serialization
etc. will work out using broker, but if I remember correctly there is
a temporary solution for now.) If the worker detects a match, he
triggers match_no_items on the manager, who generates the hit by
combining the seen data of the worker and the meta data of the
DataStore.
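Schematically, the match path looks like this (simplified; signatures
approximated from base/frameworks/intel, not copied verbatim):

    # worker: the indicator is in the MinDataStore, but the meta data
    # is not available here, so hand the hit to the manager
    # (forwarded via Cluster::worker2manager_events)
    event Intel::match_no_items(s);

    # manager: combine the worker's seen data with the meta data held
    # in the DataStore and raise the actual hit
    event Intel::match_no_items(s: Intel::Seen)
        {
        event Intel::match(s, Intel::get_items(s));
        }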
At this point, if the manager functionality is distributed across
multiple data nodes, we have to make sure that every data node has the
right part of the DataStore to deal with the incoming hit. One could
keep the complete DataStore on every data node, but I think that would
lead to another scheme in which a subset of workers send all their
requests to a specific data node, i.e. each data node serves a part of
the cluster.

Back to the current implementation. So far not that complex, but there
are two more cases to deal with: inserting new intel items and
removing items. A new item can be inserted on the manager or on a
worker. As a new item might be just new meta data for an already
existing indicator (no update of the MinDataStore needed), the manager
is the only one who can handle the insert. So if inserted on a worker,
the worker triggers a cluster_new_item event on the manager, who
proceeds like he inserted the item. Finally, the manager only triggers
cluster_new_item on the workers if the inserted item was a new
indicator that has to be added to the workers' MinDataStores. Some of
the complexity here is due to the fact that the same event,
cluster_new_item, is used for communication in both directions
(worker2manager and manager2worker). The removal of items works more
or less the same, with the only difference that for each direction
there is a specific event (remove_item and purge_item).

Long story short: I think the built-in distribution across multiple
data nodes you discussed is a great idea. The only thing to keep in
mind would be a suitable way of "initializing" the data nodes with the
corresponding subset of data they need to handle. I guess in case of
the intel framework the manager will still handle reading the intel
files and might make use of the same mechanisms the workers use to
distribute the ingested data to the data nodes. The only thing I am
not sure about is how we can/should handle dynamic adding and removing
of the data nodes.

And just to avoid misunderstandings: We won't be able to get rid of the

    @if ( Cluster::local_node_type() != Cluster::MANAGER/DATANODE )

statements completely, as different node types have different
functionality. It's just about the communication API, right?

I hope this helps when thinking about the API design :)

Jan


From jsiwek at illinois.edu  Fri Nov  3 08:39:40 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Fri, 3 Nov 2017 15:39:40 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
Message-ID: <44D83140-9877-4B6C-8910-FB6C1D017BAA@illinois.edu>

> On Nov 3, 2017, at 5:51 AM, Jan Grashöfer wrote:
>
> And just to avoid misunderstandings: We won't be able to get rid of
> the @if ( Cluster::local_node_type() != Cluster::MANAGER/DATANODE )
> statements completely, as different node types have different
> functionality. It's just about the communication API, right?

The ability to write that code doesn't go away; I think it's just that
in some places we may have the ability to do something else that may
be easier to understand and/or less busy-work for script-writers to
implement.
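For a trivial (made-up) example of the kind of thing I mean, the
pattern that today reads

    redef Cluster::worker2manager_events += /Foo::bar/;

    @if ( Cluster::is_enabled() && Cluster::local_node_type() == Cluster::MANAGER )
    event Foo::bar(n: count)
        {
        # only the manager actually handles it
        }
    @endif

could, in some places, collapse into a single publish-style call that
behaves the same on a cluster and on a standalone node (exact names
still subject to the rest of this thread):

    local e = Broker::make_event(Foo::bar, n);
    Broker::publish(Cluster::manager_topic, e);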
As I port the handful of scripts that come with Bro, I'm generally
hesitant to radically change/reorganize the way they work. For obvious
comm. patterns that are repeated and can be replaced with something
simpler, I'd do that, but for complex scripts I'd likely try to just
do the simplest translation that gets it working.

I also want to make sure there's a good foundation for people to then
make further/larger changes, and it sounds like there will be: it's
mostly a matter of changing the cluster layout a bit and maybe giving
Justin a few more functions related to message
patterns/distribution. Once the thread slows, I'll post a concise
summary of the particular changes that I think are needed.

- Jon


From jazoff at illinois.edu  Fri Nov  3 10:07:09 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Fri, 3 Nov 2017 17:07:09 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
Message-ID: 

> On Nov 3, 2017, at 6:51 AM, Jan Grashöfer wrote:
>
> At this point, if the manager functionality is distributed across
> multiple data nodes, we have to make sure that every data node has the
> right part of the DataStore to deal with the incoming hit. One could
> keep the complete DataStore on every data node, but I think that would
> lead to another scheme in which a subset of workers send all their
> requests to a specific data node, i.e. each data node serves a part of
> the cluster.

Yeah, this is where the HRW (hashing) vs RR (round robin) pool
distribution methods come in. If all data nodes had a full copy of the
data store, then either distribution method would work.

Partitioning the intel data set is a little tricky, since it supports
subnets, and hashing 10.10.0.0/16 and 10.10.10.10 won't necessarily
give you the same node. Maybe subnets need to exist on all nodes but
everything else can be partitioned? There would also need to be a
method for re-distributing the data if the cluster configuration
changes due to nodes being added or removed.

'Each data node serving a part of a cluster' is kind of like what we
have now with proxies, but that is statically configured and has no
support for failover. I've seen cluster setups where there are 4
worker boxes that run one proxy on each box. The problem is that if
one box goes down, 1/4 of the workers on the remaining 3 boxes are
configured to use a proxy that no longer exists.

So minimally, just having a copy of the data in another process and
using RR would be an improvement.

There may be an issue with scaling out data nodes to 8+ processes for
things like scan detection and sumstats, if those 8 data nodes would
also need to have a full copy of the intel data in memory. I don't
know how much memory a large intel data set takes inside a running bro
process, though.

Things like scan detection, sumstats, and known
hosts/ports/services/certs are a lot easier to partition, because by
definition they are keyed on something.

-- 
Justin Azoff


From jan.grashoefer at gmail.com  Fri Nov  3 12:13:15 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Fri, 3 Nov 2017 20:13:15 +0100
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
Message-ID: 

On 03/11/17 18:07, Azoff, Justin S wrote:
> Partitioning the intel data set is a little tricky, since it supports
> subnets, and hashing 10.10.0.0/16 and 10.10.10.10 won't necessarily
> give you the same node. Maybe subnets need to exist on all nodes but
> everything else can be partitioned?

Good point! Subnets are stored kind of separately to allow prefix
matches anyway. However, I am a bit hesitant, as it would become a
quite complex setup.

> There would also need to be a method for re-distributing the data if
> the cluster configuration changes due to nodes being added or removed.

Right, that's exactly what I was thinking of. I guess this applies
also to other use cases which will use HRW. I am just not sure whether
dynamic layout changes are out of scope at the moment...

> 'Each data node serving a part of a cluster' is kind of like what we
> have now with proxies, but that is statically configured and has no
> support for failover. I've seen cluster setups where there are 4
> worker boxes that run one proxy on each box. The problem is that if
> one box goes down, 1/4 of the workers on the remaining 3 boxes are
> configured to use a proxy that no longer exists.
>
> So minimally, just having a copy of the data in another process and
> using RR would be an improvement.
>
> There may be an issue with scaling out data nodes to 8+ processes for
> things like scan detection and sumstats, if those 8 data nodes would
> also need to have a full copy of the intel data in memory. I don't
> know how much memory a large intel data set takes inside a running bro
> process, though.

Fully agreed! In that case it might be nice if one could define
separate special-purpose data nodes, e.g. "intel data nodes". But I am
not sure whether this is a good idea, as this might lead to complex
cluster definitions and poor usability, since users would need to know
a bit about how the underlying mechanisms work. On the other hand,
this would theoretically allow completely decoupling the intel data
store (e.g. interfacing a "real" database with some pybroker scripts).

Jan


From jsiwek at illinois.edu  Fri Nov  3 12:54:12 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Fri, 3 Nov 2017 19:54:12 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: 
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
Message-ID: <7805720E-20B9-4D67-AEC7-8417B6CCAE9A@illinois.edu>

> On Nov 3, 2017, at 2:13 PM, Jan Grashöfer wrote:
>
> Fully agreed! In that case it might be nice if one could define
> separate special-purpose data nodes, e.g. "intel data nodes". But I am
> not sure whether this is a good idea, as this might lead to complex
> cluster definitions and poor usability, since users would need to know
> a bit about how the underlying mechanisms work.
But, I am not sure > whether this is a good idea as this might lead to complex cluster > definitions and poor usability as users need to know a bit about how the > underlying mechanisms work. I had a similar thought, but also not sure if it?s a good idea. Example node.cfg: [data-1] type = data pools = Intel::pool [data-2] type = data pools = Intel::pool [data-3] type = data [data-4] type = data So there would be two pools here: Cluster::data_pool which is already predefined by the cluster framework (and consists of all data nodes that have not been specifically assigned to other pools) and Intel::pool which is defined/registered by the intel framework. Then there?s some magic that makes broctl set up those nodes so that they will belong to any pools listed in the config file and the cluster framework will manage it from there. So this gives users more opportunity to customize, but a problem is it?s hard to say whether the default config file will end up doing something sane for all cases or if you end up with script-writers having more complicated installation instructions like ?you should definitely change your node.cfg and don?t scale this pool out to more than N data nodes?. - Jon From jazoff at illinois.edu Fri Nov 3 13:05:10 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Fri, 3 Nov 2017 20:05:10 +0000 Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b) In-Reply-To: References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com> Message-ID: On Nov 3, 2017, at 3:13 PM, Jan Grash?fer > wrote: On 03/11/17 18:07, Azoff, Justin S wrote:> Partitioning the intel data set is a little tricky since it supports subnets and hashing 10.10.0.0/16 and 10.10.10.10 won't necessarily give you the same node. Maybe subnets need to exist on all nodes but everything else can be partitioned? Good point! Subnets are stored kind of separate to allow prefix matches anyway. However, I am a bit hesitant as it would become a quite complex setup. Indeed.. replication+load balancing is probably a good enough first step. There would also need to be a method for re-distributing the data if the cluster configuration changes due to nodes being added or removed. Right, that's exactly what I was thinking of. I guess this applies also to other use cases which will use HRW. I am just not sure whether dynamic layout changes are out of scope at the moment... Other use cases are still problematic, but even without replication/redistribution the situation is still greatly improved. Take scan detection for example: With sumstats/scan-ng/simple-scan if the current manager host or process dies, all detection comes to a halt until it is restarted. Once it is restarted, all state is lost so everything starts over from 0. If there were 4 data nodes participating in scan detection, and all 4 die, same result, so this is no better or worse than the current situation. If only one node dies though, only 1/4 of the analysis is affected. The remaining analysis can immediately fail over to the next node. So while it may still have to start from 0, there would only be a small hole in the analysis. For example: The scan threshold is 20 packets. A scan has just started from 10.10.10.10. 
From jazoff at illinois.edu  Fri Nov  3 13:05:10 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Fri, 3 Nov 2017 20:05:10 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>

> On Nov 3, 2017, at 3:13 PM, Jan Grashöfer wrote:
>
> Good point! Subnets are stored somewhat separately to allow prefix matches
> anyway. However, I am a bit hesitant, as it would become a quite complex setup.

Indeed.. replication+load balancing is probably a good enough first step.

> Right, that's exactly what I was thinking of. I guess this also applies to
> other use cases that will use HRW. I am just not sure whether dynamic layout
> changes are out of scope at the moment...

Other use cases are still problematic, but even without replication/redistribution the situation is still greatly improved. Take scan detection for example:

With sumstats/scan-ng/simple-scan, if the current manager host or process dies, all detection comes to a halt until it is restarted. Once it is restarted, all state is lost, so everything starts over from 0.

If there were 4 data nodes participating in scan detection and all 4 died, same result, so this is no better or worse than the current situation. If only one node dies though, only 1/4 of the analysis is affected. The remaining analysis can immediately fail over to the next node. So while it may still have to start from 0, there would only be a small hole in the analysis. For example:

The scan threshold is 20 packets. A scan has just started from 10.10.10.10.

- 10 packets into the scan, the data node that 10.10.10.10 hashes to crashes.
- HRW now routes data for 10.10.10.10 to another node.
- 30 packets into the scan, the threshold on the new node crosses 20 and a notice is raised.

Replication between data nodes could make this even more seamless, but it's not a huge priority, at least for me. My priority is getting the cluster to a point where things don't grind to a halt just because one component is down. Ignoring the worker->logger connections, it would look something like the attached layout.png.

> Fully agreed! In that case it might be nice if one could define separate
> special-purpose data nodes, e.g. "intel data nodes". But I am not sure
> whether this is a good idea, as it might lead to complex cluster definitions
> and poor usability, since users would need to know a bit about how the
> underlying mechanisms work. On the other hand, this would theoretically
> allow completely decoupling the intel data store (e.g. interfacing a "real"
> database with some pybroker scripts).
>
> Jan

I've been thinking the same thing, but I hope it doesn't come to that. Ideally people will be able to scale their clusters by just increasing the number of data nodes without having to get into the details about what node is doing what.

Partitioning the data analysis by task has been suggested, i.e., one data node for scan detection, one data node for spam detection, one data node for sumstats. I think this would be very easy to implement, but it doesn't do anything to help scale out those individual tasks once one process can no longer handle the load. You would just end up with something like the scan detection and spam data nodes at 20% CPU and the sumstats node CPU at 100%.

--
Justin Azoff

-------------- next part --------------
A non-text attachment was scrubbed...
Name: layout.png
Type: image/png
Size: 19088 bytes
Url: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20171103/4b31b911/attachment-0001.bin
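[The failover behavior Justin walks through falls out of rendezvous hashing (HRW) naturally: removing a node only remaps the keys that node owned. A self-contained sketch in Bro script; the node names, the function, and the use of md5_hash as the scoring hash are all assumptions for illustration, not the actual pool API:

global data_nodes: set[string] = { "data-1", "data-2", "data-3" };

# Pick the live node with the highest hash score for a key. Keys only
# move when the node they scored highest leaves the set.
function hrw_pick(key: string): string
    {
    local best = "";
    local best_score = "";

    for ( n in data_nodes )
        {
        # Equal-length hex digests compare correctly as strings.
        local score = md5_hash(key, n);
        if ( best == "" || score > best_score )
            {
            best = n;
            best_score = score;
            }
        }

    return best;
    }

event bro_init()
    {
    # Scan state for 10.10.10.10 consistently lands on one node...
    print hrw_pick(cat(10.10.10.10));

    # ...and fails over only when that node leaves the set, leaving
    # all other keys untouched.
    delete data_nodes[hrw_pick(cat(10.10.10.10))];
    print hrw_pick(cat(10.10.10.10));
    }
]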
From jan.grashoefer at gmail.com  Mon Nov  6 05:18:06 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Mon, 6 Nov 2017 14:18:06 +0100
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com>
Message-ID: <546059cf-a97b-2767-1ef0-b7ad3bb46e36@gmail.com>

On 03/11/17 21:05, Azoff, Justin S wrote:
> I've been thinking the same thing, but I hope it doesn't come to that. Ideally people will be able
> to scale their clusters by just increasing the number of data nodes without having to get into
> the details about what node is doing what.
>
> Partitioning the data analysis by task has been suggested, i.e., one data node for scan detection,
> one data node for spam detection, one data node for sumstats. I think this would be very easy to
> implement, but it doesn't do anything to help scale out those individual tasks once one process
> can no longer handle the load. You would just end up with something like the scan detection and
> spam data nodes at 20% CPU and the sumstats node CPU at 100%.

I would keep the particular data services scalable but allow the user to specify their distribution across the data nodes. As Jon already wrote, it could look like this (I added Spam and Scan pools):

[data-1]
type = data
pools = Intel::pool

[data-2]
type = data
pools = Intel::pool, Scan::pool

[data-3]
type = data
pools = Scan::pool, Spam::pool

[data-4]
type = data
pools = Spam::pool

However, this approach likely results in confusing config files and, as Jon wrote, it's hard to define a default configuration. In the end this is an optimization problem: how to assign data services (pools) to data nodes to get the best performance (in terms of speed, memory usage and reliability)? I guess there are two possible approaches:

1) Let the user do the optimization, i.e. provide a possibility to assign data services to data nodes as described above.

2) Let the developer specify constraints for the data-service distribution across data nodes and automate the optimization. The minimal example would be that for each data service a minimum and maximum or default number of data nodes is specified (e.g. Intel on 1-2 nodes and scan detection on all available nodes). More complex specifications could require that a data service isn't scheduled on data nodes together with (particular) other services.

Another thing that might need to be considered are deep clusters. If I remember correctly, there has been some work on that in the context of Broker. For a deep cluster there might even be hierarchies of data nodes (e.g. root intel nodes managing the whole database and second-level data nodes serving as caches for worker nodes on a per-site level).

Jan
From seth at corelight.com  Mon Nov  6 06:16:55 2017
From: seth at corelight.com (Seth Hall)
Date: Mon, 06 Nov 2017 09:16:55 -0500
Subject: [Bro-Dev] Scientific notation?

Right now, Bro will print scientific notation in JSON logs, but we've always tended to avoid it in the standard Bro log format. What does everyone think about switching to allow scientific notation in the standard log format? Daniel recently did some exploration of various versions of awk, and they all support scientific notation (I think that was part of my concern a long time ago).

Thoughts?

  .Seth

--
Seth Hall * Corelight, Inc * www.corelight.com

From robin at icir.org  Mon Nov  6 09:33:06 2017
From: robin at icir.org (Robin Sommer)
Date: Mon, 6 Nov 2017 09:33:06 -0800
Subject: [Bro-Dev] Scientific notation?
Message-ID: <20171106173306.GA59680@icir.org>

On Mon, Nov 06, 2017 at 09:16 -0500, you wrote:

> versions of awk and they all support scientific notation

I'm wondering if that's true for other log parsers as well. The main thing I'd want to avoid is breaking people's existing scripts. We could make it an option?

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From dnthayer at illinois.edu  Mon Nov  6 10:10:47 2017
From: dnthayer at illinois.edu (Daniel Thayer)
Date: Mon, 6 Nov 2017 12:10:47 -0600
Subject: [Bro-Dev] Scientific notation?

On 11/6/17 8:16 AM, Seth Hall wrote:
> Right now, Bro will print scientific notation in JSON logs, but we've
> always tended to avoid it in the standard Bro log format. What does
> everyone think about switching to allow scientific notation in the
> standard log format? Daniel recently did some exploration of various
> versions of awk, and they all support scientific notation (I think that
> was part of my concern a long time ago).
>
> Thoughts?
>
>   .Seth

Actually, right now Bro uses scientific notation in JSON logs only for very large values (such as 3.1e+15). For values very close to zero (such as 1.2e-7), Bro will write "0" to a JSON log.

From jsiwek at illinois.edu  Mon Nov  6 10:41:00 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Mon, 6 Nov 2017 18:41:00 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <546059cf-a97b-2767-1ef0-b7ad3bb46e36@gmail.com>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com> <546059cf-a97b-2767-1ef0-b7ad3bb46e36@gmail.com>
Message-ID: <293E6935-B20B-432D-862C-EC10F66BB36F@illinois.edu>

> 2) Let the developer specify constraints for the data-service
> distribution across data nodes and automate the optimization. The
> minimal example would be that for each data service a minimum and
> maximum or default number of data nodes is specified (e.g. Intel on 1-2
> nodes and scan detection on all available nodes). More complex
> specifications could require that a data service isn't scheduled on data
> nodes together with (particular) other services.

I like the idea of having some algorithm that can automatically allocate nodes into pools and think maybe it could also be done in a way that provides a sane default yet is still customizable enough for users, at least for the most common use-cases.

It seems so far we can roughly group the needs of script developers into 2 categories: they either have a data set that can trivially be partitioned across data nodes or they have a data set that can't. The best we can provide for the latter is replication/redundancy and also giving them exclusive/isolated reign of a node or set of nodes. An API that falls out from that is:

type Cluster::Pool: record {
    # mostly opaque...
};

type Cluster::PoolSpec: record {
    topic: string;
    node_type: Cluster::node_type &default = Cluster::DATA;
    max_nodes: int &default = -1;  # negative number means "all available nodes"
    exclusive: bool &default = F;
};

global Cluster::register_pool(spec: PoolSpec): Pool;

Example script-usage:

global Intel::pool: Cluster::Pool;

const Intel::max_pool_nodes = +2 &redef;
const Intel::use_exclusive_pool_nodes = F &redef;

const Intel::pool_spec = Cluster::PoolSpec(
    $topic = "bro/cluster/pool/intel",
    $max_nodes = Intel::max_pool_nodes,
    $exclusive = Intel::use_exclusive_pool_nodes
) &redef;

event bro_init()
    {
    Intel::pool = Cluster::register_pool(Intel::pool_spec);
    }

And other scripts would be similar, except their default $max_nodes is still -1, using all available nodes. I think this makes the user experience also straightforward: the default configuration will always be functional, and the scaling procedure is still mostly "just add more data nodes" and occasionally either "toggle the $exclusive flag" or "increase $max_nodes" depending on the user's circumstance. The latter options don't necessarily address the fundamental scaling issue for the user completely, but it seems like maybe the best we can do, at least at this level of abstraction.

- Jon
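[For a sense of the operator-side experience under this proposal, tuning the intel pool would stay within ordinary redefs, e.g. in local.bro. All of these names come from the sketch above and don't exist yet:

# Give the intel pool up to four data nodes, reserved exclusively for it.
redef Intel::max_pool_nodes = +4;
redef Intel::use_exclusive_pool_nodes = T;

A script could then keep all key-based routing behind its pool handle, say via a hypothetical Cluster::send_hrw(pool, key, ...) helper, so application code never deals with individual node names.]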
From jmellander at lbl.gov  Mon Nov  6 12:03:16 2017
From: jmellander at lbl.gov (Jim Mellander)
Date: Mon, 6 Nov 2017 12:03:16 -0800
Subject: [Bro-Dev] Scientific notation?

How about a user-redef'able format string for doubles in logs? Even more flexible would be to make it a function. Let the user decide the format they need, and adapt their scripts accordingly, with the default being the current behavior.

On Mon, Nov 6, 2017 at 10:10 AM, Daniel Thayer wrote:

> On 11/6/17 8:16 AM, Seth Hall wrote:
> > Right now, Bro will print scientific notation in JSON logs, but we've
> > always tended to avoid it in the standard Bro log format. What does
> > everyone think about switching to allow scientific notation in the
> > standard log format? Daniel recently did some exploration of various
> > versions of awk, and they all support scientific notation (I think that
> > was part of my concern a long time ago).
> >
> > Thoughts?
> >
> >   .Seth
>
> Actually, right now Bro uses scientific notation in JSON logs only
> for very large values (such as 3.1e+15). For values very close to
> zero (such as 1.2e-7), Bro will write "0" to a JSON log.

From seth at corelight.com  Tue Nov  7 06:46:26 2017
From: seth at corelight.com (Seth Hall)
Date: Tue, 07 Nov 2017 09:46:26 -0500
Subject: [Bro-Dev] Scientific notation?

On 6 Nov 2017, at 15:03, Jim Mellander wrote:

> How about a user-redef'able format string for doubles in logs? Even more
> flexible would be to make it a function. Let the user decide the format
> they need, and adapt their scripts accordingly, with the default being
> the current behavior.

Ah, I like that idea. I think the current logs (both JSON formatted and normal Bro format) are fine for most people, but for the people that actually want doubles displayed differently, this could give them that option.

  .Seth

--
Seth Hall * Corelight, Inc * www.corelight.com
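[A sketch of what Jim's knob could look like. Neither the option nor its wiring into the ASCII writer exists today, so every name here is hypothetical:

module LogAscii;

export {
    ## Hypothetical: printf-style format applied to every double field
    ## the ASCII writer emits ("%g" would permit scientific notation,
    ## "%.6f" would forbid it).
    const double_format = "%.6f" &redef;
}

# The writer would then render doubles roughly as:
#   fmt(double_format, d)
# so a site wanting scientific notation could simply:
#   redef LogAscii::double_format = "%g";

Making it a function instead, as Jim suggests, would additionally let a site special-case individual ranges, e.g. clamping near-zero values rather than printing 0.]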
From jsiwek at illinois.edu  Wed Nov  8 10:46:36 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Wed, 8 Nov 2017 18:46:36 +0000
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
In-Reply-To: <293E6935-B20B-432D-862C-EC10F66BB36F@illinois.edu>
References: <201710271803.v9RI3oSQ001411@bro-ids.icir.org> <20171031181607.GB26741@icir.org> <20171101212333.GA29502@icir.org> <2336CB3B-DAA8-474F-83FC-252A59598DC4@illinois.edu> <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com> <546059cf-a97b-2767-1ef0-b7ad3bb46e36@gmail.com> <293E6935-B20B-432D-862C-EC10F66BB36F@illinois.edu>

Just a quick summary of key points of this thread related to cluster layout, messaging patterns, and API (omitting some minor stuff from Robin's initial feedback).

- "proxy" nodes will be renamed at a later point toward the end of the project
  ("proxy" actually makes sense to me, but "data" seems to have caught on,
  so I'll go w/ that unless there are other suggestions)

- "data" nodes will connect within clusters differently than previous "proxy"
  nodes. Each worker connects to every data node. Data nodes do not connect
  with each other.

- instead of sending logs statically to Broker::log_topic, there will now be
  a "const Broker::log_topic_func = function(id: Log::ID, path: string) &redef"
  to better support multiple loggers and failover use-cases

- add new, explicit message routing or one-hop relaying (e.g. for the simple
  use-case of "broadcast from this worker to all workers")

- add a more flexible pool membership API to let scripters define their own data
  pool constraints that users can then customize (outlined in a previous email)

Let me know if I missed anything.

- Jon

From robin at icir.org  Thu Nov  9 09:56:57 2017
From: robin at icir.org (Robin Sommer)
Date: Thu, 9 Nov 2017 09:56:57 -0800
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/actor-system: First-pass broker-enabled Cluster scripting API + misc. (07ad06b)
References: <20171102183744.GE33335@MacPro-2331.local> <3456002A-BA7E-4749-9ABB-0C7A03E9F5EC@illinois.edu> <80b63959-df91-2beb-76af-ea8d67a253ee@gmail.com> <546059cf-a97b-2767-1ef0-b7ad3bb46e36@gmail.com> <293E6935-B20B-432D-862C-EC10F66BB36F@illinois.edu>
Message-ID: <20171109175657.GF72785@icir.org>

Sounds good to me. We should probably label the new parts experimental for now, as I'm sure we'll iterate some more as people get experience with them.

Robin

On Wed, Nov 08, 2017 at 18:46 +0000, you wrote:

> Just a quick summary of key points of this thread related to cluster layout,
> messaging patterns, and API (omitting some minor stuff from Robin's initial
> feedback).
>
> - "proxy" nodes will be renamed at a later point toward the end of the project
>
> - "data" nodes will connect within clusters differently than previous "proxy"
>   nodes. Each worker connects to every data node. Data nodes do not connect
>   with each other.
>
> - instead of sending logs statically to Broker::log_topic, there will now be
>   a "const Broker::log_topic_func = function(id: Log::ID, path: string) &redef"
>   to better support multiple loggers and failover use-cases
>
> - add new, explicit message routing or one-hop relaying
>
> - add a more flexible pool membership API to let scripters define their own
>   data pool constraints that users can then customize
>
> Let me know if I missed anything.
>
> - Jon

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
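[To illustrate the log_topic_func item from the summary: once the redef-able function lands as described, sharding log writes across two loggers could look roughly like the following. The topic strings and the path-based split are made up for the example:

function my_log_topic(id: Log::ID, path: string): string
    {
    # Keep the high-volume conn/dns streams on one logger,
    # everything else on the other.
    if ( path == "conn" || path == "dns" )
        return "bro/logs/logger-1";

    return "bro/logs/logger-2";
    }

redef Broker::log_topic_func = my_log_topic;

A failover-aware version could instead consult the set of currently connected loggers before picking a topic.]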
From logan at blackhillsinfosec.com  Tue Nov 28 12:40:28 2017
From: logan at blackhillsinfosec.com (Logan Lembke)
Date: Tue, 28 Nov 2017 13:40:28 -0700
Subject: [Bro-Dev] Log::set_buf Ineffective Before First Write

Hello,

I am currently writing a Bro IDS logging plugin for logging to MongoDB. We have implemented both buffered and unbuffered writes and rely on WriterBackend::DoSetBuf being called in order to switch between the approaches. Currently, we use a Bro script which attaches our plugin to the Conn log and calls Log::set_buf in order to configure the buffering behavior. However, DoSetBuf never gets called on our plugin.

In Manager.cc, Manager::SetBuf loops over the list of writers registered with a given stream and calls the SetBuf method on each of the WriterFrontends. Unfortunately, this list of registered writers is empty before the first write, as writers are initialized as they are needed in the Manager::Write method. Effectively, this prevents configuring buffering behavior before the first write occurs.

I'm new to the Bro code base, but I believe a fix could be made by storing the buffering behavior on the stream and checking this behavior on writer initialization.

Here is the Bro script I am currently using; the source code for the plugin is at https://github.com/ocmdev/bro-mongodb/tree/optionalBuffer.

Does this look like a valid problem?

Logan
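[The script itself did not survive the archive's HTML scrubbing. A minimal script consistent with Logan's description might look like the following; the writer enum name is whatever the MongoDB plugin registers, with Log::WRITER_MONGODB assumed here:

event bro_init()
    {
    # Point the Conn log's default filter at the MongoDB writer...
    local f = Log::get_filter(Conn::LOG, "default");
    f$writer = Log::WRITER_MONGODB;
    Log::add_filter(Conn::LOG, f);

    # ...and request unbuffered writes. Per the report, this call never
    # reaches WriterBackend::DoSetBuf, because no writer frontend exists
    # yet at bro_init() time.
    Log::set_buf(Conn::LOG, F);
    }
]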
From 740052959 at qq.com  Wed Nov 29 04:13:58 2017
From: 740052959 at qq.com (Neil)
Date: Wed, 29 Nov 2017 20:13:58 +0800
Subject: [Bro-Dev] How to analyse the source code?

I want to make some modifications to Bro. However, I don't know how to analyse the source code. Can you offer some documentation about the source code? Which tools did you use to edit and compile the code?

Thank you very much!

From jan.grashoefer at gmail.com  Wed Nov 29 04:35:21 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Wed, 29 Nov 2017 13:35:21 +0100
Subject: [Bro-Dev] Log::set_buf Ineffective Before First Write

Hi Logan,

On 28/11/17 21:40, Logan Lembke wrote:
> Effectively, this prevents configuring buffering behavior before the first
> write occurs. I'm new to the Bro code base, but I believe a fix could be
> made by storing the buffering behavior on the stream and checking this
> behavior on writer initialization.
>
> ...
>
> Does this look like a valid problem?

I haven't had a look at your code, but the issue reminds me of https://bro-tracker.atlassian.net/browse/BIT-1441. If you continue digging into this, you might be able to solve both issues.

Jan

From johanna at corelight.com  Wed Nov 29 14:02:24 2017
From: johanna at corelight.com (Johanna Amann)
Date: Wed, 29 Nov 2017 14:02:24 -0800
Subject: [Bro-Dev] Feedback on configuration framework implementation
Message-ID: <20171129220224.ciiekzsc36of57r6@Trafalgar.local>

Hello everyone,

the branch topic/johanna/config contains an implementation of the configuration framework as it was discussed in an earlier thread on this list. GitHub link: https://github.com/bro/bro/compare/topic/johanna/config

The implementation is basically what we discussed in the earlier thread, with some additional components like a reader for configuration values and a script-level framework.

It would be great if people could take a look at all of this and see if it makes sense, or if they see any problems with the implementation as it is at the moment.

In the rest of the mail I will go into a bit more detail and describe the different parts of this change. Note that the rest of this email will be very similar to the git commit message, which also describes this change :)

The configuration framework consists of three mostly distinct parts:

* option variables
* the config reader
* the script-level framework

option variables
================

The option keyword allows variables to be specified as run-time options. Such variables cannot be changed using normal assignments. Instead, they can be changed using Option::set. It is possible to "subscribe" to options and be notified when an option value changes.

Change handlers can also change values before they are applied; this gives them the opportunity to reject changes. Priorities can be specified if there are several handlers for one option.

Example script:

option testbool: bool = T;

function option_changed(ID: string, new_value: bool): bool
    {
    print fmt("Value of %s changed from %s to %s", ID, testbool, new_value);
    return new_value;
    }

event bro_init()
    {
    print "Old value", testbool;
    Option::set_change_handler("testbool", option_changed);
    Option::set("testbool", F);
    print "New value", testbool;
    }

config reader
=============

The config reader provides a way to read configuration files back into Bro. Most importantly, it automatically converts values to the correct types. This is important because it is at least inconvenient (and sometimes near impossible) to perform the necessary type conversions in Bro scripts themselves. This is especially true for sets/vectors.

Configuration entries generally look like this:

[option name][tab/spaces][new variable value]

so, for example:

testaddr 2607:f8b0:4005:801::200e
testinterval 60
testtime 1507321987
test_set a b c d erdbeerschnitzel

The reader uses the option name to look up the type that variable has in the Bro core and automatically converts the value to the correct type. Example script use:

type Idx: record {
    option_name: string;
};

type Val: record {
    option_val: string;
};

global currconfig: table[string] of string = table();

event InputConfig::new_value(name: string, source: string, id: string, value: any)
    {
    print id, value;
    }

event bro_init()
    {
    Input::add_table([$reader=Input::READER_CONFIG, $source="../configfile",
                      $name="configuration", $idx=Idx, $val=Val,
                      $destination=currconfig, $want_record=F]);
    }

Script-level config framework
=============================

The script-level framework ties these two features together and makes them a bit more convenient to use. Configuration files can simply be specified by placing them into Config::config_files. The framework also creates a config.log that shows all value changes that took place.

Usage example:

redef Config::config_files += { configfile };

export {
    option testbool: bool = F;
}

The file is now monitored for changes; when a change occurs, the respective option values are automatically updated and the value change is written to config.log.

Other changes
=============

Internally, this commit also performs a range of changes to the input manager; it marks a lot of functions as const and introduces a new ValueToVal method (which could in theory replace the already existing one; it is a bit more powerful). It also changes SerialTypes to have a subtype for Values, just as Fields already have one; I think it was mostly an oversight that this was not introduced from the beginning. This should not necessitate any code changes for people already using SerialTypes.

Johanna
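[One detail the example above doesn't show is handler priorities. Going by the description, something along these lines should work; it assumes the third argument of Option::set_change_handler is the priority and that higher priorities run first:

option testcount: count = 10;

# Runs first: normalize the proposed value.
function clamp(ID: string, new_value: count): count
    {
    return new_value > 100 ? 100 : new_value;
    }

# Runs afterwards: reject a change by returning the current value.
function veto_zero(ID: string, new_value: count): count
    {
    return new_value == 0 ? testcount : new_value;
    }

event bro_init()
    {
    Option::set_change_handler("testcount", clamp, 10);
    Option::set_change_handler("testcount", veto_zero, 5);
    Option::set("testcount", 1000);  # ends up as 100
    }
]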
From jan.grashoefer at gmail.com  Thu Nov 30 09:57:57 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Thu, 30 Nov 2017 18:57:57 +0100
Subject: [Bro-Dev] Feedback on configuration framework implementation
In-Reply-To: <20171129220224.ciiekzsc36of57r6@Trafalgar.local>
References: <20171129220224.ciiekzsc36of57r6@Trafalgar.local>
Message-ID: <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com>

On 29/11/17 23:02, Johanna Amann wrote:
> Hello everyone,
>
> the branch topic/johanna/config contains an implementation of the
> configuration framework as it was discussed in an earlier thread on this
> list. GitHub link: https://github.com/bro/bro/compare/topic/johanna/config

Nice! I am curious to see all the usability improvements that can be built on top of this.

> It would be great if people could take a look at all of this and see if
> it makes sense, or if they see any problems with the implementation as
> it is at the moment.

Having had a quick look at the documentation, everything seems well thought-out to me. I have just two small questions:

> option variables
> ================
>
> The option keyword allows variables to be specified as run-time options.
> Such variables cannot be changed using normal assignments. Instead, they
> can be changed using Option::set. It is possible to "subscribe" to
> options and be notified when an option value changes.
>
> Change handlers can also change values before they are applied; this
> gives them the opportunity to reject changes. Priorities can be
> specified if there are several handlers for one option.

1. Thinking of handlers that may change values and are associated with a priority, hooks come to my mind (e.g. Intel::extend_match). Are functions preferable compared to hooks here?

> config reader
> =============
>
> The config reader provides a way to read configuration files back into
> Bro. Most importantly, it automatically converts values to the correct
> types. This is important because it is at least inconvenient (and
> sometimes near impossible) to perform the necessary type conversions in
> Bro scripts themselves. This is especially true for sets/vectors.
>
> Configuration entries generally look like this:
>
> [option name][tab/spaces][new variable value]

2. Are module namespaces part of the option name (e.g. "Notice::reply_to" vs. "reply_to")?

Thanks,
Jan
From johanna at corelight.com  Thu Nov 30 10:01:19 2017
From: johanna at corelight.com (Johanna Amann)
Date: Thu, 30 Nov 2017 10:01:19 -0800
Subject: [Bro-Dev] Feedback on configuration framework implementation
In-Reply-To: <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com>
References: <20171129220224.ciiekzsc36of57r6@Trafalgar.local> <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com>
Message-ID: <20171130180119.ssqmgzy4hgosrlht@Beezling.local>

> 1. Thinking of handlers that may change values and are associated with a
> priority, hooks come to my mind (e.g. Intel::extend_match). Are
> functions preferable compared to hooks here?

In this case - yes. The problem with hooks is that they cannot return a value, which is used here to let users change (or reject) changes to options. :)

> > config reader
> > =============
> >
> > The config reader provides a way to read configuration files back into
> > Bro. Most importantly, it automatically converts values to the correct
> > types. This is important because it is at least inconvenient (and
> > sometimes near impossible) to perform the necessary type conversions in
> > Bro scripts themselves. This is especially true for sets/vectors.
> >
> > Configuration entries generally look like this:
> >
> > [option name][tab/spaces][new variable value]
>
> 2. Are module namespaces part of the option name (e.g.
> "Notice::reply_to" vs. "reply_to")?

Yes.

Johanna

From jan.grashoefer at gmail.com  Thu Nov 30 10:22:27 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Thu, 30 Nov 2017 19:22:27 +0100
Subject: [Bro-Dev] Feedback on configuration framework implementation
In-Reply-To: <20171130180119.ssqmgzy4hgosrlht@Beezling.local>
References: <20171129220224.ciiekzsc36of57r6@Trafalgar.local> <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com> <20171130180119.ssqmgzy4hgosrlht@Beezling.local>

On 30/11/17 19:01, Johanna Amann wrote:
>> 1. Thinking of handlers that may change values and are associated with a
>> priority, hooks come to my mind (e.g. Intel::extend_match). Are
>> functions preferable compared to hooks here?
>
> In this case - yes. The problem with hooks is that they cannot return a
> value, which is used here to let users change (or reject) changes to
> options. :)

The Intel::extend_match hook allows changing values or rejecting as well. If the "chain of hooks" is "broken", i.e. one handler executes a break statement, the call to the hook returns false and (in that case) the log write is rejected. Otherwise, all changes made to the hook arguments inside the handlers are propagated, allowing changes.

Jan
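[A generic illustration of the hook semantics Jan describes; this is a self-contained sketch, not the actual Intel::extend_match signature. Handlers can mutate record arguments in place, and a break makes the hook call evaluate to false, which the caller can treat as a veto:

type Proposal: record {
    val: count;
};

global validate: hook(p: Proposal);

hook validate(p: Proposal)
    {
    p$val = p$val * 2;  # in-place change propagates to the caller

    if ( p$val > 100 )
        break;          # veto: the hook call evaluates to F
    }

event bro_init()
    {
    local p = Proposal($val = 3);

    if ( hook validate(p) )
        print "accepted", p$val;  # prints: accepted, 6
    else
        print "rejected";
    }
]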
From johanna at corelight.com  Thu Nov 30 10:28:19 2017
From: johanna at corelight.com (Johanna Amann)
Date: Thu, 30 Nov 2017 10:28:19 -0800
Subject: [Bro-Dev] Feedback on configuration framework implementation
References: <20171129220224.ciiekzsc36of57r6@Trafalgar.local> <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com> <20171130180119.ssqmgzy4hgosrlht@Beezling.local>
Message-ID: <9F9FE5A5-044F-4AEA-B4AE-6F39F66AA18E@corelight.com>

On 30 Nov 2017, at 10:22, Jan Grashöfer wrote:

> The Intel::extend_match hook allows changing values or rejecting as
> well. If the "chain of hooks" is "broken", i.e. one handler executes a
> break statement, the call to the hook returns false and (in that case)
> the log write is rejected. Otherwise, all changes made to the hook
> arguments inside the handlers are propagated, allowing changes.

Ah, you have a point there; it is possible to do it like that. I did not think of that. I honestly also never liked modifying the values that are passed in arguments; this is, for example, also theoretically possible for events, but something that we have avoided using in practice so far.

Functionally they are, however, obviously equivalent. I think I prefer functions in this case from a stylistic point of view. I am happy to change it over to hooks, though, if there is a consensus that that is the more fitting approach here. :)

Johanna

From jan.grashoefer at gmail.com  Thu Nov 30 11:04:39 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Thu, 30 Nov 2017 20:04:39 +0100
Subject: [Bro-Dev] Feedback on configuration framework implementation
In-Reply-To: <9F9FE5A5-044F-4AEA-B4AE-6F39F66AA18E@corelight.com>
References: <20171129220224.ciiekzsc36of57r6@Trafalgar.local> <28946f51-7733-73e5-9c13-839e5dcf8ac4@gmail.com> <20171130180119.ssqmgzy4hgosrlht@Beezling.local> <9F9FE5A5-044F-4AEA-B4AE-6F39F66AA18E@corelight.com>
Message-ID: <7bd65df4-3c7a-46be-0fc2-f29334bdbba5@gmail.com>

On 30/11/17 19:28, Johanna Amann wrote:
> I think I prefer functions in this case from a stylistic point of view.
> I am happy to change it over to hooks, though, if there is a consensus
> that that is the more fitting approach here. :)

I like the hook approach that is used in the intel framework, but as I am biased, I will abstain ;)

Jan