From seth at icir.org Wed Feb 1 07:00:23 2017 From: seth at icir.org (Seth Hall) Date: Wed, 1 Feb 2017 10:00:23 -0500 Subject: [Bro-Dev] [Bro] ActiveHTTP In-Reply-To: <50DE7D75-FA15-46DA-A6A8-FD13B6DDBF90@pingtrip.com> References: <92596923-DDD9-4879-8E38-5776154B4ADF@pingtrip.com> <50DE7D75-FA15-46DA-A6A8-FD13B6DDBF90@pingtrip.com> Message-ID: > On Jan 28, 2017, at 9:15 AM, Dave Crawford wrote: > > And the second print doesn't execute: > > $ bro -r test.pcap local ../test.bro > > Entering the ActiveHTTP::Request when() block... > > I have 'exit_only_after_terminate' set to true so it just hangs at this point until I ctrl-c and I see the tmp files deleted. Following on this ticket from the main Bro list, I wonder if we could change the behavior of Bro slightly to make what Dave tried work? I *think* the problem here is that once the packets run out, Bro's internal clock stops moving forward which causes all sorts of trouble for timers and other stuff I'm sure. What does everyone think about making the clock continue to move forward even after the packet source runs dry? This especially makes sense when someone uses pseudo-realtime because we can keep moving the clock at the rate it was moving (but not jump to current time, we'd just do subtraction based on the time when the packet source ran dry). The main problem I see with this idea is if someone reads a PCAP at full speed, what rate do we make the clock continue ticking? Does this idea make sense at all? I think we've had too many new Bro programmers get frustrated with this behavior which worries me a little bit. .Seth -- Seth Hall International Computer Science Institute (Bro) because everyone has a network http://www.bro.org/
From seth at icir.org Wed Feb 1 07:02:45 2017 From: seth at icir.org (Seth Hall) Date: Wed, 1 Feb 2017 10:02:45 -0500 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <20170131224145.GH77787@icir.org> References: <20170131224145.GH77787@icir.org> Message-ID: <783DDBE2-2A8F-4911-8A1F-AED541C4F726@icir.org> > On Jan 31, 2017, at 5:41 PM, Robin Sommer wrote: > > For RemoteSerializer, log events and log > filters run only on the originating nodes; those guys make all > decisions about what's getting logged exactly and they then send that > on to the manager, which just writes out the data it receives. Just to pile onto this, I think it should be this way too. I'd really like to avoid script code executing on the logger. .Seth -- Seth Hall International Computer Science Institute (Bro) because everyone has a network http://www.bro.org/
From jsiwek at illinois.edu Sun Feb 5 14:04:04 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Sun, 5 Feb 2017 22:04:04 +0000 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> References: <20170131224145.GH77787@icir.org> <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> Message-ID: > On Jan 31, 2017, at 5:41 PM, Azoff, Justin S wrote: > >> I'm wondering if there's a reason that in the Broker case things >> *have* to be this way. Is there something that prevents the Broker >> manager from doing the same as the RemoteSerializer? > > Jon would know best, but I'd guess one form was more convenient to use than the other and it may have been assumed that they both did the same thing.
I think I was aware of the differences and went ahead with that approach because there's the extra technical work of writing code to convert value types as Robin mentions and also it's conceptually more flexible than the old way. I understand the argument that the old semantics (manager not running log events/filters) may be more performant, though I'd consider whether the internal comm. framework or the base/user scripts should be the one to decide. I think the latter is better, so the problem breaks down into (1) does the user have the ability to fully control whether log events/filters run on any given node via scripts? and (2) are the default settings/scripts sane for the common use-case? (1) is likely true, so (2) sounds like it needs to be fixed. Just a different idea on how to approach solving the issue without having to touch the framework's internals. (it's been a while, hope it's not way off base) - Jon
From robin at icir.org Mon Feb 6 08:43:39 2017 From: robin at icir.org (Robin Sommer) Date: Mon, 6 Feb 2017 08:43:39 -0800 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: References: <20170131224145.GH77787@icir.org> <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> Message-ID: <20170206164339.GA40228@icir.org> On Sun, Feb 05, 2017 at 22:04 +0000, you wrote: > I understand the argument that the old semantics (manager not running > log events/filters) may be more performant, though I'd consider > whether the internal comm. framework or the base/user scripts should > be the one to decide. I agree that generally it'd be nice to be able to do it either way. However, I'm pretty sure at this point that we need the separate high-performance path that the old communication introduced, for the reasons discussed in this thread and also for consistency. I'm working on adding that code, and I think it should be the standard model, just as it is currently. In addition to that, one can always send log_* events around and then do custom processing on the receiving side. That's not quite as transparent as "normal" log messages would be with their configuration and filters, but that might actually be a good thing: if we actually had both mechanisms (sender- and receiver-side filtering) built transparently into the logging framework, it could end up being quite confusing what's used when. I propose that for now we make Broker work like the current model, and then we go from there. If we need more, we can add that later. The fewer semantic differences we have between old and new communication, the easier the switch will be. Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
From jsiwek at illinois.edu Mon Feb 6 18:46:42 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Tue, 7 Feb 2017 02:46:42 +0000 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <20170206164339.GA40228@icir.org> References: <20170131224145.GH77787@icir.org> <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> <20170206164339.GA40228@icir.org> Message-ID: <296FFE37-DCA9-4F4D-90BB-6CC621EFC1E4@illinois.edu> > On Feb 6, 2017, at 10:43 AM, Robin Sommer wrote: > > I propose that for now we make Broker work like the current model, and > then we go from there. If we need more, we can add that later. The > fewer semantic differences we have between old and new communication, > the easier the switch will be. Yeah, I agree that it should behave the same. Was just also trying to reframe the problem to give ideas for solutions that give back old semantics (i.e.
via script-level mechanisms). No big deal, since there's no demand for it. > In addition to that, one can always send log_* > events around and then do custom processing on the receiving side. > That's not quite as transparent as "normal" log messages would be with > their configuration and filters, but that might actually be a good > thing: if we actually had both mechanisms (sender- and receiver-side > filtering) built transparently into the logging framework, it could > end up being quite confusing what's used when. It was actually always confusing to me that a remote log entry versus a local log entry would be processed differently regarding the log_* events. The event is a property of the Log::Stream definition and the logging API or docs don't distinguish between outgoing versus incoming log entries there, or do they? Or is a Stream meant to be thought of from only the perspective of an outgoing sequence of log entries? Could be I misunderstood the log framework the whole time, and that's why broker behaved the way it did :) - Jon
From robin at icir.org Tue Feb 7 08:19:26 2017 From: robin at icir.org (Robin Sommer) Date: Tue, 7 Feb 2017 08:19:26 -0800 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <296FFE37-DCA9-4F4D-90BB-6CC621EFC1E4@illinois.edu> References: <20170131224145.GH77787@icir.org> <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> <20170206164339.GA40228@icir.org> <296FFE37-DCA9-4F4D-90BB-6CC621EFC1E4@illinois.edu> Message-ID: <20170207161926.GE43519@icir.org> On Tue, Feb 07, 2017 at 02:46 +0000, you wrote: > It was actually always confusing to me that a remote log entry versus > a local log entry would be processed differently regarding the log_* > events. I know, it's a bit confusing. Some of that is historic and part of trying to maintain semantics as things were evolving (both logging framework and communication; quite similar actually to what we've been discussing here: what should be done where). It all came out of "remote printing" where a print-statement would just send what it would normally print into a file, over to another node---that means everything was fully processed already as it was received. The other part is the performance optimization: special-casing log transmission for batching and volume, so that it doesn't become a bottleneck. Thinking about it as just outgoing entries fits it best I think. On the receiving side, the entries don't "really" enter the full logging framework, they just take a fast path directly into the writers. One thing I'm doing is renaming methods to make that bit clearer; the two Write() methods are clearly misleading. Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
From justin.oursler at gmail.com Wed Feb 8 12:26:54 2017 From: justin.oursler at gmail.com (Justin Oursler) Date: Wed, 8 Feb 2017 15:26:54 -0500 Subject: [Bro-Dev] Packet Signature, Protocol, and Analyzer Relationship Message-ID: Hello, I am writing a new analyzer and plugin for a TCP application protocol. Can someone help explain the relationship among the protocol, the analyzer, and the dynamic signature files? The reason I ask is I have a payload regex in dpd.sig that will match on packets and log. Then, if I start adding to and changing my-proto-protocol.pac (while keeping the arguments that get passed to the event the same), Bro's debug will say it matches on the dpd.sig for my protocol, but it will not produce a log for my protocol. So, I think I'm missing a fundamental part of how Bro processes a packet.
Why does changing my-proto-protocol.pac affect what gets logged? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170208/cea8bc12/attachment.html
From jazoff at illinois.edu Wed Feb 8 12:36:26 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Wed, 8 Feb 2017 20:36:26 +0000 Subject: [Bro-Dev] Packet Signature, Protocol, and Analyzer Relationship In-Reply-To: References: Message-ID: <7673157E-E708-4F28-BF08-556A94E9C3AB@illinois.edu> > On Feb 8, 2017, at 3:26 PM, Justin Oursler wrote: > > Hello, > > I am writing a new analyzer and plugin for a TCP application protocol. Can someone help explain the relationship among the protocol, the analyzer, and the dynamic signature files? Bro either attaches an analyzer to a connection based on the likely port (like 80 for http) or via a signature (/GET.../) so it can find the protocol on non-standard ports. The analyzer can then confirm whether or not it is seeing the protocol it expects. > The reason I ask is I have a payload regex in dpd.sig that will match on packets and log. Which log are you talking about? The dpd.log, or my-protocol.log? > Then, if I start adding to and changing my-proto-protocol.pac (while keeping the arguments that get passed to the event the same), Bro's debug will say it matches on the dpd.sig for my protocol, but it will not produce a log for my protocol. So, I think I'm missing a fundamental part of how Bro processes a packet. Why does changing my-proto-protocol.pac affect what gets logged? Without more information, the most likely explanation is that the change you are making to the .pac file is breaking the analyzer and causing events to no longer be generated and nothing to be logged. -- - Justin Azoff
From jsiwek at illinois.edu Wed Feb 8 22:26:37 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Thu, 9 Feb 2017 06:26:37 +0000 Subject: [Bro-Dev] Improving Bro's main loop Message-ID: Just starting a discussion to take inventory of the current problems with Bro's main loop and ideas for how to improve it. Let's begin with a list of issues (please comment if you have additions): 1) It uses select(), which is the worst polling mechanism. It has an upper limit on the number of fds that can be polled (some OSs are fixed at 1024), and also scales poorly. This is problematic for Bro clusters that have many nodes/peers. 2) Integrating new I/O sources isn't always straightforward from a technical standpoint (e.g. see [1]). I also found that it's difficult to understand the ramifications of any change to the run loop without also digging into esoteric details you may not initially think are related (e.g. I usually had to double-check the internals of I/O or threading systems when making any change to the main loop, which may mean there are basic problems with those abstractions). 3) Bro's time/timers are coupled with I/O. Time does not move forward unless there is an active I/O source. This isn't usually a functional problem for users, but devs occasionally have to hack around it (e.g. unit tests). I think CAF [2] and/or libuv [3] can address these issues: 1) libuv: abstracts whatever polling mechanism is best for the OS you're on. CAF: could allow a more direct actor messaging interface to Broker, and since remote communication takes the bulk of the fds being polled, the remaining fds (e.g. packet sources, etc.)
could be fine to poll in whatever fashion, while the remote communication is then subject to CAF's own multiplexer. 2) Both libuv and CAF use abstractions/models that are shown to work well. I think the actor model, by design, does a better job of encouraging systems that are decoupled and therefore scalable. 3) Both libuv and CAF have mechanisms that could implement timers into the run loop such that they'd work independently of other I/O. libuv may be a quicker, more straightforward path to fixing (1), which is the most critical issue, but it's also the easiest to fix without the aid of a library. Libuv can also replace other misc. code in Bro like async DNS and signal handling, but, while those may be crufty, they aren't frequent sources of pain. Since CAF is a requirement of Broker already and has the most potential to improve/replace parts of Bro's threading system and the way in which Broker is integrated, it may be best in the long term to explore moving things out of Bro's current run loop by making them into actors that use message-passing interfaces and then relying on CAF's own loop. Any thoughts? - Jon [1] http://mailman.icsi.berkeley.edu/pipermail/bro-dev/2015-May/010069.html [2] https://actor-framework.org/ [3] http://docs.libuv.org/en/v1.x/
From robin at icir.org Thu Feb 9 10:02:04 2017 From: robin at icir.org (Robin Sommer) Date: Thu, 9 Feb 2017 10:02:04 -0800 Subject: [Bro-Dev] Improving Bro's main loop In-Reply-To: References: Message-ID: <20170209180204.GF74360@icir.org> Nice summary, I agree with all of the pain points. Without thinking much about solutions yet, a bit of random brainstorming on things to keep in mind when thinking about this: - We need to maintain some predictability in scheduling, in particular with regard to timing/timers. Bro's network time is, by definition, defined through I/O. My gut feeling is that we need to keep the tight coupling there, as otherwise semantics would change quite a bit. - Related, another reason for time playing such an important role in the I/O loop is that Bro needs to process its soonest input first. That's most important for packet sources: if we have packets coming from multiple packet sources, earlier timestamps must be processed before later ones across all of them. - Time is generally complex; we have three different notions of network time actually, all with some different specifics: time during real-time processing, time during offline trace processing, and pseudo-realtime. - I believe we need to maintain the ability to have I/O loops that don't have FDs. - I like the idea of using CAF, including because it's going to be a required dependency anyways in the future. I would also like it conceptually to move I/O to actors, and I'm wondering if even packet sources could go there. However, I can't quite tell if that's feasible given other constraints and how other parts of the system are laid out (including that in the end, everything needs to go back into the main thread before being further processed; at least for the time being). - One of the trickiest parts in the past has been ensuring good performance on a variety of platforms and OS versions. Whatever we do, it'll be important to do quite a bit of test-driving and benchmarking. Let's try to structure the work so that we can get to a prototype quickly that allows for some initial performance validation of the approach taken.
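As a concrete illustration of the time/timer coupling described above (and of the ActiveHTTP hang reported at the top of this archive), here is a minimal, throwaway script sketch of the symptom, not a proposed fix. The script and trace names are made up; it only uses existing built-ins (schedule, network_time(), terminate(), exit_only_after_terminate):

    # timer-demo.bro (hypothetical) -- run as: bro -r some-trace.pcap timer-demo.bro
    redef exit_only_after_terminate = T;

    global do_check: event();
    global scheduled = F;

    event do_check()
        {
        # Only reached if network time actually advances 30 seconds past
        # the first connection, i.e. the trace spans that much time.
        print "timer fired at", network_time();
        terminate();
        }

    event new_connection(c: connection)
        {
        if ( scheduled )
            return;
        scheduled = T;

        # Timers are driven by network time, which comes from packet
        # timestamps. If the trace ends less than 30 seconds after this
        # point, time stops advancing and the timer never expires.
        schedule 30 sec { do_check() };
        }

If the trace runs out before the 30 seconds are up, the scheduled event never fires and Bro just sits there until interrupted, which is exactly the frustration described in the ActiveHTTP thread.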
Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
From jazoff at illinois.edu Thu Feb 9 12:21:09 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Thu, 9 Feb 2017 20:21:09 +0000 Subject: [Bro-Dev] Scaling out bro cluster communication Message-ID: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> I've been thinking about ideas for how to better scale out bro cluster communication and how that would look in scripts. Scaling out sumstats and known hosts/services/certs detection will require script language or bif changes. What I want to make possible is client side load balancing and failover for worker -> manager/datanode communication. I have 2 ideas for how things could work. ## The explicit form, new bifs like: send_event(dest: string, event: any); send_event_hashed(dest: string, hash_key: any, event: any); send_event("datanode", Scan::scan_attempt(scanner, attempt)); send_event_hashed("datanode", scanner, Scan::scan_attempt(scanner, attempt)); ## A super magic awesome implicit form global scan_attempt: event(scanner: addr, attempt: Attempt) &partition_via=func(scanner: addr, attempt: Attempt) { return scanner; } ; The implicit form fits better with how bro currently works, but I think the explicit form would ultimately make cluster aware scripts simpler. The choice hinges on the difference between implicit and explicit communication. Currently all bro cluster communication is implicit: * You send logs to the logger/manager node by calling Log::write * You send notices to the manager by calling NOTICE * You can share data between nodes by marking a container as &synchronized. * You can send data to the manager by redef'ing Cluster::worker2manager_events The last two are what we need to replace/extend. As an example, in my scan.bro I want to send scan attempts up to the manager for correlation, so this means: # define event global scan_attempt: event(scanner: addr, attempt: Attempt); # route it to the manager redef Cluster::worker2manager_events += /Scan::scan_attempt/; # only handle it on the manager @if ( Cluster::local_node_type() == Cluster::MANAGER ) event Scan::scan_attempt(scanner: addr, attempt: Attempt) { add_scan_attempt(scanner, attempt); } @endif and then later in the worker code, finally # raise the event to send it down to the manager.
event Scan::scan_attempt(scanner, attempt); If bro communication was more explicit, the script would just be # define event and handle on all nodes global scan_attempt: event(scanner: addr, attempt: Attempt); event Scan::scan_attempt(scanner: addr, attempt: Attempt) { add_scan_attempt(scanner, attempt); } # send the event directly to the manager node send_event("manager", Scan::scan_attempt(scanner, attempt)); Things like scan detection and known hosts/services tracking are easily partitioned, so if you had two datanodes for analysis: if (hash(scanner) % 2 == 0) send_event("datanode-0", Scan::scan_attempt(scanner, attempt)); else send_event("datanode-1", Scan::scan_attempt(scanner, attempt)); Which would be wrapped in a function: send_event_hashed("datanode", scanner, Scan::scan_attempt(scanner, attempt)); that would handle knowing how many active nodes there are and doing proper consistent hashing/failover, something like this: function send_event_hashed(dest: string, hash_key: any, event: any) { data_nodes = |Cluster::active_nodes[dest]|; # or whatever node = hash(hash_key) % data_nodes; node_name = Cluster::active_nodes[node]$name; send_event(node_name, event); } -- - Justin Azoff
From jsiwek at illinois.edu Thu Feb 9 17:18:43 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Fri, 10 Feb 2017 01:18:43 +0000 Subject: [Bro-Dev] Improving Bro's main loop In-Reply-To: <20170209180204.GF74360@icir.org> References: <20170209180204.GF74360@icir.org> Message-ID: <14A75660-601E-4FA9-B4D4-8ED63F190148@illinois.edu> > On Feb 9, 2017, at 12:02 PM, Robin Sommer wrote: > > - We need to maintain some predictability in scheduling, in > particular with regard to timing/timers. Bro's network time > is, by definition, defined through I/O. My gut feeling is > that we need to keep the tight coupling there, as otherwise > semantics would change quite a bit. > > - Related, another reason for time playing such an important role > in the I/O loop is that Bro needs to process its soonest input > first. That's most important for packet sources: if we have > packets coming from multiple packet sources, earlier timestamps > must be processed before later ones across all of them. > > - Time is generally complex; we have three different notions of > network time actually, all with some different specifics: time > during real-time processing, time during offline trace > processing, and pseudo-realtime. Also not sure to what degree coupling related to time/timers can be reduced, though I think at least an initial refactor of the run loop could be done such that it doesn't change much related to how time currently works. Then maybe later or during the refactor, it will get easier to see what exactly can be improved. > - I believe we need to maintain the ability to have I/O loops that > don't have FDs. Yep, don't think there will be a problem there. > - I like the idea of using CAF, including because it's going to be > a required dependency anyways in the future. I would also like > it conceptually to move I/O to actors, and I'm wondering if even > packet sources could go there. However, I can't quite tell if > that's feasible given other constraints and how other parts of > the system are laid out (including that in the end, everything > needs to go back into the main thread before being further > processed; at least for the time being). I do think even packet sources could get moved into actors. My initial idea for the main loop refactor is for it to be a single actor waiting for "ready for processing"
messages from IOSources, and then for each IOSource to be responsible for its own FD polling (if it needs it). That way, the main loop doesn't care about FDs at all anymore and if an IOSource needs to poll FDs it can just use poll() in its own actor/thread for now (my guess is that most IOSources will just have a single FD to poll anyway or that the polling mechanism isn't a very significant chunk of time for ones that may have more, but the only way to answer that is to actually do the performance testing.) > - One of the trickiest parts in the past has been ensuring good > performance on a variety of platforms and OS versions. Whatever > we do, it'll be important to do quite a bit of test-driving and > benchmarking. Let's try to structure the work so that we can get > to a prototype quickly that allows for some initial performance > validation of the approach taken. Sure. I was also expecting to try and just get something working without any significant overhauling of any of Bro's systems. - Jon
From seth at icir.org Fri Feb 10 06:30:45 2017 From: seth at icir.org (Seth Hall) Date: Fri, 10 Feb 2017 09:30:45 -0500 Subject: [Bro-Dev] Scaling out bro cluster communication In-Reply-To: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> References: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> Message-ID: <657182BE-C2D3-488F-9400-5BDFD80077C2@icir.org> > On Feb 9, 2017, at 3:21 PM, Azoff, Justin S wrote: > > What I want to make possible is client side load balancing and failover for worker -> manager/datanode communication. Yes! Load balancing and failover are great goals for this stuff. > ## A super magic awesome implicit form > > global scan_attempt: event(scanner: addr, attempt: Attempt) > &partition_via=func(scanner: addr, attempt: Attempt) { return scanner; } ; I'm not sure how much I like this model, but I'd need to think about it a bit more still. I agree that on the surface it feels magic and awesome but I'm worried we could get ourselves into situations that aren't easily resolvable with this model. > The implicit form fits better with how bro currently works, but I think the explicit form would ultimately make cluster aware scripts simpler. Agree on both points. > # define event and handle on all nodes > global scan_attempt: event(scanner: addr, attempt: Attempt); > event Scan::scan_attempt(scanner: addr, attempt: Attempt) > { > add_scan_attempt(scanner, attempt); > } > > # send the event directly to the manager node > send_event("manager", Scan::scan_attempt(scanner, attempt)); I do like the look of making this more explicit. The implicit event sharing behavior makes some stuff that feels like it should be easy end up being really difficult. Do you have thoughts on how you'd handle cases like the manager wanting to send an event to all workers or all data nodes? Another thing I think we need to address is making sure this behavior seamlessly falls back if someone isn't running a cluster. Do you expect your idea to do that? I know that in the current programming model, making something cluster aware but still able to work when not on a cluster can make it painful to create the right abstraction.
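To make the broadcast question concrete, here is one possible shape for such a wrapper, sketched against the send_event() bif proposed earlier in this thread (which does not exist in Bro today). Cluster::nodes and Cluster::NodeType come from the existing cluster framework; everything else, including passing an event value around as 'any', is hypothetical:

    # Sketch only, not working Bro code: send_event() is the proposed bif.
    # Deliver an event to every node of a given type, e.g. from the manager
    # to all workers or to all data nodes.
    function send_event_to_type(t: Cluster::NodeType, e: any)
        {
        for ( name in Cluster::nodes )
            {
            if ( Cluster::nodes[name]$node_type == t )
                send_event(name, e);
            }
        }

    # Hypothetical usage from a manager-side script:
    # send_event_to_type(Cluster::WORKER, SomeModule::some_event(args));

The non-cluster fallback could then live in the same place: if Cluster::nodes is empty, or only contains the local node, the wrapper would just raise the event locally instead of going through broker.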
.Seth -- Seth Hall International Computer Science Institute (Bro) because everyone has a network http://www.bro.org/ From robin at icir.org Fri Feb 10 07:40:42 2017 From: robin at icir.org (Robin Sommer) Date: Fri, 10 Feb 2017 07:40:42 -0800 Subject: [Bro-Dev] Improving Bro's main loop In-Reply-To: <14A75660-601E-4FA9-B4D4-8ED63F190148@illinois.edu> References: <20170209180204.GF74360@icir.org> <14A75660-601E-4FA9-B4D4-8ED63F190148@illinois.edu> Message-ID: <20170210154042.GJ1247@icir.org> On Fri, Feb 10, 2017 at 01:18 +0000, you wrote: > if an IOSource needs to poll FDs it can just use poll() in its own > actor/thread for now Yeah, one basic decision we'll have to make is how much logic to move into threads. Conceptually, that's the right thing to do, but we need to make sure the code is thread-safe, and it generally makes development and debugging harder in the future. CAF helps with all of that, but all the legacy code worries me in that regard. That said, the IOSources are pretty much self-contained and probably not very problematic in that way. (But then: having some code needing to be thread-safe, while other parts break every rule in the book in that regard, is also confusing; we have that challenge already with the logging and input frameworks.)) Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From vallentin at icir.org Fri Feb 10 08:49:56 2017 From: vallentin at icir.org (Matthias Vallentin) Date: Fri, 10 Feb 2017 08:49:56 -0800 Subject: [Bro-Dev] Scaling out bro cluster communication In-Reply-To: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> References: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> Message-ID: <20170210164956.GB18262@samurai.icir.org> > What I want to make possible is client side load balancing and > failover for worker -> manager/datanode communication. This is an important part of future Bro deployments. Before delving into script code, I would like to get a better understanding of the underlying concepts and communication patterns. Once we have a clear picture what workloads we need to support, we can make architectural choices. Finally, the API falls out at the end. Concretely: can you describe (without Bro script code) what "client-side load-balancing and failover" means? Who is the client and what state needs to be resilient to failure? I don't think we have a working definition of "data node" either. My hunch is that they are involved in MapReduce computation and perhaps represent the reducers, but I'm not sure. Matthias From jazoff at illinois.edu Fri Feb 10 08:53:16 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Fri, 10 Feb 2017 16:53:16 +0000 Subject: [Bro-Dev] Scaling out bro cluster communication In-Reply-To: <657182BE-C2D3-488F-9400-5BDFD80077C2@icir.org> References: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> <657182BE-C2D3-488F-9400-5BDFD80077C2@icir.org> Message-ID: <5DB0029A-AD65-4184-9655-2FFF3090E9DC@illinois.edu> > On Feb 10, 2017, at 9:30 AM, Seth Hall wrote: > > >> On Feb 9, 2017, at 3:21 PM, Azoff, Justin S wrote: >> # define event and handle on all nodes >> global scan_attempt: event(scanner: addr, attempt: Attempt); >> event Scan::scan_attempt(scanner: addr, attempt: Attempt) >> { >> add_scan_attempt(scanner, attempt); >> } >> >> # send the event directly to the manager node >> send_event("manager", Scan::scan_attempt(scanner, attempt)); > > I do like the look of making this more explicit. 
The implicit event sharing behavior makes some stuff that feels like it should be easy end up being really difficult. Do you have thoughts on how you'd handle cases like the manager wanting to send an event to all workers or all data nodes? Hmm, perhaps there would be multiple functions: * One for sending an event to all nodes of a type * One for sending an event to a specific node * One for sending an event to one type of node based on a hash function Currently bro only does the first one (though having only one manager or data node means that events sent to data nodes only go to one). Not being able to send events directly to an individual node also prevents bro scripts from doing RPC-type queries. A worker can send the manager a query, but the manager can only raise a reply event that is sent to all workers. > Another thing I think we need to address is making sure this behavior seamlessly falls back if someone isn't running a cluster. Do you expect your idea to do that? I know that in the current programming model, making something cluster aware but still able to work when not on a cluster can make it painful to create the right abstraction. > > .Seth For falling back, if send_event("manager", Scan::scan_attempt(scanner, attempt)); was run on the manager node, it could skip broker and just raise the event locally. Currently bro has cluster-specific code in intel, netcontrol, notice, openflow, packet-filter, sumstats... so the current event system doesn't always just magically work on a cluster, and I don't think explicit send_event functions would change that at all. Plus, I'm not even sure if special-casing a non-cluster makes sense anymore. For example, scan detection on a single node doesn't need to do any cluster communication; it can just manage everything locally. But the code that handles scan detection is extremely simple: it consumes scan_attempt events and raises notices. What if a dedicated actor thread was started to handle the scan_attempt event? Then the code could do something like send_event("scan_aggregator", Scan::scan_attempt(scanner, attempt)); which, even on a single-process instance, could distribute the event to a thread dedicated to handling this work. -- - Justin Azoff
From vlad at grigorescu.org Fri Feb 10 09:51:08 2017 From: vlad at grigorescu.org (Vlad Grigorescu) Date: Fri, 10 Feb 2017 11:51:08 -0600 Subject: [Bro-Dev] Splitting up init-bare? Message-ID: What do people think about splitting up portions of init-bare into separate files, and having init-bare simply @load those files? Right now, it's a 4500+ line script that keeps growing, and it commonly results in conflicts. For the protocols, I could see having a file such as protocols/kerberos/bare.bro which defines the appropriate types which are currently in init-bare. --Vlad -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170210/14b3a2ae/attachment.html
From jazoff at illinois.edu Fri Feb 10 09:52:48 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Fri, 10 Feb 2017 17:52:48 +0000 Subject: [Bro-Dev] Scaling out bro cluster communication In-Reply-To: <20170210164956.GB18262@samurai.icir.org> References: <1FFA7E0D-A383-4E96-8C6A-5406B73414B4@illinois.edu> <20170210164956.GB18262@samurai.icir.org> Message-ID: <79E2D8B3-AA8D-4018-A172-4CB91DDEB13C@illinois.edu> > On Feb 10, 2017, at 11:49 AM, Matthias Vallentin wrote: > > Concretely: can you describe (without Bro script code) what "client-side > load-balancing and failover" means?
Who is the client and what state > needs to be resilient to failure? I don't think we have a working > definition of "data node" either. My hunch is that they are involved in > MapReduce computation and perhaps represent the reducers, but I'm not > sure. > > Matthias Yes... exactly like reducers. In this case, the clients are the workers and the servers are the manager/logger/datanode. I want to send events containing data up to data nodes so they can be aggregated, but I don't want the data node to be a single point of failure or bottleneck. Scan detection doesn't require coordination. The data just needs to be partitioned by source address. This also applies for: * Known hosts (partition on host) * Known services (partition on host or host+service) * Known certs (partition on cert hash) * Intel (partition on seen value) * Notices (partition on identifier) * DHCP (partition on mac address) As far as state goes, the data nodes COULD replicate their state to the other data nodes, but that's a whole separate issue. Initially the goal would just be to be able to fail over from one data node to the next in the case of an outage. State on that data node would be lost if it wasn't replicated, but new work would be able to be performed instead of the system grinding to a halt. -- - Justin Azoff
From johanna at icir.org Fri Feb 10 10:03:50 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 10 Feb 2017 10:03:50 -0800 Subject: [Bro-Dev] Splitting up init-bare? In-Reply-To: References: Message-ID: <20170210180350.w4wnuflg2qknomwj@wifi107.sys.ICSI.Berkeley.EDU> On Fri, Feb 10, 2017 at 11:51:08AM -0600, Vlad Grigorescu wrote: > What do people think about splitting up portions of init-bare into separate > files, and having init-bare simply @load those files? Right now, it's a > 4500+ line script that keeps growing, and it commonly results in conflicts. > > For the protocols, I could see having a file such as > protocols/kerberos/bare.bro which defines the appropriate types which are > currently in init-bare. That sounds like a good idea - I am not a big fan of the fact that a lot of the protocol dependent datatypes are in init-bare currently. I am just not sure if it might require a lot of fiddling to get the load order of things correct; but assuming that every protocol has a bare.bro (or perhaps a datatypes.bro or similar?), that should not be a huge issue. Johanna
From robin at icir.org Fri Feb 10 19:13:52 2017 From: robin at icir.org (Robin Sommer) Date: Fri, 10 Feb 2017 19:13:52 -0800 Subject: [Bro-Dev] Splitting up init-bare? In-Reply-To: References: Message-ID: <20170211031352.GE1000@icir.org> On Fri, Feb 10, 2017 at 11:51 -0600, you wrote: > What do people think about splitting up portions of init-bare into separate > files Yeah, I can see that. It would be nice, though, if init-bare.bro wouldn't then need lots of @load statements to refer to the individual files. Maybe we could add some automatic way instead, like calling the files __bare__.bro and have Bro find them automatically (but to Johanna's point, not sure if that would cause load-order problems). Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
From seth at icir.org Sat Feb 11 21:20:04 2017 From: seth at icir.org (Seth Hall) Date: Sun, 12 Feb 2017 00:20:04 -0500 Subject: [Bro-Dev] Splitting up init-bare?
In-Reply-To: <20170210180350.w4wnuflg2qknomwj@wifi107.sys.ICSI.Berkeley.EDU> References: <20170210180350.w4wnuflg2qknomwj@wifi107.sys.ICSI.Berkeley.EDU> Message-ID: <237D333C-7099-4013-86F7-A71271D15F4F@icir.org> > On Feb 10, 2017, at 1:03 PM, Johanna Amann wrote: > >> For the protocols, I could see having a file such as >> protocols/kerberos/bare.bro which defines the appropriate types which are >> currently in init-bare. > > That sounds like a good idea - I am not a big fan of the fact that a lot > of the protocol dependent datatypes are in init-bare currently. If we started structuring the analyzers internally more like external plugins with the scripts and everything in them, it would feel more comfortable to me. It seems like we'd be able to keep all of a protocol's ephemera tied closely with it. Would that work? I know that internal and external plugins have some differences, but I don't know if that means we're limited a bit in how we handle the script-land data structures required for analyzers. .Seth -- Seth Hall International Computer Science Institute (Bro) because everyone has a network http://www.bro.org/
From robin at icir.org Sun Feb 12 11:25:42 2017 From: robin at icir.org (Robin Sommer) Date: Sun, 12 Feb 2017 11:25:42 -0800 Subject: [Bro-Dev] Splitting up init-bare? In-Reply-To: <237D333C-7099-4013-86F7-A71271D15F4F@icir.org> References: <20170210180350.w4wnuflg2qknomwj@wifi107.sys.ICSI.Berkeley.EDU> <237D333C-7099-4013-86F7-A71271D15F4F@icir.org> Message-ID: <20170212192542.GA2240@icir.org> On Sun, Feb 12, 2017 at 00:20 -0500, you wrote: > Would that work? I know that internal and external plugins have some > differences, but I don't know if that means we're limited a bit in > how we handle the script-land data structures required for analyzers. For init-bare-style initialization code I was thinking the same, and that's also partially where my __bare__.bro idea came from (actually __init__.bro would be nicer I'm thinking now). I went back to the plugin structure to see if we have the right mechanism there already, but they work slightly differently in terms of when required data structures get initialized. But we could make those __init__.bro scripts work in either case I think. For a more general reorganization of moving scripts and code together, I'm still torn on that. I like that in theory, but haven't convinced myself yet that I'd like it in practice. :) Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
From sbaker.inbox at gmail.com Tue Feb 21 13:18:33 2017 From: sbaker.inbox at gmail.com (Stephen Baker) Date: Tue, 21 Feb 2017 15:18:33 -0600 Subject: [Bro-Dev] Bro HTTP midstream inspection Message-ID: All, When running Bro, I see a lot of midstream sessions due to long-lived TCP connections that were established before starting Bro. The Bro conn state is correctly "OTH", but I would like to inspect the streams that are in progress. Is there a recommended way to process midstream TCP with Bro? For a test, I modified HTTP_Analyzer::DeliverStream to allow midstream inspection. if ( TCP() && TCP()->IsPartial() ) - return; - + { + if ( allow_midstream_pickup ) + { + Weird("Processing in midstream_client_HTTP_data"); + } + else + { + return; + } + } Are there any issues with a change similar to this for HTTP? I would expect that not all HTTP logs would be properly filled out for a connection that was already established, and possibly some weird log entries about the HTTP headers.
The changes does allow the logging of HTTP transactions on existing TCP connection with no issues so far doing testing. I just want to make sure that a better way to deal with existing connections or reasons why Bro should not look at HTTP in midstream. Thanks, Stephen -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170221/807891a9/attachment.html From robin at icir.org Thu Feb 23 10:28:21 2017 From: robin at icir.org (Robin Sommer) Date: Thu, 23 Feb 2017 10:28:21 -0800 Subject: [Bro-Dev] Going to delete old branches Message-ID: <20170223182821.GD52561@icir.org> I'm going to delete a bunch of old branches that are fully merged into master. Below is the list for reference, please shout if you see system that shouldn't be deleted ... Robin %%%%%% bro topic/dnthayer/doc-fixes-updates topic/dnthayer/local-logger topic/dnthayer/ticket1472 topic/dnthayer/ticket1516 topic/dnthayer/ticket1616 topic/dnthayer/ticket1690 topic/dnthayer/ticket1719 topic/dnthayer/ticket1720 topic/dnthayer/ticket1731 topic/dnthayer/ticket1750 topic/dnthayer/ticket1757 topic/dnthayer/ticket1788 topic/jazoff/bit-1649 topic/jazoff/ticket-1670 topic/johanna/bit-1181 topic/johanna/bit-1325 topic/johanna/bit-1578 topic/johanna/bit-1612 topic/johanna/bit-1619 topic/johanna/bit-1644 topic/johanna/bit-1651 topic/johanna/bit-1683 topic/johanna/bit-1691 topic/johanna/component-initialization-order topic/johanna/freebsd-clang topic/johanna/gcc-6.2.1 topic/johanna/l2flip topic/johanna/leaks topic/johanna/no-xml topic/johanna/ocsp-validate-fix topic/johanna/rawleak topic/johanna/remove-z topic/johanna/rule-reasons topic/johanna/tls13 topic/johanna/version topic/johanna/windows-newlines topic/johanna/xmpp-ns topic/jsiwek/bit-1785 topic/robin/bit-1612-merge topic/robin/bit-1641 topic/robin/bit-1654 topic/robin/broxygen-plugin-warnings topic/robin/file-analysis-fixes topic/robin/sig-fixes topic/seth/BIT-1480 topic/seth/krb5-ticket-tracking-merge topic/seth/radius-script-refactor topic/vladg/bit-1641 topic/vladg/bit-1671 topic/vladg/krb5-ticket-tracking %%%%%% bro/master/aux/binpac topic/jsiwek/bit-1343 topic/jsiwek/bit-1361 topic/jsiwek/snmp topic/seth/compiler-cleanup topic/vladg/case_fallthrough %%%%%% bro/master/aux/binpac/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/bro-aux topic/dnthayer/ticket1209 topic/dnthayer/ticket1215 topic/dnthayer/ticket1621 topic/dnthayer/ticket856 topic/jazoff/ticket1436 topic/robin/dynamic-plugins topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins %%%%%% bro/master/aux/bro-aux/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broccoli topic/dnthayer/ticket1557 topic/jsiwek/broccoli-vectors topic/jsiwek/broxygen topic/jsiwek/misc-fixes topic/jsiwek/remove-val-attribs %%%%%% bro/master/aux/broccoli/bindings/broccoli-python topic/dnthayer/ticket1711 %%%%%% 
bro/master/aux/broccoli/bindings/broccoli-python/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broccoli/bindings/broccoli-ruby %%%%%% bro/master/aux/broccoli/bindings/broccoli-ruby/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broccoli/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broctl topic/dnthayer/doc-fixes-2.5 topic/dnthayer/local-logger topic/dnthayer/ticket1516 topic/dnthayer/ticket1676 topic/dnthayer/ticket1677 topic/dnthayer/ticket1682 topic/dnthayer/ticket1694 topic/dnthayer/ticket1726 topic/dnthayer/ticket1742 topic/dnthayer/ticket1772 topic/dnthayer/ticket1778 %%%%%% bro/master/aux/broctl/aux/capstats topic/dnthayer/ticket1774 %%%%%% bro/master/aux/broctl/aux/capstats/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broctl/aux/pysubnettree topic/dnthayer/ticket1303 topic/dnthayer/ticket1516 topic/dnthayer/ticket1710 topic/jsiwek/fix-ipv6 topic/python3-compat %%%%%% bro/master/aux/broctl/aux/pysubnettree/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 tpic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broctl/aux/trace-summary topic/dnthayer/ticket1297 topic/dnthayer/ticket1304 topic/dnthayer/ticket1571 topic/dnthayer/ticket1724 topic/dnthayer/ticket1730 topic/dnthayer/ticket1749 topic/dnthayer/ticket856 %%%%%% bro/master/aux/broctl/aux/trace-summary/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broctl/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/broker topic/python3-fix %%%%%% bro/master/aux/broker/cmake topic/dnthayer/ticket1516 
topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/aux/btest topic/dnthayer/max-lines topic/dnthayer/mktemp topic/dnthayer/py3-compat topic/dnthayer/ticket1322 topic/dnthayer/ticket1722 topic/dnthayer/ticket1750 topic/dnthayer/ticket862 topic/robin/timing %%%%%% bro/master/aux/netcontrol-connectors %%%%%% bro/master/aux/plugins topic/dnthayer/doc-improvements-2.4 topic/dnthayer/fix-typos topic/dnthayer/ticket1536 topic/johanna/postgres topic/robin/netmap topic/robin/plugin-updates topic/robin/rework-packets-merge topic/robin/tcprs-merge-again topic/vladg/es-fixes %%%%%% bro/master/cmake topic/dnthayer/ticket1516 topic/dnthayer/ticket1733 topic/dnthayer/ticket1734 topic/jsiwek/bif-loader-scripts topic/jsiwek/broker topic/jsiwek/homebrew-openssl topic/jsiwek/jemalloc topic/robin/dynamic-plugins-2.3 topic/robin/pktsrc topic/robin/plugin-updates topic/robin/reader-writer-plugins topic/vladg/homebrew-openssl %%%%%% bro/master/src/3rdparty topic/jsiwek/file-signatures topic/jsiwek/libmagic-integration topic/jsiwek/new-libmagic -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin