From vallentin at icir.org Mon Jan 2 11:46:27 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Mon, 2 Jan 2017 11:46:27 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20161216175305.GA63829@icir.org>
References: <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20161216175305.GA63829@icir.org>
Message-ID: <20170102194627.GE2037@ninja.local>

> Alternatively, we could leave it to frameworks to define their own
> error types. So for Broker, we'd have Broker::NotFound,
> Broker::Timeout, etc. And the opaque types would define internally
> what they convert to, and how.

It looks like this is the model you went with in the revised proposal.
For better modularity, that's also the model I'd prefer.

> >     if ( status(v) == Broker::SUCCESS )
>
> Thinking more about this, I kind of like this version actually, and
> have for now included that into the proposal. Curious to hear what
> others think about this.

It would be an easy solution actually.

I'm not sure if it's a typo or not, but your code example has a new,
so-far not discussed form of type casting: T to x. For example:

    case bool to b:    # Make the boolean available as "b".
        print "bool", b;
        break;

That would mean we'd have new keywords: is, to, as. I find that's too
confusing and think we should go with either "to" or "as" but not both.
Matthias

From vallentin at icir.org Mon Jan 2 12:00:36 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Mon, 2 Jan 2017 12:00:36 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To:
References: <694E3B3D-5216-4EF9-8B4F-7FD00864FBC8@illinois.edu> <20161206111705.GE90069@ninja.local> <20161208164919.GX60813@icir.org> <20161211114906.GN26945@ninja.local> <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213120223.GZ26945@ninja.local> <4C73FBE9-1143-42D6-A17B-9FC483248A83@illinois.edu> <20161213174210.GD26945@ninja.local>
Message-ID: <20170102200036.GF2037@ninja.local>

On Wed, Dec 14, 2016 at 04:17:26PM +0000, Siwek, Jon wrote:
>
> > On Dec 13, 2016, at 11:42 AM, Matthias Vallentin wrote:
> >
> >>> local r = put(store, key, test(lookup(store, key)));
> >
> > It's up to the user to check the result variable (here: r) and decide
> > what to do: abort, retry, continue, or report an error.
>
> The thing that got me about that for this particular example was that
> I can't distinguish whether the "lookup" or the "put" failed, which
> might be important since the "test" operation is between them and I
> may or may not want "test" to happen in the retry attempt depending on
> what exactly failed.

Yes, in this case that's not possible. Though nothing speaks against
tearing the expression apart if you need that distinction. It might not
have been the best example, but I wanted to illustrate the semantics of
composition of asynchronous functions.

Overall, Bro is an asynchronous language with imperative building
blocks. Especially with this inherent (and now increasing) asynchrony, I
think it's important to look at established concepts from functional
paradigms that demonstrate the utility of composable primitives.
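
The trade-off between the composed and the torn-apart form can be
sketched in C++ as a synchronous analogy. This is not Bro script and not
the proposed async semantics: store_type, lookup, test, put,
apply_if_present, and stage are hypothetical names chosen for
illustration only.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical in-memory stand-in for a data store; not Broker's API.
using store_type = std::map<std::string, int>;

// Returns the value stored under `key`, or nothing if the key is absent.
std::optional<int> lookup(const store_type& store, const std::string& key) {
  auto it = store.find(key);
  if (it == store.end())
    return std::nullopt;
  return it->second;
}

// Hypothetical transformation applied between lookup and put.
int test(int x) { return x + 1; }

// Stores `value` under `key`; reports failure if there is no value to store.
bool put(store_type& store, const std::string& key, std::optional<int> value) {
  if (!value)
    return false;
  store[key] = *value;
  return true;
}

// Applies `f` to the contained value and passes an empty input through.
template <class F>
std::optional<int> apply_if_present(std::optional<int> x, F f) {
  if (!x)
    return std::nullopt;
  return f(*x);
}

// Composed form: a single result, but the failing stage is not visible.
bool composed(store_type& store, const std::string& key) {
  return put(store, key, apply_if_present(lookup(store, key), test));
}

enum class stage { none, lookup, put };

// Torn-apart form: the caller learns which stage failed.
stage stepwise(store_type& store, const std::string& key) {
  auto v = lookup(store, key);
  if (!v)
    return stage::lookup;
  if (!put(store, key, test(*v)))
    return stage::put;
  return stage::none;
}
```

In the composed form a failed lookup simply propagates through test and
put, which is convenient but hides the origin of the failure; the
stepwise form trades brevity for that information.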
Matthias

From vallentin at icir.org Mon Jan 2 13:07:21 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Mon, 2 Jan 2017 13:07:21 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
Message-ID: <20170102210721.GG2037@ninja.local>

Broker's current API to receive messages is as follows:

    context ctx;
    auto ep = ctx.spawn();
    ep.receive([&](const topic& t, const data& x) { .. });
    ep.receive([&](const status& s) { .. });

or the last two in one call:

    ep.receive(
      [&](const topic& t, const data& x) { .. },
      [&](const status& s) { .. }
    );

The idea behind this API is that it's similar to the non-blocking
endpoint API:

    auto ep = ctx.spawn();
    ep.subscribe([=](const topic& t, const data& x) { .. });
    ep.subscribe([=](const status& s) { .. });

Non-blocking endpoints should be the default, because they are more
efficient due to the absence of blocking. For simplicity, the current
API also provides a non-lambda overload of receive:

    auto ep = ctx.spawn();
    auto msg = ep.receive();
    std::cout << msg.topic() << " -> " << msg.data() << std::endl;

Users can also check whether the mailbox of the blocking endpoint
contains a message:

    // Only block if we know that we have a message.
    if (!ep.mailbox().empty())
      auto msg = ep.receive();

What I haven't considered up to now is the interaction of data and
status messages in the blocking API. Both broker::message and
broker::status are messages that linger in the endpoint's mailbox. I
find the terminology confusing, because a status instance is technically
also a message. I'd rather speak of "data messages" and "status
messages" as opposed to "messages" and "statuses". But more about the
terminology later.

There's a problem with the snippet above. If the mailbox is non-empty
because it contains a status message, the following call to receive()
would hang, because it expects a data message. The only safe solution
would be to use this form:

    if (!ep.mailbox().empty())
      ep.receive(
        [&](const topic& t, const data& x) { .. },
        [&](const status& s) { .. }
      );

The problem lies in the receive() function that returns a message. It
doesn't match the current architecture (a blocking endpoint has a single
mailbox) and is not a safe API for users.

Here are some solutions I could think of:

  (1) Let receive() return a variant instead, because
      the caller cannot know a priori what to expect. While simple to
      call, it burdens the user with type-based dispatching afterwards.

  (2) Specify the type of message a user wants to receive, e.g.,

        auto x = ep.receive();
        auto y = ep.receive();

      Here, I don't like the terminology issues I mentioned above. More
      reasonable could be

        auto x = ep.receive();
        auto y = ep.receive();

      where x could have type data_message with .topic() and .data(),
      and y be a direct instance of type status.

      But because callers don't know whether they'll receive a status
      or data message, this solution is only an incremental
      improvement.

  (3) Equip blocking endpoints with separate mailboxes for data and
      status messages. In combination with (2), this could lead to
      something like:

        if (!ep.mailbox().empty())
          auto msg = ep.receive();

        if (!ep.mailbox().empty())
          auto s = ep.receive();

      But now users have to keep track of two mailboxes, which is more
      error-prone and verbose.

  (4) Drop status messages unless the user explicitly asks for them.
      Do not consider them when working with an endpoint's mailbox,
      which only covers data messages.

      While keeping the semantics of ep.receive() simple, it's not
      clear how to poll for status messages. Should they just
      accumulate in the endpoint and be queryable? E.g.,:

        // Bounded? Infinite?
        const std::vector& xs = ep.statuses();

Ultimately, I think it's important to have a consistent API for blocking
and non-blocking endpoints. Any thoughts on how to move forward would
be appreciated.
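
Option (1) could look roughly like the following C++17 sketch. It only
illustrates the type-based dispatching the caller ends up with:
data_message, status_message, mailbox_item, and this blocking_endpoint
are hypothetical stand-ins rather than Broker's actual types, and
receive() pops from a plain queue instead of blocking.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-ins; Broker's real message and status types differ.
struct data_message {
  std::string topic;
  std::string data;
};

struct status_message {
  std::string description;
};

using mailbox_item = std::variant<data_message, status_message>;

// Minimal endpoint with a single mailbox holding both kinds of items.
class blocking_endpoint {
public:
  void enqueue(mailbox_item x) { mailbox_.push_back(std::move(x)); }

  bool empty() const { return mailbox_.empty(); }

  // Precondition: !empty(). A real blocking endpoint would wait instead.
  mailbox_item receive() {
    assert(!mailbox_.empty());
    mailbox_item x = std::move(mailbox_.front());
    mailbox_.pop_front();
    return x;
  }

private:
  std::deque<mailbox_item> mailbox_;
};

// Helper that builds one visitor out of several lambdas.
template <class... Fs>
struct overloaded : Fs... {
  using Fs::operator()...;
};
template <class... Fs>
overloaded(Fs...) -> overloaded<Fs...>;

// The type-based dispatch that option (1) leaves to the caller.
std::string describe(const mailbox_item& x) {
  return std::visit(
    overloaded{
      [](const data_message& m) { return m.topic + " -> " + m.data; },
      [](const status_message& s) { return "status: " + s.description; }
    },
    x);
}
```

Because receive() returns whatever is next in the mailbox, a non-empty
mailbox can no longer cause a hang; the cost is the visitor boilerplate
at every call site.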
Matthias

From jan.grashoefer at gmail.com Tue Jan 3 01:21:14 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Tue, 3 Jan 2017 10:21:14 +0100
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20161216154654.GE31505@icir.org>
References: <20161213120223.GZ26945@ninja.local> <20161213102811.GY26945@ninja.local> <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org>
Message-ID:

> We could also make the two different return values explicit:
>
>     [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]
>
>     if ( result == Broker::SUCCESS ) ...

I would prefer this solution, as it feels more natural coming from other
languages like python. Introducing new keywords/magic functions like
status() might be conceptually elegant from the perspective of the Bro
language itself but makes it more difficult to learn the language and to
understand code.

Jan

From vallentin at icir.org Tue Jan 3 07:53:21 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Tue, 3 Jan 2017 07:53:21 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To:
References: <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org>
Message-ID: <20170103155321.GB42337@ninja.local>

> >     [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]
> >
> >     if ( result == Broker::SUCCESS ) ...
>
> I would prefer this solution, as it feels more natural coming from other
> languages like python.
I like it as well because there's no call to status() and the concrete
types of the return variables are irrelevant to the user. (Not only
scripting languages but also C++17 finally introduces "structured
bindings" to support this form of local tuple binding.)

The key question is whether it will only be a one-off or fits more
broadly into the language as part of first-class tuple support.

Matthias

From jan.grashoefer at gmail.com Tue Jan 3 08:22:32 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Tue, 3 Jan 2017 17:22:32 +0100
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20170103155321.GB42337@ninja.local>
References: <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20170103155321.GB42337@ninja.local>
Message-ID: <7e0a804a-d0d2-a6e2-7caa-c47eb9bc999a@gmail.com>

>>>     [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]
>>>
>>>     if ( result == Broker::SUCCESS ) ...
>>
>> I would prefer this solution, as it feels more natural coming from other
>> languages like python.
>
> The key question is whether it will only be a one-off or fits more
> broadly into the language as part of first-class tuple support.

Something similar is already available in the context of expiration
callbacks for tables using multiple indices (see e.g.,
https://github.com/bro/bro/blob/master/scripts/base/frameworks/intel/main.bro#L258).
Introducing tuples would probably allow us to get rid of using the any
type at this point. Maybe there are also some scripts whose readability
can be improved by using tuples.
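
For comparison, the two-value return form quoted above has a direct
counterpart in the C++17 structured bindings mentioned earlier in the
thread. The sketch below is only an analogy to the Bro-side proposal:
the result enum, lookup, and lookup_or are hypothetical names, and a
std::map stands in for a store handle.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical result code, loosely mirroring Broker::SUCCESS in the quote.
enum class result { success, not_found };

// Hypothetical lookup that returns a (result, value) pair.
std::pair<result, std::string>
lookup(const std::map<int, std::string>& store, int key) {
  auto it = store.find(key);
  if (it == store.end())
    return {result::not_found, std::string{}};
  return {result::success, it->second};
}

// Caller side: bind both return values at once, then check the result first.
std::string lookup_or(const std::map<int, std::string>& store, int key,
                      const std::string& fallback) {
  auto [res, value] = lookup(store, key);
  if (res == result::success)
    return value;
  return fallback;
}
```

The caller never names the concrete type of either return value, which
is the property highlighted in the discussion.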
Jan

From robin at icir.org Tue Jan 3 08:47:46 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 3 Jan 2017 08:47:46 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20170102194627.GE2037@ninja.local>
References: <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20161216175305.GA63829@icir.org> <20170102194627.GE2037@ninja.local>
Message-ID: <20170103164746.GE20377@icir.org>

On Mon, Jan 02, 2017 at 11:46 -0800, you wrote:

> That would mean we'd have new keywords: is, to, as. I find that's too
> confusing and think we should go with either "to" or "as" but not both.

Yeah, I agree, don't like that version anymore either. I have just
committed a first implementation of the type-based switch that uses
this syntax instead:

    case type string:
        ....

    case type count as c:
        ....

What do you think of that? The additional "type" makes it visually
clear what it's about, and also helps the parser figure it out.

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From robin at icir.org Tue Jan 3 08:50:36 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 3 Jan 2017 08:50:36 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To:
References: <70454DF2-AF93-434D-B78D-0E7080B4B40D@illinois.edu> <20161213185255.GJ940@icir.org> <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org>
Message-ID: <20170103165036.GF20377@icir.org>

On Tue, Jan 03, 2017 at 10:21 +0100, you wrote:

> >     [result, value] = Broker::lookup(h, 42) # Returns [Broker::Result, opaque of Broker::Data]
> >     if ( result == Broker::SUCCESS ) ...

> I would prefer this solution, as it feels more natural coming from other
> languages like python. Introducing new keywords/magic functions like
> status()

Actually it's the opposite: status() wouldn't need anything new, that's
part of the appeal there. It would just be a bif (although probably
called Broker::status()) that takes the Broker opaque and queries its
most recent state.

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From robin at icir.org Tue Jan 3 08:55:04 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 3 Jan 2017 08:55:04 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <7e0a804a-d0d2-a6e2-7caa-c47eb9bc999a@gmail.com>
References: <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20170103155321.GB42337@ninja.local> <7e0a804a-d0d2-a6e2-7caa-c47eb9bc999a@gmail.com>
Message-ID: <20170103165504.GG20377@icir.org>

On Tue, Jan 03, 2017 at 17:22 +0100, you wrote:

> Something similar is already available in context of expiration
> callbacks for tables using multiple indices

Yeah, although that's indeed a one-off that would be hard to avoid
doing for more cases (for-loop over tables is another one). I agree
that if we had full tuple support, that would be the way to go here;
and it would generally be a very nice extension of the language.
However, introducing tuples is a major piece by itself, and I'm
reluctant to have the Broker changes depend on that.

We could go the Broker::status() route for now and switch over to
tuples later if/when we get them ...
Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Tue Jan 3 10:32:04 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Tue, 3 Jan 2017 10:32:04 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20170103164746.GE20377@icir.org>
References: <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20161216175305.GA63829@icir.org> <20170102194627.GE2037@ninja.local> <20170103164746.GE20377@icir.org>
Message-ID: <20170103183204.GD42337@ninja.local>

> Yeah, I agree, don't like that version anymore either.

Ok, good. :-)

>     case type count as c:
>         ....
>
>
> What do you think of that? The additional "type" makes it visually
> clear what's it's about, and also helps the parser figure it out.

I find that there's too much going on in a single line now. With the
extra "type" keyword, my mind gets stuck figuring out precedence rules.

What I haven't considered until now is that in the current proposal,
"as" can occur in two forms:

    local x = async Broker::lookup(..);
    local c = x as count;

and

    case type count as c:

It would be great if LHS and RHS are consistent, ideally LHS always an
identifier and RHS a type, which reads most naturally. For "switch"
that would imply:

    case c as count:

Is that any easier for the parser?
Matthias

From jan.grashoefer at gmail.com Tue Jan 3 14:25:50 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Tue, 3 Jan 2017 23:25:50 +0100
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20170103183204.GD42337@ninja.local>
References: <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20161216175305.GA63829@icir.org> <20170102194627.GE2037@ninja.local> <20170103164746.GE20377@icir.org> <20170103183204.GD42337@ninja.local>
Message-ID: <5588f1fc-713c-987e-2bc8-2f30f1d10825@gmail.com>

>>     case type count as c:
>>         ....
>
> I find that there's too much going on in a single line now. With the
> extra "type" keyword, my mind gets stuck figuring out precedence rules.

I agree, it's hard to read and too much to type from my perspective.

> For "switch" that would imply:
>
>     case c as count:
>
> Is that any easier for the parser?

If not, would just

    type count as c:

do the job? Although it mixes LHS and RHS again, it seems intuitive to
me. In the end, it would be a tradeoff between ordering consistency and
a third keyword (like "->" or "in").

Jan

From jan.grashoefer at gmail.com Tue Jan 3 14:41:38 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Tue, 3 Jan 2017 23:41:38 +0100
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <20170103165504.GG20377@icir.org>
References: <20161213195139.GG26945@ninja.local> <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20170103155321.GB42337@ninja.local> <7e0a804a-d0d2-a6e2-7caa-c47eb9bc999a@gmail.com> <20170103165504.GG20377@icir.org>
Message-ID: <31fcd5e6-ecc2-eeeb-7385-3bddf864ea18@gmail.com>

> However, introducing tuples is a major piece by itself, and I'm
> reluctant to have the Broker changes depend on that.
>
> We could go the Broker::status() route for now and switch over to
> tuples later if/when we get them ...

Seems reasonable to me. However, I thought adding tuples would be
relatively straightforward, given that there are already some similar
structures in use. Is there a major show-stopper I am missing or is it
just the fact that tuple support would be a new/untested language
feature?

Jan

From robin at icir.org Tue Jan 3 15:59:36 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 3 Jan 2017 15:59:36 -0800
Subject: [Bro-Dev] [Proposal] Language extensions for better Broker support
In-Reply-To: <31fcd5e6-ecc2-eeeb-7385-3bddf864ea18@gmail.com>
References: <20161214152728.GB44048@icir.org> <20161216010137.GE14434@icir.org> <20161216154654.GE31505@icir.org> <20170103155321.GB42337@ninja.local> <7e0a804a-d0d2-a6e2-7caa-c47eb9bc999a@gmail.com> <20170103165504.GG20377@icir.org> <31fcd5e6-ecc2-eeeb-7385-3bddf864ea18@gmail.com>
Message-ID: <20170103235936.GD887@icir.org>

On Tue, Jan 03, 2017 at 23:41 +0100, you wrote:

> Is there a major show-stopper I am missing or is it just the fact that
> tuple support would be a new/untested language feature?

We'd need to hook them into a number of places across the interpreter.
For example there are various locations where types coerce
automatically on assignment & parameter passing; tuples would need to
do the right thing there (and recursively). Also, the current ad-hoc
tuple-like constructs would need to be adapted, and maybe some more
similar constructs added elsewhere for added flexibility. It's not
rocket science, it just needs somebody to take it on as a separate
project I think.

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Wed Jan 4 10:20:01 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Wed, 4 Jan 2017 10:20:01 -0800
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/robin/dynamic-cast: Add experimental "is" and "as" operators. (dabe125)
In-Reply-To: <201701031633.v03GXq7S017723@bro-ids.icir.org>
References: <201701031633.v03GXq7S017723@bro-ids.icir.org>
Message-ID: <20170104182001.GB543@ninja.local>

> function check(a: any)
>     {
>     local s: string = "default";
>
>     if ( a is string )
>         s = (a as string);

Are the parentheses around the expression required? Intuitively,
operator "as" should have higher precedence.

Matthias

From robin at icir.org Wed Jan 4 12:48:50 2017
From: robin at icir.org (Robin Sommer)
Date: Wed, 4 Jan 2017 12:48:50 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170102210721.GG2037@ninja.local>
References: <20170102210721.GG2037@ninja.local>
Message-ID: <20170104204850.GA89520@icir.org>

On Mon, Jan 02, 2017 at 13:07 -0800, you wrote:

> Ultimately, I think it's important to have consistent API of blocking
> and non-blocking endpoints. Any thoughts on how to move forwards would
> be appreciated.

Nice summary of the challenge! I agree that none of the options you
list sound really appealing. Here's an alternative idea: could we
change your option 1 (the variant) into always returning *both*, i.e.,
tuple? To make that work, we'd add an additional (say) GOT_MESSAGE
status. Each time receive() gets called, it returns whatever's
internally next on deck: either a message (with status set to
GOT_MESSAGE), or a "real" status (with the tuple's message set to
null). The caller would then check the status first:

    next = ep.receive();

    if ( next.status() == GOT_MESSAGE )
        process_message(next.msg());
    else {
        // Status/error handling here.
    }

This would actually align pretty nicely with how blocking APIs
normally operate: returning potential errors directly with the call.
And having the client code check for errors before using the result
feels natural to me. (And it's actually the same approach we discussed
for the Bro-side API if we had tuples there. :)

What do you think?
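
A minimal sketch of this "always return both" idea, under stated
assumptions: status_code, message, received, and this blocking_endpoint
are hypothetical stand-ins, std::optional plays the role of the "null"
message, and receive() pops from a queue rather than blocking. None of
this is Broker's actual API.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <string>
#include <utility>

// Hypothetical status codes; got_message marks a regular data message.
enum class status_code { got_message, peer_added, peer_lost };

struct message {
  std::string topic;
  std::string data;
};

// What receive() hands back: always a status, plus a message if and only if
// the status is got_message.
struct received {
  status_code status;
  std::optional<message> msg;
};

class blocking_endpoint {
public:
  void enqueue(message m) {
    queue_.push_back(received{status_code::got_message, std::move(m)});
  }

  void enqueue(status_code s) {
    queue_.push_back(received{s, std::nullopt});
  }

  bool empty() const { return queue_.empty(); }

  // Precondition: !empty(). A real blocking endpoint would wait instead.
  received receive() {
    assert(!queue_.empty());
    received next = std::move(queue_.front());
    queue_.pop_front();
    return next;
  }

private:
  std::deque<received> queue_;
};

// Caller pattern from the mail: check the status before touching the message.
std::string handle_next(blocking_endpoint& ep) {
  auto next = ep.receive();
  if (next.status == status_code::got_message)
    return "message on " + next.msg->topic;
  return "status update";
}
```

The message is only dereferenced after the status check, which is the
discipline the proposal asks of callers.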
Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Thu Jan 5 17:04:32 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Thu, 5 Jan 2017 17:04:32 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170104204850.GA89520@icir.org>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org>
Message-ID: <20170106010432.GD30604@shogun.local>

> Nice summary of the challenge! I agree that none of the options you
> list sound really appealing. Here's an alternative idea: could we
> change your option 1 (the variant) into always returning *both*, i.e.,
> tuple?

You pushed me into a new direction. Broker already returns expected<T>
for operations frequently (e.g., for blocking operations with data
stores) that yield either a T or a broker::error (which is just an alias
for caf::error). How about we get rid of statuses entirely? Here are
the current status enum values:

    enum status_info : uint8_t {
      unknown_status = 0,
      peer_added,         ///< Successfully peered
      peer_removed,       ///< Successfully unpeered
      peer_incompatible,  ///< Version incompatibility
      peer_invalid,       ///< Referenced peer does not exist
      peer_unavailable,   ///< Remote peer not listening
      peer_lost,          ///< Lost connection to peer
      peer_recovered,     ///< Re-gained connection to peer
    };

And the error values:

    enum class ec : uint8_t {
      /// The unspecified default error code.
      unspecified = 1,
      /// Version mismatch during peering.
      version_incompatible,
      /// Master with given name already exists.
      master_exists,
      /// Master with given name does not exist.
      no_such_master,
      /// The given data store key does not exist.
      no_such_key,
      /// The operation expected a different type than provided
      type_clash,
      /// The data value cannot be used to carry out the desired operation.
      invalid_data,
      /// The storage backend failed to execute the operation.
      backend_failure,
    };

Clearly, this could be merged together. This would yield a natural API
for receive:

    expected receive();

To be used as follows:

    auto msg = ep.receive();
    if (msg)
      return f(*msg); // unbox contained message
    switch (msg.error()) {
      default:
        cout << to_string(msg.error()) << endl;
        break;
      case status::peer_added:
        cout << "got new peer: " << msg.context() << endl;
        break;
      case status::peer_lost:
        break;
    }

This is pretty much what you suggested, Robin, just with a slight
syntactical twist. The only downside I see is that "msg.error()" could
be misleading, as we're sometimes not dealing with an error on the
Broker framework level, just with respect to the call to expected. That
is, the error is that we didn't get a message, which we expected, but
something that is a status change in the global topology, such as a new
peer.

Matthias

From robin at icir.org Fri Jan 6 05:16:38 2017
From: robin at icir.org (Robin Sommer)
Date: Fri, 6 Jan 2017 05:16:38 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170106010432.GD30604@shogun.local>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local>
Message-ID: <20170106131638.GK39085@icir.org>

On Thu, Jan 05, 2017 at 17:04 -0800, you wrote:

>     expected receive();

Yeah, I like that, except for one concern:

>     switch (msg.error()) {
>       default:
>         cout << to_string(msg.error()) << endl;
>         break;
>       case status::peer_added:
>         cout << "got new peer: " << msg.context() << endl;
>         break;
>       case status::peer_lost::
>         break;
>     }

I think the name "error" is not just misleading but would also turn
out tricky to use correctly.
In that switch statement, if the default case is to handle only (real)
errors, one would need to fully enumerate all the status:* messages so
that they don't arrive there too. More generally: the distinction
between errors that signal trouble with the connection and expected
state changes doesn't come through here.

What do you think about adding two methods instead of one to allow
differentiating between status updates and errors explicitly: status()
returns a joined result code like your error(); and failed() returns a
boolean indicating if that status reflects an error situation:

    auto msg = ep.receive();

    if (msg)
        return f(*msg); // unbox contained message

    if (msg.failed())
        cout << "Trouble: " << to_string(msg.status()) << endl;
    else
        cout << "Status change: " << to_string(msg.status()) << endl;

In either case one could then also use a switch to differentiate the
status() further.

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Fri Jan 6 10:32:50 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Fri, 6 Jan 2017 10:32:50 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170106131638.GK39085@icir.org>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org>
Message-ID: <20170106183250.GE30604@shogun.local>

> I think the name "error" is not just misleading but would also turn
> out tricky to use correctly.

Agreed.

>     auto msg = ep.receive();
>
>     if (msg)
>         return f(*msg); // unbox contained message
>
>     if (msg.failed())
>         cout << "Trouble: " << to_string(msg.status()) << endl;
>     else
>         cout << "Status change: " << to_string(msg.status()) << endl;
>
> In either case one could then also use a switch to differentiate the
> status() further.

I think this is semantically what we want.
Because broker::error is just a type alias for caf::error (and
broker::expected just caf::expected), it's currently not possible to
change the API of those classes. I see two solutions, one based on the
existing vehicles and one that introduces a new structure with three
states for T, error, and status.

Here's the first. A caf::error class has three interesting member
functions:

    class error {
      uint8_t code();
      atom_value category();
      const message& context();
    };

The member category() returns the type of error. In Broker, this is
always "broker". CAF errors have "caf" as category. We could simply
split the "broker" error category into "broker-status" and
"broker-error" to distinguish the error class. We would keep the two
different enums and provide free functions for a user-friendly API.
Example:

    // -- Usage -------------------------------------------

    auto msg = ep.receive(); // expected

    if (msg) {
      f(*msg); // unbox message
    } else if (is_error(msg)) {
      cout << "error: " << to_string(msg.error()) << endl;
      if (msg == error::type_clash)
        // dispatch concrete error
    } else if (is_status(msg)) {
      cout << "status: " << to_string(msg.error()) << endl;
      if (msg == status::peer_added)
        // dispatch concrete status
    } else {
      // CAF error
    }

    // -- Broker implementation ---------------------------

    using failure = caf::error;

    enum status : uint8_t { /* define status codes */ };

    enum error : uint8_t { /* define error codes */ };

    bool is_error(const failure& f) {
      return f.category() == atom("broker-error");
    }

    bool is_status(const failure& f) {
      return f.category() == atom("broker-status");
    }

    template <class T>
    bool failed(const expected<T>& x) {
      return !x && failed(x.error());
    }

    template <class T>
    bool operator==(const expected<T>& x, status s) {
      return !x && x.error() == s;
    }

The only downside here is that we're still calling msg.error() in the
case of a status. That's where the second option comes in. Let's call it
result for now (it won't matter much, because most people will use it
with "auto").
A result wraps an expected and provides a nicer API
to distinguish errors and status more cleanly. Example:

    // -- Usage -------------------------------------------

    auto msg = ep.receive(); // result

    if (msg) {
      f(*msg); // unbox T
    } else if (auto e = msg.error()) {
      cout << "error: " << to_string(*e) << endl;
      if (*e == error::type_clash)
        // dispatch concrete error
    } else if (auto s = msg.status()) {
      cout << "status: " << to_string(*s) << endl;
      if (*s == status::peer_added)
        // dispatch concrete status
    } else {
      assert(!"not possible");
    }

    // -- Broker implementation ---------------------------

    enum status : uint8_t { /* define status codes */ };

    enum error : uint8_t { /* define error codes */ };

    template <class T>
    class result {
    public:
      optional<status> status() const;
      optional<error> error() const;

      // Box semantics
      explicit operator bool() const;
      T& operator*();
      const T& operator*() const;
      T* operator->();
      const T* operator->() const;

    private:
      expected<T> x_;
    };

Thoughts?

    Matthias

From robin at icir.org Fri Jan 6 13:44:19 2017
From: robin at icir.org (Robin Sommer)
Date: Fri, 6 Jan 2017 13:44:19 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170106183250.GE30604@shogun.local>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local>
Message-ID: <20170106214419.GN39085@icir.org>

I'm going back and forth between the two versions. I think I'd take
the 2nd (the custom class), though maybe with the API I had used in my
example instead (i.e., msg.failed() and then a single status() method
for both cases), as having two methods error/status returning the same
thing looks a bit odd. But not a big deal either way, any of these
options sounds fine to me.

Robin

On Fri, Jan 06, 2017 at 10:32 -0800, you wrote:

> > I think the name "error" is not just misleading but would also turn
> > out tricky to use correctly.
>
> Agreed.
> > > auto msg = ep.receive(); > > > > if (msg) > > return f(*msg); // unbox contained message > > > > if (msg.failed()) > > cout << "Trouble: " << to_string(msg.status()) << endl; > > else > > cout << "Status change: " << to_string(msg.status()) << endl; > > > > In either case one could then also use a switch to differentiate the > > status() further. > > I think this is semantically what we want. Because broker::error is just > a type alias for caf::error (and broker::expected just caf::expected), > it's currently not possible to change the API of those classes. I see > two solutions, one based on the existing vehicles and one that > introduces a new structure with three states for T, error, and status. > > Here's the first. A caf::error class has three interesting member > functions: > > class error { > uint8_t code(); > atom_value category(); > const message& context(); > }; > > The member category() returns the type of error. In Broker, this is > always "broker". CAF errors have "caf" as category. We could simply > split the "broker" error category into "broker-status" and > "broker-error" to distinguish the error class. We would keep the two > different enums and provide free functions for a user-friendly API. 
> Example:
>
>     // -- Usage -------------------------------------------
>
>     auto msg = ep.receive(); // expected
>
>     if (msg) {
>       f(*msg); // unbox message
>     } else if (is_error(msg)) {
>       cout << "error: " << to_string(msg.error()) << endl;
>       if (msg == error::type_clash)
>         // dispatch concrete error
>     } else if (is_status(msg)) {
>       cout << "status: " << to_string(msg.error()) << endl;
>       if (msg == status::peer_added)
>         // dispatch concrete status
>     } else {
>       // CAF error
>     }
>
>     // -- Broker implementation ---------------------------
>
>     using failure = caf::error;
>
>     enum status : uint8_t { /* define status codes */ };
>
>     enum error : uint8_t { /* define error codes */ };
>
>     bool is_error(const failure& f) {
>       return f.category() == atom("broker-error");
>     }
>
>     bool is_status(const failure& f) {
>       return f.category() == atom("broker-status");
>     }
>
>     template <class T>
>     bool failed(const expected<T>& x) {
>       return !x && failed(x.error());
>     }
>
>     template <class T>
>     bool operator==(const expected<T>& x, status s) {
>       return !x && x.error() == s;
>     }
>
> The only downside here is that we're still calling msg.error() in the
> case of a status. That's where the second option comes in. Let's call it
> result for now (it won't matter much, because most people will use it
> with "auto"). A result wraps an expected and provides a nicer API
> to distinguish errors and status more cleanly.
> Example:
>
>     // -- Usage -------------------------------------------
>
>     auto msg = ep.receive(); // result
>
>     if (msg) {
>       f(*msg); // unbox T
>     } else if (auto e = msg.error()) {
>       cout << "error: " << to_string(*e) << endl;
>       if (*e == error::type_clash)
>         // dispatch concrete error
>     } else if (auto s = msg.status()) {
>       cout << "status: " << to_string(*s) << endl;
>       if (*s == status::peer_added)
>         // dispatch concrete status
>     } else {
>       assert(!"not possible");
>     }
>
>     // -- Broker implementation ---------------------------
>
>     enum status : uint8_t { /* define status codes */ };
>
>     enum error : uint8_t { /* define error codes */ };
>
>     template <class T>
>     class result {
>     public:
>       optional<status> status() const;
>       optional<error> error() const;
>
>       // Box semantics
>       explicit operator bool() const;
>       T& operator*();
>       const T& operator*() const;
>       T* operator->();
>       const T* operator->() const;
>
>     private:
>       expected<T> x_;
>     };
>
> Thoughts?
>
>     Matthias
>

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Mon Jan 9 11:34:23 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Mon, 9 Jan 2017 11:34:23 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170106214419.GN39085@icir.org>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org>
Message-ID: <20170109193423.GL30604@shogun.local>

> But not a big deal either way, any of these options sounds fine to me.

This is the synopsis for the slightly adapted message class, no other
changes:

    class message {
    public:
      /// Checks whether a message is a (topic, data) pair.
      /// @returns `true` iff the message contains a (topic, data) pair.
      explicit operator bool() const;

      /// @returns the contained topic.
      /// @pre `static_cast<bool>(*this)`
      const broker::topic& topic() const;

      /// @returns the contained data.
      /// @pre `static_cast<bool>(*this)`
      const broker::data& data() const;

      /// @returns the contained status.
      /// @pre `!*this`
      const broker::status& status() const;
    };

I opted against a .failed() member in broker::message, because it's up
to the concrete status instance to define failure or a mere status
change.

A message is now either a (topic, data) pair or a status instance. To
distinguish between the two status, I use operator bool. Since this is
almost the same as a failed() method, I have one last idea: instead of
returning const-references to topic, data, and status instances, we
could simply return non-owning const-pointers, which are non-null if
and only if the respective type is active:

    if (auto data = msg.data())
      f(*data)
    else
      g(*msg.status())

I've also updated how to do status/error handling. See the
documentation at
http://bro.github.io/broker/comm.html#status-and-error-handling for
details.

    Matthias

From vallentin at icir.org Mon Jan 9 15:27:27 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Mon, 9 Jan 2017 15:27:27 -0800
Subject: [Bro-Dev] [Bro-Commits] [git/bro] topic/robin/dynamic-cast: Add experimental "is" and "as" operators. (dabe125)
In-Reply-To: <201701031633.v03GXq7S017723@bro-ids.icir.org>
References: <201701031633.v03GXq7S017723@bro-ids.icir.org>
Message-ID: <20170109232727.GP30604@shogun.local>

> On branch : topic/robin/dynamic-cast
> Link : https://github.com/bro/bro/commit/dabe125fe8fab80ea1f678844b872b369764fd80

I've tried branching away from topic/robin/dynamic-cast for Broker
integration, but get a compile error in parse.y:

    [ 86%] [BISON][Parser] Building parser with bison 2.3
    parse.y:6.9-15: syntax error, unexpected identifier, expecting string

How do I fix it?
    Matthias

From robin at icir.org Tue Jan 10 05:51:48 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 10 Jan 2017 05:51:48 -0800
Subject: [Bro-Dev] dynamic-cast branch (Re: [Bro-Commits] [git/bro] topic/robin/dynamic-cast: Add experimental "is" and "as" operators. (dabe125))
In-Reply-To: <20170109232727.GP30604@shogun.local>
References: <201701031633.v03GXq7S017723@bro-ids.icir.org> <20170109232727.GP30604@shogun.local>
Message-ID: <20170110135148.GU60510@icir.org>

On Mon, Jan 09, 2017 at 15:27 -0800, you wrote:

> [ 86%] [BISON][Parser] Building parser with bison 2.3
> parse.y:6.9-15: syntax error, unexpected identifier, expecting string

It's this line:

> %define lr.type ielr

Looks like that's a feature of newer bison versions; I'm using 3.0.4.
I added that line because bison otherwise has trouble with the grammar
extensions I added.

I just tried 2.3 and it doesn't seem to know that option yet. Funny
thing is that if I just remove the line, it seems to work fine with
2.3. I haven't compiled it all the way through, but at least 2.3 isn't
complaining about the extensions (it just needs the expected
shift/reduce conflicts adapted).

So not sure what the right solution is, but for now: either upgrade
bison, or remove the line and keep an eye on whether things work
correctly.
Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From robin at icir.org Tue Jan 10 06:21:54 2017 From: robin at icir.org (Robin Sommer) Date: Tue, 10 Jan 2017 06:21:54 -0800 Subject: [Bro-Dev] Rethinking Broker's blocking API In-Reply-To: <20170109193423.GL30604@shogun.local> References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> Message-ID: <20170110142154.GY60510@icir.org> On Mon, Jan 09, 2017 at 11:34 -0800, you wrote: > To distinguish between the two status, I use operator bool. I don't see that in the "status and error handling" section. Would I do "if (! status) { }"? That doesn't seem quite intuitive. I think a method with a descriptive name would be better here. > if (auto data = msg.data()) > f(*data) > else > g(*msg.status()) For this I think I prefer the boolean version: "if (msg) { References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> Message-ID: <20170110153947.GS30604@shogun.local> > > To distinguish between the two status, I use operator bool. > > I don't see that in the "status and error handling" section. Would I > do "if (! status) { }"? That doesn't seem quite > intuitive. I think a method with a descriptive name would be better > here. Sorry, that was misleading. Statuses don't provide an operator bool. They could, however, to distinguish them from errors. > > if (auto data = msg.data()) > > f(*data) > > else > > g(*msg.status()) > > For this I think I prefer the boolean version: "if (msg) { msg.data() })". Okay, I'll switch that over. I prefer the other way, though :-). 
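The two access styles weighed in this exchange can be put side by side in a minimal sketch. Everything in the `sketch` namespace is a hypothetical stand-in rather than the real Broker API: topic and data are plain strings, `status` is reduced to an integer code, and the pointer accessors are named `data_if()` and `status_if()` only so that both styles can coexist in one class.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for broker::status; only carries a numeric code.
struct status {
  int code;
};

// Hypothetical stand-in for broker::message: either a (topic, data) pair
// or a status, never both.
class message {
public:
  message(std::string topic, std::string data)
    : has_data_(true),
      topic_(std::move(topic)),
      data_(std::move(data)),
      status_{0} {}

  explicit message(sketch::status s) : has_data_(false), status_(s) {}

  // Style 1: boolean check first, then reference accessors with
  // preconditions.
  explicit operator bool() const { return has_data_; }

  const std::string& topic() const {
    assert(has_data_);
    return topic_;
  }

  const std::string& data() const {
    assert(has_data_);
    return data_;
  }

  const sketch::status& status() const {
    assert(!has_data_);
    return status_;
  }

  // Style 2: non-owning pointers that are non-null only when the
  // respective alternative is active (names are assumptions).
  const std::string* data_if() const {
    return has_data_ ? &data_ : nullptr;
  }

  const sketch::status* status_if() const {
    return has_data_ ? nullptr : &status_;
  }

private:
  bool has_data_;
  std::string topic_;
  std::string data_;
  sketch::status status_;
};

// Boolean version: "if (msg) { msg.data() }".
std::string describe_bool(const message& msg) {
  if (msg)
    return "data: " + msg.data();
  return "status: " + std::to_string(msg.status().code);
}

// Pointer version: "if (auto data = msg.data()) f(*data)".
std::string describe_ptr(const message& msg) {
  if (auto d = msg.data_if())
    return "data: " + *d;
  return "status: " + std::to_string(msg.status_if()->code);
}

} // namespace sketch
```

Both functions produce the same result for the same message; the difference is only whether the caller tests the message first or tests the accessor's return value.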
Matthias From vallentin at icir.org Tue Jan 10 07:52:17 2017 From: vallentin at icir.org (Matthias Vallentin) Date: Tue, 10 Jan 2017 07:52:17 -0800 Subject: [Bro-Dev] dynamic-cast branch (Re: [Bro-Commits] [git/bro] topic/robin/dynamic-cast: Add experimental "is" and "as" operators. (dabe125)) In-Reply-To: <20170110135148.GU60510@icir.org> References: <201701031633.v03GXq7S017723@bro-ids.icir.org> <20170109232727.GP30604@shogun.local> <20170110135148.GU60510@icir.org> Message-ID: <20170110155217.GT30604@shogun.local> > So not sure what the right solution is but for now: either upgrade > bison, or remove the line and keep an eye on if things work correctly. Upgrading Bison worked just fine, thanks. Matthias From vallentin at icir.org Tue Jan 10 08:28:31 2017 From: vallentin at icir.org (Matthias Vallentin) Date: Tue, 10 Jan 2017 08:28:31 -0800 Subject: [Bro-Dev] Rethinking Broker's blocking API In-Reply-To: <20170110153947.GS30604@shogun.local> References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> <20170110153947.GS30604@shogun.local> Message-ID: <20170110162831.GU30604@shogun.local> > Sorry, that was misleading. Statuses don't provide an operator bool. > They could, however, to distinguish them from errors. Mulling over this more, I'm not sure if it's clear which status codes constitute an error. For example, there's a peer_lost and peer_recovered status code. Is only the first an error? Some users may consider peer churn normal. Here's the list of all status codes: enum class sc : uint8_t { /// The unspecified default error code. unspecified = 1, /// Successfully added a new peer. peer_added, /// Successfully removed a peer. peer_removed, /// Version incompatibility. peer_incompatible, /// Referenced peer does not exist. 
      peer_invalid,
      /// Remote peer not listening.
      peer_unavailable,
      /// A peering request timed out.
      peer_timeout,
      /// Lost connection to peer.
      peer_lost,
      /// Re-gained connection to peer.
      peer_recovered,
      /// Master with given name already exists.
      master_exists,
      /// Master with given name does not exist.
      no_such_master,
      /// The given data store key does not exist.
      no_such_key,
      /// The store operation timed out.
      request_timeout,
      /// The operation expected a different type than provided.
      type_clash,
      /// The data value cannot be used to carry out the desired operation.
      invalid_data,
      /// The storage backend failed to execute the operation.
      backend_failure,
    };

If we provided operator bool() for statuses, then it would be true for
peer_added, peer_removed, peer_recovered, and false for all others.
This selection seems arguable to me, which is why I'm inclined to let
users probe for specific instances themselves.

    Matthias

From robin at icir.org Tue Jan 10 08:46:27 2017
From: robin at icir.org (Robin Sommer)
Date: Tue, 10 Jan 2017 08:46:27 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170110162831.GU30604@shogun.local>
References: <20170102210721.GG2037@ninja.local> <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> <20170110153947.GS30604@shogun.local> <20170110162831.GU30604@shogun.local>
Message-ID: <20170110164627.GB71388@icir.org>

On Tue, Jan 10, 2017 at 08:28 -0800, you wrote:

> If we provided operator bool() for statuses, then it would be true for
> peer_added, peer_removed, peer_recovered, and false for all others.
> This selection seems arguable to me, which is why I'm inclined to let
> users probe for specific instances themselves.
I see the challenge but I think we need some way to differentiate
serious errors from expected updates, otherwise we're back at writing
switch statements that need to comprehensively list all cases. One can
always post-filter if, e.g., one does consider status X not an error
even though it's flagged as such.

Instead of a binary error yes/no, what about levels along these lines:
(1) Error: *we* did something seriously wrong; (2) Warning: something
seems off, including problems with peers; and (3) Info: just an update
on activity. It's not clear-cut of course, but it would still be good
to have some default classification for cases that one doesn't handle
directly (even if it's only for then logging as error/warning/info).
One could then also compare the level directly with the status object:
"if ( status == ERROR ) ..."

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Tue Jan 10 09:38:47 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Tue, 10 Jan 2017 09:38:47 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170110164627.GB71388@icir.org>
References: <20170104204850.GA89520@icir.org> <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> <20170110153947.GS30604@shogun.local> <20170110162831.GU30604@shogun.local> <20170110164627.GB71388@icir.org>
Message-ID: <20170110173847.GV30604@shogun.local>

> I see the challenge but I think we need some way to differentiate
> serious errors from expected updates, otherwise we're back at writing
> switch statements that need to comprehensively list all cases.

I agree that writing switch statements is not very productive. From a
user perspective, it's important to distinguish whether I can ignore a
status update, or whether I have to react to it.
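For reference, the three-level idea ("if ( status == ERROR ) ...") could look roughly like the following sketch. The `level` enum, the `classify()` function, and the particular assignment of codes to levels are assumptions made for illustration; the thread names the levels but does not fix a mapping, and only three of the status codes are reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical severity levels along the lines proposed above.
enum class level : std::uint8_t { info, warning, error };

// A small subset of the status codes listed in the thread.
enum class sc : std::uint8_t { peer_added, peer_lost, type_clash };

// Default classification; this mapping is an assumption:
//   type_clash -> error   (we did something wrong)
//   peer_lost  -> warning (trouble with a peer)
//   peer_added -> info    (plain activity update)
constexpr level classify(sc code) {
  return code == sc::type_clash ? level::error
       : code == sc::peer_lost  ? level::warning
                                : level::info;
}

// Hypothetical status object that can be compared against either a
// severity level or a concrete code.
struct status {
  sc code;
};

constexpr bool operator==(const status& s, level l) {
  return classify(s.code) == l;
}

constexpr bool operator==(const status& s, sc c) {
  return s.code == c;
}
```

With this shape, a caller can branch on the coarse level for cases it does not handle explicitly and still compare against a concrete code where it cares, which is the post-filtering described above.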
I would consider more fine-grained classifications arbitrary. In particular, I find 3 levels too complex. Users will wonder "wait, was this an info or a warning?" and then have to go back to the documentation. How about making it explicit by providing two types of codes, status and errors: enum class sc : uint8_t { unspecified = 1, peer_added, peer_removed, peer_recovered, }; enum class ec : uint8_t { unspecified = 1, peer_incompatible, peer_invalid, peer_unavailable, peer_lost, peer_timeout, master_exists, no_such_master, ... }; This is close to the original design, except that the peer_* codes are now split into errors and status, and that we still have a single status class to combine them both: auto s = msg.status(); if (s.error()) // Compare against error codes for details. else // Compare against status codes for details. Or simpler: avoiding the explicit distinction of error and status codes and let the function status::error() return true for what we deem actionable errors. Matthias From robin at icir.org Tue Jan 10 15:24:01 2017 From: robin at icir.org (Robin Sommer) Date: Tue, 10 Jan 2017 15:24:01 -0800 Subject: [Bro-Dev] Rethinking Broker's blocking API In-Reply-To: <20170110173847.GV30604@shogun.local> References: <20170106010432.GD30604@shogun.local> <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> <20170110153947.GS30604@shogun.local> <20170110162831.GU30604@shogun.local> <20170110164627.GB71388@icir.org> <20170110173847.GV30604@shogun.local> Message-ID: <20170110232401.GD71388@icir.org> On Tue, Jan 10, 2017 at 09:38 -0800, you wrote: > From a user perspective, it's important to distinguish whether I can > ignore a status update, or whether I have to react to it. Yep, exactly. > I would consider more fine-grained classifications arbitrary. In > particular, I find 3 levels too complex. 
Well, weren't you just saying that two levels are arbitrary? :) I
tried to make it a bit less arbitrary by providing more options. Your
new suggestion goes back to yes/no in terms of whether it's an error or
not, which I think is ok, but you seemed concerned about that.

>     auto s = msg.status();
>     if (s.error())
>       // Compare against error codes for details.
>     else
>       // Compare against status codes for details.

I like this (with the two separate types for error/status code, as
that makes the distinction explicit).

Could we then now also lift the error() method into the message class?
So "if (msg.error())" would be a shortcut for
"if (msg.status().error())"? (And then we'd be back where we started I
believe. :-)

Btw, there's one more complexity with all this: when one gets a
status, one cannot tell which operation it belongs to, as it depends
on how things got queued and processed internally. That can make it
hard to react to something more specifically. But that's something we
can't avoid with the asynchronous processing.

Robin

--
Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin

From vallentin at icir.org Tue Jan 10 18:49:21 2017
From: vallentin at icir.org (Matthias Vallentin)
Date: Tue, 10 Jan 2017 18:49:21 -0800
Subject: [Bro-Dev] Rethinking Broker's blocking API
In-Reply-To: <20170110232401.GD71388@icir.org>
References: <20170106131638.GK39085@icir.org> <20170106183250.GE30604@shogun.local> <20170106214419.GN39085@icir.org> <20170109193423.GL30604@shogun.local> <20170110142154.GY60510@icir.org> <20170110153947.GS30604@shogun.local> <20170110162831.GU30604@shogun.local> <20170110164627.GB71388@icir.org> <20170110173847.GV30604@shogun.local> <20170110232401.GD71388@icir.org>
Message-ID: <20170111024921.GW30604@shogun.local>

> Could we then now also lift the error() method into the message class?
> So "if (msg.error())" would be a shortcut for
> "if (msg.status().error())"? (And then we'd be back where we started I
> believe. :-)

Done.
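The design the thread converges on (separate status and error code enums, one status class wrapping either, and an error() shortcut on the message) might be sketched as follows. The types live in a hypothetical `sketch` namespace and are not the actual Broker implementation; the enumerators are copied from the proposal, with the elided tail of the error enum left out, and the message class is reduced to its status part.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Status codes: updates a caller may ignore.
enum class sc : std::uint8_t {
  unspecified = 1,
  peer_added,
  peer_removed,
  peer_recovered,
};

// Error codes: conditions a caller is expected to react to.
enum class ec : std::uint8_t {
  unspecified = 1,
  peer_incompatible,
  peer_invalid,
  peer_unavailable,
  peer_lost,
  peer_timeout,
  master_exists,
  no_such_master,
};

// A single class that holds either kind of code and remembers which.
class status {
public:
  status(sc code)
    : is_error_(false), code_(static_cast<std::uint8_t>(code)) {}

  status(ec code)
    : is_error_(true), code_(static_cast<std::uint8_t>(code)) {}

  // True if this instance carries an error code.
  bool error() const { return is_error_; }

  friend bool operator==(const status& s, sc code) {
    return !s.is_error_ && s.code_ == static_cast<std::uint8_t>(code);
  }

  friend bool operator==(const status& s, ec code) {
    return s.is_error_ && s.code_ == static_cast<std::uint8_t>(code);
  }

private:
  bool is_error_;
  std::uint8_t code_;
};

// Reduced message that only models the status alternative.
class message {
public:
  explicit message(sketch::status s) : status_(s) {}

  const sketch::status& status() const { return status_; }

  // Shortcut for status().error().
  bool error() const { return status_.error(); }

private:
  sketch::status status_;
};

} // namespace sketch
```

Because the two enums share numeric values, the comparison operators check the error flag as well as the code, so an error code never compares equal to a status code with the same underlying number.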
And yes, we've done the full circle. ;-) But at least we're in full agreement now. Sometimes converging takes a little longer. > Btw, there's one more complexity with all this: when one gets a > status, one cannot tell which operation it belongs to, as it depends > on how thing got queued and processed internally. That can make it > hard to react to something more specifically. But that's something we > can't avoid with the asynchronous processing. I think we can address this. First, the status code tells roughly what went wrong. Second, a status has a context() method that returns additional information when available. For example, all peering-related statuses include a broker::endpoint_info instance to figure out exactly what peer created the issue. If we need the fine granularity, all we need is to add the necessary context information. (This is enforced at compile time, by the way; for a given status code, the context is either a T or not present, but not of a different type.) Matthias From vern at icir.org Wed Jan 11 15:18:44 2017 From: vern at icir.org (Vern Paxson) Date: Wed, 11 Jan 2017 15:18:44 -0800 Subject: [Bro-Dev] [Bro] Segmentation fault while using own signature. In-Reply-To: (Tue, 03 Jan 2017 17:12:27 EST). Message-ID: <20170111231844.BCE972C403F@rock.ICSI.Berkeley.EDU> Did anyone follow up on this? Vern -------------- next part -------------- An embedded message was scrubbed... From: fatema bannatwala Subject: [Bro] Segmentation fault while using own signature. 
Date: Tue, 3 Jan 2017 17:12:27 -0500 Size: 7592 Url: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170111/4431ad54/attachment.mht From johanna at icir.org Thu Jan 12 05:21:43 2017 From: johanna at icir.org (Johanna Amann) Date: Thu, 12 Jan 2017 14:21:43 +0100 Subject: [Bro-Dev] plugins/hooks test fail in the new year Message-ID: <20170112132143.3mb4axjcq5keyu7m@Beezling.fritz.box> Hi, plugins/hooks currently fails because of the changed year number: 0.000000 | HookCallFunction strftime(%Y, XXXXXXXXXX.XXXXXX) 0.000000 | HookCallFunction string_to_pattern((^\.?|\.)()$, F) 0.000000 | HookCallFunction sub((^\.?|\.)(~~)$, <...>/, ) -0.000000 | HookCallFunction to_count(2016) +0.000000 | HookCallFunction to_count(2017) After a slight amount of digging, the culprit is the following part of init-bare.bro: # A bit of functionality for 2.5 global brocon:event (x:count) ;event bro_init (){event brocon ( to_count (strftime ("%Y" ,current_time())));} While I know this is cute, currently this will make the test fail on every year change. Do we perhaps want to either remove this, or at least move this somewhere outside of base/? Johanna From seth at icir.org Fri Jan 13 12:30:09 2017 From: seth at icir.org (Seth Hall) Date: Fri, 13 Jan 2017 15:30:09 -0500 Subject: [Bro-Dev] plugins/hooks test fail in the new year In-Reply-To: <20170112132143.3mb4axjcq5keyu7m@Beezling.fritz.box> References: <20170112132143.3mb4axjcq5keyu7m@Beezling.fritz.box> Message-ID: > On Jan 12, 2017, at 8:21 AM, Johanna Amann wrote: > > While I know this is cute, currently this will make the test fail on every year > change. Do we perhaps want to either remove this, or at least move this somewhere > outside of base/? We can remove it. The whole thing with the mugs kind of fell flat anyway. 
:(

  .Seth

--
Seth Hall
International Computer Science Institute
(Bro) because everyone has a network
http://www.bro.org/

From jan.grashoefer at gmail.com Sun Jan 15 14:49:56 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Sun, 15 Jan 2017 23:49:56 +0100
Subject: [Bro-Dev] Testing and Docs for Packages
Message-ID:

Hi all,

building some small packages and playing around with a package for the
af_packet plugin (https://github.com/J-Gras/bro-af_packet-plugin), I
came across a question: How to deal with testing?

For the intel-extensions package
(https://github.com/J-Gras/intel-extensions) I adapted some scripts I
found. While executing the tests is not an issue (maybe only for me),
they are hidden somewhere in the bro-pkg directories, as of course only
the scripts inside the script_dir are moved.

In general I think making test cases available for users of a package
could be quite helpful. Further, I think we have also already mentioned
the possibility of compatibility checking regarding the installed Bro
version by executing tests. Thus I would propose introducing test_dir to
bro-pkg.meta for tests and a test command to execute the tests for
bro-pkg as a first step.

Another thing that would be great to have for packages would be
documentation. I've experimented with the broxygen feature but the doc
about generating docs is not that exhaustive ;) I think providing an
easy-to-use mechanism to generate documentation for third-party scripts
would be great. Ideally the generated documentation would link to Bro's
online doc for built-in types and scripts.

What do you think about testing and docs for packages?
Jan

From jsiwek at illinois.edu Mon Jan 16 10:45:52 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Mon, 16 Jan 2017 18:45:52 +0000
Subject: [Bro-Dev] Testing and Docs for Packages
In-Reply-To:
References:
Message-ID: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu>

> On Jan 15, 2017, at 4:49 PM, Jan Grashöfer wrote:
>
> In general I think, making test cases available for users of a package
> could be quite helpful. Further, I think we have also already mentioned
> the possibility of compatibility checking regarding the installed Bro
> version by executing tests. Thus I would propose introducing test_dir to
> bro-pkg.meta for tests and a test command to execute the tests for
> bro-pkg as a first step.

Yes, seems useful. I'd do it like:

1) Add `bro-pkg test ` command.
2) Add "test_command" field to bro-pkg.meta

The "test_command" is more general than "test_dir" - the command could just `cd test_dir` if needed and there's no other reason bro-pkg needs to know the dir where tests are stored, is there?

Other questions:

1) Is it fine for `bro-pkg test ` to operate on the installed version of the package or are there expectations of testing a package in an isolated sandbox without installing it? I think the former is more useful since it may catch odd inter-package conflicts that wouldn't show up when testing in isolation.

2) Should we put btest on PyPi, add it as a dependency to bro-pkg, and make it the canonical testing framework for packages? This gives devs a straightforward way to proceed w/ writing tests and guarantees that bro-pkg users always have the ability to run them. (There is a "btest" package on PyPi, but not the one we know, so not sure how to resolve that...)

> Another thing that would be great to have for packages would be
> documentation.
> I've experimented with the broxygen feature but the doc
> about generating docs is not that exhaustive ;) I think providing an
> easy-to-use mechanism to generate documentation for third-party scripts
> would be great. Ideally the generated documentation would link to Bro's
> online doc for built-in types and scripts.

If the problem is that there's a lack of examples/templates for generating script API docs via broxygen or that it simply doesn't work at the moment, then yes, that's something to fix.

But regarding the direction of autogenerated package docs in general, maybe it makes sense to work on that in conjunction with a web-frontend for package sources (e.g. a package repository browser). A package would be able to generate its docs independent of that, but if the web-frontend is going to become a go-to place for looking at package info/stats/docs and it takes care of creating all that without package authors having to do anything, then that seems like the superior/common use-case.

- Jon

From jan.grashoefer at gmail.com Mon Jan 16 12:42:53 2017
From: jan.grashoefer at gmail.com (Jan Grashöfer)
Date: Mon, 16 Jan 2017 21:42:53 +0100
Subject: [Bro-Dev] Testing and Docs for Packages
In-Reply-To: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu>
References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu>
Message-ID:

> Yes, seems useful. I'd do it like:
>
> 1) Add `bro-pkg test ` command.
> 2) Add "test_command" field to bro-pkg.meta
>
> The "test_command" is more general than "test_dir" - the command could just `cd test_dir` if needed and there's no other reason bro-pkg needs to know the dir where tests are stored, is there?

Although I am not sure whether this degree of flexibility is really
needed, I would assume it doesn't hurt either. In any case, the user
should be informed that "something" will be executed and the user
should trust the package author/source.
> 1) Is it fine for `bro-pkg test ` to operate on the installed version of the package or are there expectations of testing a package in an isolated sandbox without installing it? I think the former is more useful since it may catch odd inter-package conflicts that wouldn't show up when testing in isolation.

I think testing on the installed version is fine. Installation might be
in particular necessary for packages containing plugins.

> 2) Should we put btest on PyPi, add it as a dependency to bro-pkg, and make it the canonical testing framework for packages? This gives devs a straightforward way to proceed w/ writing tests and guarantees that bro-pkg users always have the ability to run them.

Ha, I forgot that bro-pkg is published using PyPi. Adding btest as a
dependency sounds great to me.

> If the problem is that there's a lack of examples/templates for generating script API docs via broxygen or that it simply doesn't work at the moment, then yes, that's something to fix.

Looking at https://www.bro.org/development/howtos/autodoc.html, I
wasn't able to generate anything for my custom script. Looking into the
Bro code I could deduce the meaning of the broxygen.conf values and was
able to generate reST. I didn't try to generate HTML. A 3-step guide
how to generate doc for a custom script using the HTML template would
be great.

> But regarding the direction of autogenerated package docs in general, maybe it makes sense to work on that in conjunction with a web-frontend for package sources (e.g. a package repository browser).

Cool! I wasn't aware that a web-frontend is on the list. In that case,
any autogeneration of docs is indeed something to consider in this
context.
Best regards,
Jan

From johanna at icir.org Mon Jan 16 12:47:29 2017
From: johanna at icir.org (Johanna Amann)
Date: Mon, 16 Jan 2017 12:47:29 -0800
Subject: [Bro-Dev] Testing and Docs for Packages
In-Reply-To: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu>
References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu>
Message-ID: <20170116204729.5aenjl3ts2idhmi6@Beezling.local>

Just to add my two cents to this, because automated testing is actually
one of the things that I really think package managers should do...

On Mon, Jan 16, 2017 at 06:45:52PM +0000, Siwek, Jon wrote:

> 1) Add `bro-pkg test ` command.

Might it also make sense to just run the test on installation, before the
package is actually installed, to see if it works on the environment of
the user? This might make it much easier for users (& developers) to
identify early when something is wrong. And bro-pkg could just abort
(or ask a user if it should continue) if a test fails.

> 2) Add "test_command" field to bro-pkg.meta
>
> The "test_command" is more general than "test_dir" - the command could just `cd test_dir` if needed and there's no other reason bro-pkg needs to know the dir where tests are stored, is there?
>
> Other questions:
>
> 1) Is it fine for `bro-pkg test ` to operate on the installed version of the package or are there expectations of testing a package in an isolated sandbox without installing it? I think the former is more useful since it may catch odd inter-package conflicts that wouldn't show up when testing in isolation.

I actually think it would be neat to do this isolated, especially given
that this enables testing before installing. It also makes it easier to
create something like "smokers" (Bro installations that just try to run
all testsuites of all available packages with a newer version to see if
something went wrong).

Running on the installed version might be a neat bonus, but I actually
see the other as more interesting.
Johanna From jsiwek at illinois.edu Mon Jan 16 20:01:19 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Tue, 17 Jan 2017 04:01:19 +0000 Subject: [Bro-Dev] Testing and Docs for Packages In-Reply-To: <20170116204729.5aenjl3ts2idhmi6@Beezling.local> References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu> <20170116204729.5aenjl3ts2idhmi6@Beezling.local> Message-ID: <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> >> 1) Add `bro-pkg test ` command. > > Might it also make sense to just run the test on installation, before the > package is actually installed, to see if it works on the environment of > the user? Yes, I like that idea. (I?d also want a flag or config option to opt-out of that behavior). > I actually think it would be neat to do this isolated, especially given > that this enables testing before installing. Not sure I follow. Can you explain further? >From a typical user perspective, I think they would care more that the package?s tests pass in the final, installed state and it plays nice with any other site-specific stuff they have going on. Aborting an installation on test failure is also still possible ? instead of bro-pkg cleaning up an isolated sandbox, it does the standard ?remove? operation to delete installed files. > It also makes it easier to > create something like "smokers" (Bro installations that just tro tu run > all testsuites of all available packages with a newer version to see if > something went wrong). Can you also go into more detail on what you?re thinking there? If there's concerns about accidentally corrupting an existing/production bro installation, the alternative I?d suggest would be to set up a separate bro-pkg config file for the smoke tests that would have bro-pkg install stuff in an isolated location. This allows users to explicitly define the testing sandbox for themselves. 
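One way to set up the separate configuration suggested above is to generate a dedicated config file whose paths all point into a scratch directory. The sketch below does that with Python's configparser. The section and key names (`paths`, `state_dir`, `script_dir`, `plugin_dir`) are assumptions about bro-pkg's config format and should be checked against its documentation; `write_smoke_config` is a hypothetical helper, not something bro-pkg provides.

```python
import configparser
from pathlib import Path


def write_smoke_config(sandbox_root):
    """Write a bro-pkg style config that keeps everything under sandbox_root.

    The key names are assumptions; verify them against the bro-pkg docs.
    Returns the path of the written config file.
    """
    root = Path(sandbox_root)
    config = configparser.ConfigParser()
    config["paths"] = {
        "state_dir": str(root / "state"),
        "script_dir": str(root / "scripts"),
        "plugin_dir": str(root / "plugins"),
    }
    root.mkdir(parents=True, exist_ok=True)
    config_path = root / "smoke-test.config"
    with open(config_path, "w") as handle:
        config.write(handle)
    return config_path
```

bro-pkg would then be pointed at the returned file (through its config file option) for the smoke-test run, which should leave the production installation untouched.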
- Jon From robin at icir.org Wed Jan 18 08:28:08 2017 From: robin at icir.org (Robin Sommer) Date: Wed, 18 Jan 2017 08:28:08 -0800 Subject: [Bro-Dev] Testing and Docs for Packages In-Reply-To: <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu> <20170116204729.5aenjl3ts2idhmi6@Beezling.local> <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> Message-ID: <20170118162808.GD80113@icir.org> I also think it would be quite useful to test packages before installing them, that gives a chance to catch problems before changing anything (including things like: missing/broken/wrong dependencies; lack of something OS-specific the package needs (say, it's a Linux-only plugin); generally things that are special about the local Bro installation) Robin On Tue, Jan 17, 2017 at 04:01 +0000, you wrote: > > >> 1) Add `bro-pkg test ` command. > > > > Might it also make sense to just run the test on installation, before the > > package is actually installed, to see if it works on the environment of > > the user? > > Yes, I like that idea. (I?d also want a flag or config option to opt-out of that behavior). > > > I actually think it would be neat to do this isolated, especially given > > that this enables testing before installing. > > Not sure I follow. Can you explain further? > > From a typical user perspective, I think they would care more that the package?s tests pass in the final, installed state and it plays nice with any other site-specific stuff they have going on. Aborting an installation on test failure is also still possible ? instead of bro-pkg cleaning up an isolated sandbox, it does the standard ?remove? operation to delete installed files. > > > It also makes it easier to > > create something like "smokers" (Bro installations that just tro tu run > > all testsuites of all available packages with a newer version to see if > > something went wrong). 
> > Can you also go into more detail on what you?re thinking there? > > If there's concerns about accidentally corrupting an existing/production bro installation, the alternative I?d suggest would be to set up a separate bro-pkg config file for the smoke tests that would have bro-pkg install stuff in an isolated location. This allows users to explicitly define the testing sandbox for themselves. > > - Jon > > _______________________________________________ > bro-dev mailing list > bro-dev at bro.org > http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev > -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From asharma at lbl.gov Wed Jan 18 09:29:14 2017 From: asharma at lbl.gov (Aashish Sharma) Date: Wed, 18 Jan 2017 09:29:14 -0800 Subject: [Bro-Dev] help Reading the backtrace Message-ID: <20170118172912.GB92684@mac-822.local> So I am running a new detection package and everything seemed right but somehow since yesterday each worker is running at 5.7% to 6.3% CPU and not generating logs. The backtrace shows the following and how much (%) CPU is spending on what functions. 
Can someone help me read why Bro might spend 17.5% of its time in

    bro-2.5/src/Dict.cc: void* Dictionary::NextEntry(HashKey*& h, IterCookie*& cookie, int return_hash) const

Here are the functions and the time spent in each of them:

bro`_ZN8iosource4pcap10PcapSource17ExtractNextPacketEP6Packet    1    0.1%
bro`_ZNK13PriorityQueue3TopEv    1    0.1%
bro`_ZNK7BroFunc4CallEP8ValPListP5Frame    1    0.1%
bro`_Z15net_update_timed    1    0.1%
bro`_ZN16RemoteSerializer6GetFdsEPN8iosource6FD_SetES2_S2_    1    0.1%
bro`_ZN8EventMgr5DrainEv    1    0.1%
bro`_ZNK15EventHandlerPtrcvbEv    1    0.1%
bro`_ZN8iosource6FD_Set6InsertEi    1    0.1%
bro`_ZNK11ChunkedIOFd12ExtraReadFDsEv    1    0.1%
bro`_ZN13PriorityQueue10BubbleDownEi    1    0.1%
bro`0x699d60    2    0.1%
bro`_ZNK8iosource8IOSource6IsOpenEv    2    0.1%
bro`_ZN8iosource6FD_Set6InsertERKS0_    2    0.1%
bro`_ZNK8iosource6FD_Set5ReadyEP6fd_set    3    0.2%
bro`_ZNK14DictEntryPListixEi    3    0.2%
bro`_ZN8iosource6PktSrc25ExtractNextPacketInternalEv    4    0.3%
bro`_ZNSt3__16__treeIiNS_4lessIiEENS_9allocatorIiEEE15__insert_uniqueERKi    4    0.3%
bro`_ZNK8iosource6FD_Set3SetEP6fd_set    5    0.3%
bro`0x69a610    5    0.3%
bro`_ZNSt3__16__treeIiNS_4lessIiEENS_9allocatorIiEEE7destroyEPNS_11__tree_nodeIiPvEE    5    0.3%
bro`0x699c00    6    0.4%
bro`_ZNSt3__16__treeIiNS_4lessIiEENS_9allocatorIiEEE16__construct_nodeIJRKiEEENS_10unique_ptrINS_11__tree_nodeIiPvEENS_22__tree_node_destructorINS3_ISC_EEEEEEDpOT_    6    0.4%
bro`0x69ad50    7    0.5%
bro`_ZN7HashKeyD2Ev    7    0.5%
bro`_ZN8iosource7Manager11FindSoonestEPd    7    0.5%
bro`_ZN7HashKeyC2EPKvim    11    0.7%
bro`_ZNK18TableEntryValPDict9NextEntryERP7HashKeyRP10IterCookie    12    0.8%
bro`_ZN8TableVal8DoExpireEd    16    1.1%
bro`_ZNK7HashKey7CopyKeyEPKvi    16    1.1%
bro`_ZNK13TableEntryVal16ExpireAccessTimeEv    164    11.1%
bro`_ZNK8BaseList6lengthEv    170    11.5%
bro`_ZNK8BaseListixEi    173    11.7%
bro`_ZNK10Dictionary9NextEntryERP7HashKeyRP10IterCookiei    259    17.5%

Aashish

From jazoff at illinois.edu Wed Jan 18 09:34:42 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Wed, 18 Jan 2017 17:34:42 +0000
Subject: [Bro-Dev] help Reading the
backtrace In-Reply-To: <20170118172912.GB92684@mac-822.local> References: <20170118172912.GB92684@mac-822.local> Message-ID: > On Jan 18, 2017, at 12:29 PM, Aashish Sharma wrote: > > So I am running a new detection package It was stable before you added the new scripts? Are the new scripts publicly available? -- - Justin Azoff From jan.grashoefer at gmail.com Wed Jan 18 09:37:08 2017 From: jan.grashoefer at gmail.com (=?UTF-8?Q?Jan_Grash=c3=b6fer?=) Date: Wed, 18 Jan 2017 18:37:08 +0100 Subject: [Bro-Dev] help Reading the backtrace In-Reply-To: <20170118172912.GB92684@mac-822.local> References: <20170118172912.GB92684@mac-822.local> Message-ID: <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> Hi Aashish, > So I am running a new detection package and everything seemed right but somehow since yesterday each worker is running at 5.7% to 6.3% CPU and not generating logs. my guess would be that the script makes (heavy) use of tables and table expiration, right? Can you share the script? Jan From asharma at lbl.gov Wed Jan 18 10:40:38 2017 From: asharma at lbl.gov (Aashish Sharma) Date: Wed, 18 Jan 2017 10:40:38 -0800 Subject: [Bro-Dev] help Reading the backtrace In-Reply-To: <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> Message-ID: <20170118184037.GC92684@mac-822.local> Yes, I have been making heavy use of tables ( think a million entries a day and million expires a day) Let me figure out a way to upload the scripts on github or send them yours and justin's way otherwise. Strangely this code kept running fine for last month and reasonably stable. I am not sure what little thing I added/changed that has caused bro to run but all workers in uwait state with 6% CPU. 
(I'll be doing svn diffs to figure out)

Seems like bro is stuck in:

0x00000004039ccadc in _umtx_op_err () from /lib/libthr.so.3
(gdb) bt
#0 0x00000004039ccadc in _umtx_op_err () from /lib/libthr.so.3
#1 0x00000004039c750b in _thr_umtx_timedwait_uint () from /lib/libthr.so.3
#2 0x00000004039cea06 in ?? () from /lib/libthr.so.3
#3 0x00000000009042ee in threading::Queue::Get (this=0x404543038) at /home/bro/install/bro-2.5/src/threading/Queue.h:173
#4 0x0000000000902a31 in threading::MsgThread::RetrieveIn (this=0x404543000) at /home/bro/install/bro-2.5/src/threading/MsgThread.cc:349
#5 0x0000000000902ce4 in threading::MsgThread::Run (this=0x404543000) at /home/bro/install/bro-2.5/src/threading/MsgThread.cc:366
#6 0x00000000008fb952 in threading::BasicThread::launcher (arg=0x404543000) at /home/bro/install/bro-2.5/src/threading/BasicThread.cc:201
#7 0x00000004039c6260 in ?? () from /lib/libthr.so.3
#8 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffffbfe000
(gdb)

Aashish

On Wed, Jan 18, 2017 at 06:37:08PM +0100, Jan Grashöfer wrote:
> Hi Aashish,
>
> > So I am running a new detection package and everything seemed right but somehow since yesterday each worker is running at 5.7% to 6.3% CPU and not generating logs.
>
> my guess would be that the script makes (heavy) use of tables and table
> expiration, right? Can you share the script?
>
> Jan
> _______________________________________________
> bro-dev mailing list
> bro-dev at bro.org
> http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev

From jazoff at illinois.edu Wed Jan 18 11:27:17 2017
From: jazoff at illinois.edu (Azoff, Justin S)
Date: Wed, 18 Jan 2017 19:27:17 +0000
Subject: [Bro-Dev] help Reading the backtrace
In-Reply-To: <20170118184037.GC92684@mac-822.local>
References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local>
Message-ID: <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu>

Yeah..
lots of expires may have something to do with it, your traceback shows TableEntryVal16ExpireAccessTimeEv But I also wonder what you are doing that is triggering Dictionary9NextEntryERP7HashKeyRP10IterCookiei which would be void* Dictionary::NextEntry(HashKey*& h, IterCookie*& cookie, int return_hash) const Tables should be fine, but I wonder what you're doing that is triggering so much iteration. -- - Justin Azoff > On Jan 18, 2017, at 1:40 PM, Aashish Sharma wrote: > > Yes, I have been making heavy use of tables ( think a million entries a day and million expires a day) > > Let me figure out a way to upload the scripts on github or send them yours and justin's way otherwise. > > Strangely this code kept running fine for last month and reasonably stable. I am not sure what little thing I added/changed that has caused bro to run but all workers in uwait state with 6% CPU. (I'll be doing svn diffs to figure out) > > Seems like bro is stuck in: > > 0x00000004039ccadc in _umtx_op_err () from /lib/libthr.so.3 > (gdb) bt > #0 0x00000004039ccadc in _umtx_op_err () from /lib/libthr.so.3 > #1 0x00000004039c750b in _thr_umtx_timedwait_uint () from /lib/libthr.so.3 > #2 0x00000004039cea06 in ?? () from /lib/libthr.so.3 > #3 0x00000000009042ee in threading::Queue::Get (this=0x404543038) at /home/bro/install/bro-2.5/src/threading/Queue.h:173 > #4 0x0000000000902a31 in threading::MsgThread::RetrieveIn (this=0x404543000) at /home/bro/install/bro-2.5/src/threading/MsgThread.cc:349 > #5 0x0000000000902ce4 in threading::MsgThread::Run (this=0x404543000) at /home/bro/install/bro-2.5/src/threading/MsgThread.cc:366 > #6 0x00000000008fb952 in threading::BasicThread::launcher (arg=0x404543000) at /home/bro/install/bro-2.5/src/threading/BasicThread.cc:201 > #7 0x00000004039c6260 in ?? () from /lib/libthr.so.3 > #8 0x0000000000000000 in ?? 
() > Backtrace stopped: Cannot access memory at address 0x7fffffbfe000 > (gdb) > > > > Aashish > > On Wed, Jan 18, 2017 at 06:37:08PM +0100, Jan Grash?fer wrote: >> Hi Aashish, >> >>> So I am running a new detection package and everything seemed right but somehow since yesterday each worker is running at 5.7% to 6.3% CPU and not generating logs. >> >> my guess would be that the script makes (heavy) use of tables and table >> expiration, right? Can you share the script? >> >> Jan >> _______________________________________________ >> bro-dev mailing list >> bro-dev at bro.org >> http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev > _______________________________________________ > bro-dev mailing list > bro-dev at bro.org > http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev From jan.grashoefer at gmail.com Thu Jan 19 01:55:45 2017 From: jan.grashoefer at gmail.com (=?UTF-8?Q?Jan_Grash=c3=b6fer?=) Date: Thu, 19 Jan 2017 10:55:45 +0100 Subject: [Bro-Dev] help Reading the backtrace In-Reply-To: <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local> <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> Message-ID: <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com> > But I also wonder what you are doing that is triggering Dictionary9NextEntryERP7HashKeyRP10IterCookiei > > which would be > > void* Dictionary::NextEntry(HashKey*& h, IterCookie*& cookie, int return_hash) const > > Tables should be fine, but I wonder what you're doing that is triggering so much iteration. If I am not mistaken, tables are basically dictionaries (with the exception of subnet-indexed tables). The iterations should be related to how expiration works: As soon as the timer fires, the table is looped looking for expired entries. 
To prevent blocking in case too many expired entries have to be handled, only <table_incremental_step>-number of entries are processed at a time, followed by a delay of <table_expire_delay>. Adjusting these parameters might help (see https://www.bro.org/sphinx/scripts/base/init-bare.bro.html?highlight=expire#id-table_expire_delay).

Jan

From agarciaillera at gmail.com Thu Jan 19 09:19:20 2017
From: agarciaillera at gmail.com (Alberto Garcia)
Date: Thu, 19 Jan 2017 11:19:20 -0600
Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro
Message-ID:

Hi,

I've compiled bro from source to do some debugging. Once compiled I can't run bro since there is an error popping up:

default at debian:~/bro$ ./build/src/bro
fatal error: can't find base/init-bare.bro

If I do the make install and then call bro from /usr/local/bro/bin/bro it works fine.

What should I do to execute bro from the build directory?

Thanks

--
Alberto
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170119/a1f07fc8/attachment.html

From jsiwek at illinois.edu Thu Jan 19 09:38:36 2017
From: jsiwek at illinois.edu (Siwek, Jon)
Date: Thu, 19 Jan 2017 17:38:36 +0000
Subject: [Bro-Dev] Testing and Docs for Packages
In-Reply-To: <20170118162808.GD80113@icir.org>
References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu> <20170116204729.5aenjl3ts2idhmi6@Beezling.local> <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> <20170118162808.GD80113@icir.org>
Message-ID:

> On Jan 18, 2017, at 10:28 AM, Robin Sommer wrote:
>
> I also think it would be quite useful to test packages before
> installing them

Maybe I'm not so much questioning whether to run tests before or after installation, but rather whether the testing sandbox should include everything from the current installation in addition to the new/updated packages. i.e. the pre-installation testing sandbox could be initialized with a copy of current installation dirs.
I would think the user's expectation when running tests is to answer the question "will this bro-pkg operation break my setup?" You can't answer that without testing within the context of their current installation environment. The reason for that is that packages have much potential to change core bro functionality and interfere with each other's operation. But for that same reason, it may also make it much harder for people to write unit tests for their package that are precise enough to not cause harmless failures in the presence of other packages - e.g. you couldn't just check a baseline http.log as some other installed package could have altered it by adding a field, etc.

Summary of approaches/tradeoffs:

1) separate testing environment for each package
    - worse at answering "will this bro-pkg operation break my setup?"
    - easier to write stable tests

2) single testing environment for all packages
    - better at answering "will this bro-pkg operation break my setup?" *if* package tests are written well
    - harder to write stable tests

Neither seems great. I guess I plan to do (1) since it is easier on package authors and less likely to waste users' time looking into harmless test failures (ones due to tests that are written too broadly). Let me know if there's other ideas/suggestions.

- Jon

From asharma at lbl.gov Thu Jan 19 09:44:10 2017
From: asharma at lbl.gov (Aashish Sharma)
Date: Thu, 19 Jan 2017 09:44:10 -0800
Subject: [Bro-Dev] help Reading the backtrace
In-Reply-To: <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com>
References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local> <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com>
Message-ID: <20170119174407.GA28308@mac-822.local>

> https://www.bro.org/sphinx/scripts/base/init-bare.bro.html?highlight=expire#id-table_expire_delay).

This is very helpful.
I have been testing out table_expire_delay but somehow wasn't looking at table_incremental_step.

I'll test out a million expires while tweaking table_incremental_step to see what the limits are here.

Still, to clarify: is it possible that, because table_incremental_step is currently 5000, continuously expiring >> 5000 entries at every moment might cause the Queue to deadlock, resulting in Bro no longer processing packets?

Aashish

On Thu, Jan 19, 2017 at 10:55:45AM +0100, Jan Grashöfer wrote:
> > But I also wonder what you are doing that is triggering Dictionary9NextEntryERP7HashKeyRP10IterCookiei
> >
> > which would be
> >
> > void* Dictionary::NextEntry(HashKey*& h, IterCookie*& cookie, int return_hash) const
> >
> > Tables should be fine, but I wonder what you're doing that is triggering so much iteration.
>
> If I am not mistaken, tables are basically dictionaries (with the
> exception of subnet-indexed tables). The iterations should be related to
> how expiration works: As soon as the timer fires, the table is looped
> looking for expired entries. To prevent blocking in case of too many
> expired entries that have to be handled, only
> <table_incremental_step>-number of entries are processed, following a
> delay of <table_expire_delay>. Adjusting this parameters might help (see
> https://www.bro.org/sphinx/scripts/base/init-bare.bro.html?highlight=expire#id-table_expire_delay).
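The chunked expiration described in the quoted paragraph can be pictured with a small toy model. This is a Python sketch of that description only, not Bro's C++ code: `expire_pass` and `expire_all` are made-up names, `step` plays the role of table_incremental_step, and a comment marks where the table_expire_delay pause would go.

```python
from itertools import islice


def expire_pass(table, keys, now, timeout, step):
    """Check at most `step` entries and drop the expired ones.

    `table` maps key -> last access time; `keys` is an iterator over a
    snapshot of the table's keys. Returns True once the snapshot is
    exhausted, i.e. the whole table has been visited.
    """
    batch = list(islice(keys, step))
    for key in batch:
        if key in table and now - table[key] >= timeout:
            del table[key]
    return len(batch) < step


def expire_all(table, now, timeout, step):
    """Expire in chunks of `step`; return how many passes were needed."""
    keys = iter(list(table))  # snapshot, so deleting while looping is safe
    passes = 0
    while True:
        passes += 1
        if expire_pass(table, keys, now, timeout, step):
            return passes
        # A real system would pause here (the table_expire_delay role)
        # and process packets before starting the next chunk.
```

In this model a table with a million entries and a step of 5000 needs about 200 passes for one full sweep, each separated by the delay, so the two parameters together decide how long a sweep takes and how much work is done between packets.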
> > Jan From robin at icir.org Thu Jan 19 09:55:40 2017 From: robin at icir.org (Robin Sommer) Date: Thu, 19 Jan 2017 09:55:40 -0800 Subject: [Bro-Dev] help Reading the backtrace In-Reply-To: <20170119174407.GA28308@mac-822.local> References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local> <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com> <20170119174407.GA28308@mac-822.local> Message-ID: <20170119175540.GE45500@icir.org> On Thu, Jan 19, 2017 at 09:44 -0800, you wrote: > Still, to clearify, there might be a possibility that because at > present table_incremental_step=5000, somehow expiring >> 5000 entries > continiously every moment might cause cause Queue to deadlock > resulting in BRO to stop packets processing ? It shouldn't deadlock. What I can see happening, depending on load and these parameters, is Bro spending most of its time going through the table to expire entries and only getting to few packets in between (so not complete stop of processing, but not getting much done either) Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From johanna at icir.org Thu Jan 19 12:46:28 2017 From: johanna at icir.org (Johanna Amann) Date: Thu, 19 Jan 2017 12:46:28 -0800 Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro In-Reply-To: References: Message-ID: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> On Thu, Jan 19, 2017 at 11:19:20AM -0600, Alberto Garcia wrote: > Hi, > > I've compiled bro from source to do some debugging. Once compiled I can't > run bro since there is an error popping up: > > default at debian:~/bro$ ./build/src/bro > fatal error: can't find base/init-bare.bro > > If I do the make install and then call bro from /usr/local/bro/bin/bro it > works fine. > > What I should do to execute bro from the build directory? 
source build/bro-path-dev.sh should set all required environment variables. Johanna From agarciaillera at gmail.com Thu Jan 19 13:01:53 2017 From: agarciaillera at gmail.com (Alberto Garcia) Date: Thu, 19 Jan 2017 15:01:53 -0600 Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro In-Reply-To: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> References: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> Message-ID: Easy one! thanks! On Thu, Jan 19, 2017 at 2:46 PM, Johanna Amann wrote: > On Thu, Jan 19, 2017 at 11:19:20AM -0600, Alberto Garcia wrote: > > Hi, > > > > I've compiled bro from source to do some debugging. Once compiled I can't > > run bro since there is an error popping up: > > > > default at debian:~/bro$ ./build/src/bro > > fatal error: can't find base/init-bare.bro > > > > If I do the make install and then call bro from /usr/local/bro/bin/bro it > > works fine. > > > > What I should do to execute bro from the build directory? > > source build/bro-path-dev.sh > > should set all required environment variables. > > Johanna > -- Alberto Garc?a Illera GPG Public Key: https://goo.gl/twKUUv -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170119/470d113b/attachment.html From agarciaillera at gmail.com Thu Jan 19 14:01:10 2017 From: agarciaillera at gmail.com (Alberto Garcia) Date: Thu, 19 Jan 2017 16:01:10 -0600 Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro In-Reply-To: References: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> Message-ID: It actually works if i'm in a shell but not if I'm debugging it with dbg. It still shows the same error when executing main.cc:759 --> add_input_file("base/init-bare.bro"); Is there anything I need to execute from within dbg to set the environment variables while using gdb? Thank you again On Thu, Jan 19, 2017 at 3:01 PM, Alberto Garcia wrote: > Easy one! 
> > thanks! > > On Thu, Jan 19, 2017 at 2:46 PM, Johanna Amann wrote: > >> On Thu, Jan 19, 2017 at 11:19:20AM -0600, Alberto Garcia wrote: >> > Hi, >> > >> > I've compiled bro from source to do some debugging. Once compiled I >> can't >> > run bro since there is an error popping up: >> > >> > default at debian:~/bro$ ./build/src/bro >> > fatal error: can't find base/init-bare.bro >> > >> > If I do the make install and then call bro from /usr/local/bro/bin/bro >> it >> > works fine. >> > >> > What I should do to execute bro from the build directory? >> >> source build/bro-path-dev.sh >> >> should set all required environment variables. >> >> Johanna >> > > > > -- > Alberto Garc?a Illera > > GPG Public Key: https://goo.gl/twKUUv > -- Alberto Garc?a Illera GPG Public Key: https://goo.gl/twKUUv -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170119/0410ab45/attachment.html From agarciaillera at gmail.com Thu Jan 19 14:32:09 2017 From: agarciaillera at gmail.com (Alberto Garcia) Date: Thu, 19 Jan 2017 16:32:09 -0600 Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro In-Reply-To: References: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> Message-ID: I've tried running: set exec-wrapper bash -c 'source /home/default/bro/build/bro-path-dev.sh' within GDB but it didn't work either. On Thu, Jan 19, 2017 at 4:01 PM, Alberto Garcia wrote: > It actually works if i'm in a shell but not if I'm debugging it with dbg. > It still shows the same error when executing main.cc:759 --> > add_input_file("base/init-bare.bro"); > > Is there anything I need to execute from within dbg to set the environment > variables while using gdb? > > Thank you again > > On Thu, Jan 19, 2017 at 3:01 PM, Alberto Garcia > wrote: > >> Easy one! >> >> thanks! 
>> >> On Thu, Jan 19, 2017 at 2:46 PM, Johanna Amann wrote: >> >>> On Thu, Jan 19, 2017 at 11:19:20AM -0600, Alberto Garcia wrote: >>> > Hi, >>> > >>> > I've compiled bro from source to do some debugging. Once compiled I >>> can't >>> > run bro since there is an error popping up: >>> > >>> > default at debian:~/bro$ ./build/src/bro >>> > fatal error: can't find base/init-bare.bro >>> > >>> > If I do the make install and then call bro from /usr/local/bro/bin/bro >>> it >>> > works fine. >>> > >>> > What I should do to execute bro from the build directory? >>> >>> source build/bro-path-dev.sh >>> >>> should set all required environment variables. >>> >>> Johanna >>> >> >> >> >> -- >> Alberto Garc?a Illera >> >> GPG Public Key: https://goo.gl/twKUUv >> > > > > -- > Alberto Garc?a Illera > > GPG Public Key: https://goo.gl/twKUUv > -- Alberto Garc?a Illera GPG Public Key: https://goo.gl/twKUUv -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170119/9a87d011/attachment.html From jazoff at illinois.edu Thu Jan 19 14:40:45 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Thu, 19 Jan 2017 22:40:45 +0000 Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro In-Reply-To: References: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> Message-ID: <5937C78F-3016-4D11-B0EF-E203F33EB862@illinois.edu> > On Jan 19, 2017, at 5:32 PM, Alberto Garcia wrote: > > I've tried running: > > set exec-wrapper bash -c 'source /home/default/bro/build/bro-path-dev.sh' > within GDB but it didn't work either. > > On Thu, Jan 19, 2017 at 4:01 PM, Alberto Garcia wrote: > It actually works if i'm in a shell but not if I'm debugging it with dbg. > It still shows the same error when executing main.cc:759 --> add_input_file("base/init-bare.bro"); > > Is there anything I need to execute from within dbg to set the environment variables while using gdb? no. 
How exactly are you running gdb?

source build/bro-path-dev.sh
gdb `which bro`

should work just fine.

--
- Justin Azoff

From agarciaillera at gmail.com Thu Jan 19 14:47:36 2017
From: agarciaillera at gmail.com (Alberto Garcia)
Date: Thu, 19 Jan 2017 16:47:36 -0600
Subject: [Bro-Dev] fatal error: can't find base/init-bare.bro
In-Reply-To: <5937C78F-3016-4D11-B0EF-E203F33EB862@illinois.edu>
References: <20170119204628.hmdwyy4iwsgagld4@wifi218.sys.ICSI.Berkeley.EDU> <5937C78F-3016-4D11-B0EF-E203F33EB862@illinois.edu>
Message-ID:

I'm doing:

gdb --interpreter mi --args "/home/default/bro/build/src/bro" -r /tmp/arp_l2tpv3.cap

I'm using VisualGDB (http://visualgdb.com/?features=linux) to debug it from Visual Studio, as I've done with tons of other projects. It just executes the gdb commands over SSH, so it's basically the same as executing them from an SSH session.

On Thu, Jan 19, 2017 at 4:40 PM, Azoff, Justin S wrote:

>
> > On Jan 19, 2017, at 5:32 PM, Alberto Garcia wrote:
> >
> > I've tried running:
> >
> > set exec-wrapper bash -c 'source /home/default/bro/build/bro-path-dev.sh'
> > within GDB but it didn't work either.
> >
> > On Thu, Jan 19, 2017 at 4:01 PM, Alberto Garcia wrote:
> > It actually works if i'm in a shell but not if I'm debugging it with dbg.
> > It still shows the same error when executing main.cc:759 --> add_input_file("base/init-bare.bro");
> >
> > Is there anything I need to execute from within dbg to set the environment variables while using gdb?
>
> no. How exactly are you running gdb?
>
> source build/bro-path-dev.sh
> gdb `which bro`
>
> should work just fine.
>
> --
> - Justin Azoff
>
>

--
Alberto García Illera

GPG Public Key: https://goo.gl/twKUUv
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.icsi.berkeley.edu/pipermail/bro-dev/attachments/20170119/a1dd3c2c/attachment.html

From asharma at lbl.gov Thu Jan 19 17:54:33 2017
From: asharma at lbl.gov (Aashish Sharma)
Date: Thu, 19 Jan 2017 17:54:33 -0800
Subject: [Bro-Dev] help Reading the backtrace
In-Reply-To: <20170119175540.GE45500@icir.org>
References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local> <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com> <20170119174407.GA28308@mac-822.local> <20170119175540.GE45500@icir.org>
Message-ID: <20170120015432.GM86567@mac-822.local>

So this doesn't (at the moment) seem to be related to table expiration. My table is maintained on the manager and expire_func only runs on the manager.

But I see 'a' worker stall at 99-100% CPU for a good while, while all other workers go down to 5-6% CPU.

conn.log continues to grow, though.

GDB points to Get() in install/bro-2.5/src/threading/Queue.h:

template<typename T>
inline T Queue<T>::Get()
    {
    safe_lock(&mutex[read_ptr]);

    int old_read_ptr = read_ptr;

    if ( messages[read_ptr].empty() && ! ((reader && reader->Killed()) || (writer && writer->Killed())) )
        {
        struct timespec ts;
        ts.tv_sec = time(0) + 5;
        ts.tv_nsec = 0;

----->  pthread_cond_timedwait(&has_data[read_ptr], &mutex[read_ptr], &ts);

On a side note: well, why is this on bro-dev? Not entirely sure. :) I think eventually this might go into what my script is messing up and what's a better way to script the code, I suppose.

Aashish

On Thu, Jan 19, 2017 at 09:55:40AM -0800, Robin Sommer wrote:
>
>
> On Thu, Jan 19, 2017 at 09:44 -0800, you wrote:
>
> > Still, to clearify, there might be a possibility that because at
> > present table_incremental_step=5000, somehow expiring >> 5000 entries
> > continiously every moment might cause cause Queue to deadlock
> > resulting in BRO to stop packets processing ?
>
> It shouldn't deadlock.
What I can see happening, depending on load and > these parameters, is Bro spending most of its time going through the > table to expire entries and only getting to few packets in between (so > not complete stop of processing, but not getting much done either) > > Robin > > -- > Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From robin at icir.org Fri Jan 20 08:01:32 2017 From: robin at icir.org (Robin Sommer) Date: Fri, 20 Jan 2017 08:01:32 -0800 Subject: [Bro-Dev] help Reading the backtrace In-Reply-To: <20170120015432.GM86567@mac-822.local> References: <20170118172912.GB92684@mac-822.local> <703ccb53-c8e8-c47c-6d10-764a47ced599@gmail.com> <20170118184037.GC92684@mac-822.local> <80444C79-3514-404A-BB04-0FF39E3E3108@illinois.edu> <976360f6-7a96-6f52-80b3-7bfc1570715f@gmail.com> <20170119174407.GA28308@mac-822.local> <20170119175540.GE45500@icir.org> <20170120015432.GM86567@mac-822.local> Message-ID: <20170120160132.GO45500@icir.org> On Thu, Jan 19, 2017 at 17:54 -0800, you wrote: > -----> pthread_cond_timedwait(&has_data[read_ptr], &mutex[read_ptr], &ts); Just be sure: are you sure this is the troublesome spot? I'm asking because this is likely running inside a logging thread, and expected to block frequently if there's nothing to log (remember we have one logging thread per output file, so for any low-volume log it'll block regularly). Have you tried switching to other threads in GDB to see where they are at? Also, at that location in gdb above, try to figure out what the queue is for: you should be able to get the name of the thread through 'this->reader->name' (haven't tried that, just took a quick look at the code). 
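The suggestion to look at the other threads can be scripted. The sketch below only builds, and optionally runs, a gdb command line from Python. `gdb -p`, `-batch`, `-ex`, `info threads` and `thread apply all bt` are standard gdb features, while the helper names `gdb_thread_dump_argv` and `dump_threads` are made up for illustration.

```python
import subprocess


def gdb_thread_dump_argv(pid):
    """Return a gdb command line that lists threads and dumps all backtraces."""
    return [
        "gdb", "-p", str(pid), "-batch",
        "-ex", "info threads",          # one line per thread
        "-ex", "thread apply all bt",   # backtrace of every thread
    ]


def dump_threads(pid):
    """Attach to `pid`, collect the thread dump as text, and detach again."""
    completed = subprocess.run(
        gdb_thread_dump_argv(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return completed.stdout
```

Running `dump_threads` against the stalled worker's process ID should help tell whether the busy thread is the main packet-processing thread or one of the logging threads that is merely blocked in Queue::Get().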
Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From robin at icir.org Fri Jan 20 08:06:16 2017 From: robin at icir.org (Robin Sommer) Date: Fri, 20 Jan 2017 08:06:16 -0800 Subject: [Bro-Dev] Testing and Docs for Packages In-Reply-To: References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu> <20170116204729.5aenjl3ts2idhmi6@Beezling.local> <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> <20170118162808.GD80113@icir.org> Message-ID: <20170120160616.GP45500@icir.org> On Thu, Jan 19, 2017 at 17:38 +0000, you wrote: > 1) separate testing environment for each package > 2) single testing environment for all packages > Neither seems great. I guess I plan to do (1) since it is easier on > package authors and less likely to waste users' time looking into > harmless test failures Yeah, (1) makes the most sense to me too. Otherwise the author of the package, when writing tests, would have to shoot for a moving target that he doesn't control. I think we have to accept that tests won't be able to cover every possible Bro installation; they are a first line of defense for making sure nothing is fundamentally broken. Said differently, the tests generally cannot answer the question "will this bro-pkg operation break my setup"; what they answer is "is this package ok?".
Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From johanna at icir.org Fri Jan 20 08:54:30 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 20 Jan 2017 08:54:30 -0800 Subject: [Bro-Dev] Testing and Docs for Packages In-Reply-To: <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> References: <81C2CC67-91AB-4674-A633-5E5658CEDA97@illinois.edu> <20170116204729.5aenjl3ts2idhmi6@Beezling.local> <59FDA987-0A43-4491-914B-2F30BE41E551@illinois.edu> Message-ID: <20170120165420.knmbmayelxcfhyjj@wifi218.sys.ICSI.Berkeley.EDU> On Tue, Jan 17, 2017 at 04:01:19AM +0000, Siwek, Jon wrote: > > I actually think it would be neat to do this isolated, especially given > > that this enables testing before installing. > > Not sure I follow. Can you explain further? Sorry - what I meant is that the tests can run before the packages are put into the bro directory, so you can see if they will work with the installed Bro version (or potentially system configuration) before putting the files in. So you can use it as a prerequisite check for installation. The other way round, you have to roll back after already putting them there - unless I misunderstood something. Plus - even in this case, shouldn't you be able to load the user scripts by loading local.bro? Meaning we could even run all the tests twice - once just with the default Bro installation, and once with the user changes, both before installing the scripts, which could even give an indication if Bro or other packages are at fault. > > It also makes it easier to create something like "smokers" (Bro > > installations that just try to run all testsuites of all available > > packages with a newer version to see if something went wrong). > > Can you also go into more detail on what you're thinking there?
> > If there's concerns about accidentally corrupting an existing/production > bro installation, the alternative I'd suggest would be to set up a > separate bro-pkg config file for the smoke tests that would have bro-pkg > install stuff in an isolated location. This allows users to explicitly > define the testing sandbox for themselves. No, the idea would be more along the lines that, in this case, you might actually never want to really install the package; you just want to see if the tests can pass. Though, admittedly, this can once again be accomplished by just immediately uninstalling afterwards. Johanna From jsiwek at illinois.edu Tue Jan 24 18:23:57 2017 From: jsiwek at illinois.edu (Siwek, Jon) Date: Wed, 25 Jan 2017 02:23:57 +0000 Subject: [Bro-Dev] bro-pkg 1.0 available Message-ID: bro-pkg 1.0 is now out and supports * package unit testing [1] * package dependencies [2] I have no remaining major features planned, hence the 1.0. Hope it works well for everyone. - Jon [1] http://bro-package-manager.readthedocs.io/en/stable/package.html#test-command [2] http://bro-package-manager.readthedocs.io/en/stable/package.html#depends From robin at icir.org Wed Jan 25 08:05:35 2017 From: robin at icir.org (Robin Sommer) Date: Wed, 25 Jan 2017 08:05:35 -0800 Subject: [Bro-Dev] btest test failure (Re: Build failed in Jenkins: BTestUnitTests) #2386 In-Reply-To: <1120874934.3.1485342448733.JavaMail.jenkins@brotestbed.ncsa.illinois.edu> References: <1120874934.3.1485342448733.JavaMail.jenkins@brotestbed.ncsa.illinois.edu> Message-ID: <20170125160534.GA57768@icir.org> On Wed, Jan 25, 2017 at 05:07 -0600, jenkins at brotestbed.ncsa.illinois.edu wrote: > tests.progress ... failed This new test is failing on some Jenkins nodes, but I cannot reproduce it locally (tried on Linux and Mac). Is anybody else seeing this?
Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From dnthayer at illinois.edu Wed Jan 25 08:23:54 2017 From: dnthayer at illinois.edu (Daniel Thayer) Date: Wed, 25 Jan 2017 10:23:54 -0600 Subject: [Bro-Dev] btest test failure (Re: Build failed in Jenkins: BTestUnitTests) #2386 In-Reply-To: <20170125160534.GA57768@icir.org> References: <1120874934.3.1485342448733.JavaMail.jenkins@brotestbed.ncsa.illinois.edu> <20170125160534.GA57768@icir.org> Message-ID: <3b5f4f2f-f9cd-fcf8-4f65-fb9188f86aea@illinois.edu> On 1/25/17 10:05 AM, Robin Sommer wrote: > > > On Wed, Jan 25, 2017 at 05:07 -0600, jenkins at brotestbed.ncsa.illinois.edu wrote: > >> tests.progress ... failed > > This new test is failing on some Jenkins nodes, but I cannot reproduce > it locally (tried on Linux and Mac). Is anybody else seeing this? > > Robin The test fails only on FreeBSD. If you remove the path to bash in the test file, then the test works on FreeBSD. -Daniel From robin at icir.org Wed Jan 25 08:29:18 2017 From: robin at icir.org (Robin Sommer) Date: Wed, 25 Jan 2017 08:29:18 -0800 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: References: Message-ID: <20170125162918.GD68079@icir.org> On Wed, Jan 25, 2017 at 02:23 +0000, you wrote: > * package unit testing [1] > * package dependencies [2] Great job, Jon! One of the next steps now is moving our bro-plugins into packages, I'll talk to the maintainers to get that started. Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From johanna at icir.org Fri Jan 27 11:10:46 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 27 Jan 2017 11:10:46 -0800 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: References: Message-ID: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> Hi Jon, On Wed, Jan 25, 2017 at 02:23:57AM +0000, Siwek, Jon wrote: > bro-pkg 1.0 is now out and supports > > * package unit testing [1] thanks for this. 
Are there any extra steps that one has to do for this to work? I tried to activate it for my repository at https://github.com/0xxon/bro-sumstats-counttable, where the bro-pkg.meta specifies test_command = cd testing && btest -d However, bro-pkg (version 1.0) seems to just ignore this: $ bro-pkg install bro-sumstats-counttable The following packages will be INSTALLED: bro/0xxon/bro-sumstats-counttable (0.0.2) Proceed? [Y/n] y Running unit tests for "bro/0xxon/bro-sumstats-counttable" error: failed to run tests for bro/0xxon/bro-sumstats-counttable: Package does not specify a test_command Proceed to install anyway? [Y/n] n Am I doing something wrong here? Or is there a problem with the way that I specify test_command? (The error message seems to indicate that it is just not being identified though). Johanna From johanna at icir.org Fri Jan 27 11:14:18 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 27 Jan 2017 11:14:18 -0800 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> References: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> Message-ID: <20170127191418.ztrkjjlbnnszdzhc@wifi222.sys.ICSI.Berkeley.EDU> Ah, and if you remember to specify --version master, things suddenly look much better - ignore this :) Johanna On Fri, Jan 27, 2017 at 11:10:46AM -0800, Johanna Amann wrote: > Hi Jon, > > On Wed, Jan 25, 2017 at 02:23:57AM +0000, Siwek, Jon wrote: > > bro-pkg 1.0 is now out and supports > > > > * package unit testing [1] > > thanks for this. Are there any extra steps that one has to do for this to > work? 
I tried to activate it for my repository at > https://github.com/0xxon/bro-sumstats-counttable, where the bro-pkg.meta > specifies > > test_command = cd testing && btest -d > > However, bro-pkg (version 1.0) seems to just ignore this: > > $ bro-pkg install bro-sumstats-counttable > The following packages will be INSTALLED: > bro/0xxon/bro-sumstats-counttable (0.0.2) > > Proceed? [Y/n] y > Running unit tests for "bro/0xxon/bro-sumstats-counttable" > error: failed to run tests for bro/0xxon/bro-sumstats-counttable: > Package does not specify a test_command > Proceed to install anyway? [Y/n] n > > Am I doing something wrong here? Or is there a problem with the way that I > specify test_command? (The error message seems to indicate that it is just > not being identified though). > > Johanna > _______________________________________________ > bro-dev mailing list > bro-dev at bro.org > http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev > From johanna at icir.org Fri Jan 27 11:34:16 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 27 Jan 2017 11:34:16 -0800 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: <20170127191418.ztrkjjlbnnszdzhc@wifi222.sys.ICSI.Berkeley.EDU> References: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> <20170127191418.ztrkjjlbnnszdzhc@wifi222.sys.ICSI.Berkeley.EDU> Message-ID: <20170127193416.wqjzoeertryxvb4m@wifi222.sys.ICSI.Berkeley.EDU> And - second followup - this time I think I am doing things right this time. On os-x, when trying to install using bro-pkg, I get the following output: $ bro-pkg install bro-sumstats-counttable --version master The following packages will be INSTALLED: bro/0xxon/bro-sumstats-counttable (master) Proceed? 
[Y/n] y Running unit tests for "bro/0xxon/bro-sumstats-counttable" Traceback (most recent call last): File "/Users/johanna/venv/bin/bro-pkg", line 1635, in <module> main() File "/Users/johanna/venv/bin/bro-pkg", line 1631, in main args.run_cmd(manager, args, config) File "/Users/johanna/venv/bin/bro-pkg", line 314, in cmd_install error, passed, test_dir = manager.test(name, version) File "/Users/johanna/venv/lib/python2.7/site-packages/bropkg/manager.py", line 1622, in test bropath = os.path.dirname(stage_script_dir) + ':' + bropath TypeError: coercing to Unicode: need string or buffer, NoneType found The same happens with your bro-test-package. Is there anything obvious that I am doing wrong? Johanna On Fri, Jan 27, 2017 at 11:14:18AM -0800, Johanna Amann wrote: > Ah, and if you remember to specify --version master, things suddenly look > much better - ignore this :) > > Johanna > > On Fri, Jan 27, 2017 at 11:10:46AM -0800, Johanna Amann wrote: > > Hi Jon, > > > > On Wed, Jan 25, 2017 at 02:23:57AM +0000, Siwek, Jon wrote: > > > bro-pkg 1.0 is now out and supports > > > > > > * package unit testing [1] > > > > thanks for this. Are there any extra steps that one has to do for this to > > work? I tried to activate it for my repository at > > https://github.com/0xxon/bro-sumstats-counttable, where the bro-pkg.meta > > specifies > > > > test_command = cd testing && btest -d > > > > However, bro-pkg (version 1.0) seems to just ignore this: > > > > $ bro-pkg install bro-sumstats-counttable > > The following packages will be INSTALLED: > > bro/0xxon/bro-sumstats-counttable (0.0.2) > > > > Proceed? [Y/n] y > > Running unit tests for "bro/0xxon/bro-sumstats-counttable" > > error: failed to run tests for bro/0xxon/bro-sumstats-counttable: > > Package does not specify a test_command > > Proceed to install anyway? [Y/n] n > > > > Am I doing something wrong here? Or is there a problem with the way that I > > specify test_command?
(The error message seems to indicate that it is just > > not being identified though). > > > > Johanna From johanna at icir.org Fri Jan 27 11:47:03 2017 From: johanna at icir.org (Johanna Amann) Date: Fri, 27 Jan 2017 11:47:03 -0800 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: <20170127193416.wqjzoeertryxvb4m@wifi222.sys.ICSI.Berkeley.EDU> References: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> <20170127191418.ztrkjjlbnnszdzhc@wifi222.sys.ICSI.Berkeley.EDU> <20170127193416.wqjzoeertryxvb4m@wifi222.sys.ICSI.Berkeley.EDU> Message-ID: <0CC6479E-7C93-443B-B486-6E418EDB68A1@icir.org> And as a followup - this happens because Bro was not in the path. This really should give a nicer error message though (or abort before even trying to install). Johanna On 27 Jan 2017, at 11:34, Johanna Amann wrote: > And - second followup - this time I think I am doing things right this > time. > > On os-x, when trying to install using bro-pkg, I get the following > output: > > $ bro-pkg install bro-sumstats-counttable --version master > The following packages will be INSTALLED: > bro/0xxon/bro-sumstats-counttable (master) > > Proceed? [Y/n] y > Running unit tests for "bro/0xxon/bro-sumstats-counttable" > Traceback (most recent call last): > File "/Users/johanna/venv/bin/bro-pkg", line 1635, in <module> > main() > File "/Users/johanna/venv/bin/bro-pkg", line 1631, in main > args.run_cmd(manager, args, config) > File "/Users/johanna/venv/bin/bro-pkg", line 314, in cmd_install > error, passed, test_dir = manager.test(name, version) > File > "/Users/johanna/venv/lib/python2.7/site-packages/bropkg/manager.py", > line 1622, in test > bropath = os.path.dirname(stage_script_dir) + ':' + bropath > TypeError: coercing to Unicode: need string or buffer, NoneType found > > The same happens with your bro-test-package.
> > Is there anything obvious that I am doing wrong? > > Johanna > > On Fri, Jan 27, 2017 at 11:14:18AM -0800, Johanna Amann wrote: >> Ah, and if you remember to specify --version master, things suddenly >> look >> much better - ignore this :) >> >> Johanna >> >> On Fri, Jan 27, 2017 at 11:10:46AM -0800, Johanna Amann wrote: >>> Hi Jon, >>> >>> On Wed, Jan 25, 2017 at 02:23:57AM +0000, Siwek, Jon wrote: >>>> bro-pkg 1.0 is now out and supports >>>> >>>> * package unit testing [1] >>> >>> thanks for this. Are there any extra steps that one has to do for >>> this to >>> work? I tried to activate it for my repository at >>> https://github.com/0xxon/bro-sumstats-counttable, where the >>> bro-pkg.meta >>> specifies >>> >>> test_command = cd testing && btest -d >>> >>> However, bro-pkg (version 1.0) seems to just ignore this: >>> >>> $ bro-pkg install bro-sumstats-counttable >>> The following packages will be INSTALLED: >>> bro/0xxon/bro-sumstats-counttable (0.0.2) >>> >>> Proceed? [Y/n] y >>> Running unit tests for "bro/0xxon/bro-sumstats-counttable" >>> error: failed to run tests for bro/0xxon/bro-sumstats-counttable: >>> Package does not specify a test_command >>> Proceed to install anyway? [Y/n] n >>> >>> Am I doing something wrong here? Or is there a problem with the way >>> that I >>> specify test_command? (The error message seems to indicate that it >>> is just >>> not being identified though). 
>>> >>> Johanna From jan.grashoefer at gmail.com Sun Jan 29 14:07:48 2017 From: jan.grashoefer at gmail.com (=?UTF-8?Q?Jan_Grash=c3=b6fer?=) Date: Sun, 29 Jan 2017 23:07:48 +0100 Subject: [Bro-Dev] bro-pkg 1.0 available In-Reply-To: <0CC6479E-7C93-443B-B486-6E418EDB68A1@icir.org> References: <20170127191046.kt2ckxieykhaeoxc@wifi222.sys.ICSI.Berkeley.EDU> <20170127191418.ztrkjjlbnnszdzhc@wifi222.sys.ICSI.Berkeley.EDU> <20170127193416.wqjzoeertryxvb4m@wifi222.sys.ICSI.Berkeley.EDU> <0CC6479E-7C93-443B-B486-6E418EDB68A1@icir.org> Message-ID: And another small thing: As build_command and test_command may contain anything, another warning would be good in case the package metadata specifies a command. Jan From robin at icir.org Tue Jan 31 14:41:45 2017 From: robin at icir.org (Robin Sommer) Date: Tue, 31 Jan 2017 14:41:45 -0800 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) Message-ID: <20170131224145.GH77787@icir.org> Taking this from the ticket to the mailing list as I'm looking for some input. https://bro-tracker.atlassian.net/browse/BIT-1784 says: > The change from the older communication code is that > RemoteSerializer::ProcessLogWrite used to do > > success = log_mgr->Write(id_val, writer_val, path, num_fields, val); > > Where bro_broker::Manager::Process uses > > log_mgr->Write(stream_id->AsEnumVal(), columns->AsRecordVal()); The fact that RemoteSerializer and broker::Manager are calling different Write() functions seems to be a broader issue: we get different semantics that way.
For RemoteSerializer, log events and log filters run only on the originating nodes; those guys make all decisions about what's getting logged exactly and they then send that on to the manager, which just writes out the data it receives. With Broker, however, both events and filters run (also) on the manager, so that it's making its own decisions on what to record. The filters can be different on the manager, and they will have access to different state. I'm not sure what approach is better actually, I think the Broker semantics can be both helpful and harmful, depending on use case. In any case, it's a change in semantics compared to the old communication system, and I'm not sure we want that. I'm wondering if there's a reason that in the Broker case things *have* to be this way. Is there something that prevents the Broker manager from doing the same as the RemoteSerializer? Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin From jazoff at illinois.edu Tue Jan 31 15:41:19 2017 From: jazoff at illinois.edu (Azoff, Justin S) Date: Tue, 31 Jan 2017 23:41:19 +0000 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <20170131224145.GH77787@icir.org> References: <20170131224145.GH77787@icir.org> Message-ID: <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> > On Jan 31, 2017, at 5:41 PM, Robin Sommer wrote: > ... > The fact that RemoteSerializer and broker::Manager are calling > different Write() functions seems to be a broader issue: we get > different semantics that way. For RemoteSerializer, log events and log > filters run only on the originating nodes; those guys make all > decisions about what's getting logged exactly and they then send that > on to the manager, which just writes out the data it receives. With > Broker, however, both events and filters run (also) on the manager, so > that it's making its own decisions on what to record.
The filters can > be different on the manager, and they will have access to different > state. > > I'm not sure what approach is better actually, I think the Broker > semantics can be both helpful and harmful, depending on use case. In > any case, it's a change in semantics compared to the old > communication system, and I'm not sure we want that. I think we want the old behavior for 2 reasons: 1. The workers only send the &log fields to the managers, so the events are raised with half the fields missing. 2. Having the logger node be as much of a dumb byte mover as possible is best for performance reasons. Having the log events and log filters run on the workers lets that functionality scale out across the nodes. Especially if a filter is used that would remove a large percent of the entries. If someone really wanted the log_* events to run on the manager, they could redef Cluster::worker2manager_events right? > I'm wondering if there's a reason that in the Broker case things > *have* to be this way. Is there something that prevents the Broker > manager from doing the same as the RemoteSerializer? > Jon would know best, but I'd guess one form was more convenient to use than the other and it may have been assumed that they both did the same thing. -- - Justin Azoff From commike at reservoir.com Tue Jan 31 16:06:30 2017 From: commike at reservoir.com (Alan Commike) Date: Wed, 01 Feb 2017 00:06:30 +0000 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> References: <20170131224145.GH77787@icir.org> <2D68E22E-5DA6-40E5-A917-7D5E146DF1AE@illinois.edu> Message-ID: On Tue, Jan 31, 2017 at 3:48 PM Azoff, Justin S wrote: > > 2. Having the logger node be as much of a dumb byte mover as possible is > best for performance reasons. Having the log events and log filters run on > the workers lets that functionality scale out across the nodes.
Especially > if a filter is used that would remove a large percent of the entries. > This. Especially over time as we see more and more cores per processor, it's best to distribute the processing load. By putting the filter in the logger, the logger will then need to enter the interpreter for each log message to determine if it needs to throw away data it just received. That's expensive and limits scalability on multiple fronts. ...alan > > If someone really wanted the log_* events to run on the manager, they > could redef Cluster::worker2manager_events right? > > > I'm wondering if there's a reason that in the Broker case things > > *have* to be this way. Is there something that prevents the Broker > > manager from doing the same as the RemoteSerializer? > > > > Jon would know best, but I'd guess one form was more convenient to use > than the other and it may have been assumed that they both did the same > thing. > > > -- > - Justin Azoff From vallentin at icir.org Tue Jan 31 18:44:50 2017 From: vallentin at icir.org (Matthias Vallentin) Date: Tue, 31 Jan 2017 18:44:50 -0800 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <20170131224145.GH77787@icir.org> References: <20170131224145.GH77787@icir.org> Message-ID: <20170201024450.GM75260@shogun.local> > I'm wondering if there's a reason that in the Broker case things > *have* to be this way. Is there something that prevents the Broker > manager from doing the same as the RemoteSerializer? Some background: when Broker sends to a log topic, the message has the structure of a pair (id, (x, y, z, ..)) where id is an enum with the log stream name and (x, y, z, ...)
a record of log columns. Therefore, in broker::Manager::Process() where messages are parsed and dispatched, the log messages go into logging::Manager::Write(EnumVal*, RecordVal*). Such messages get created via broker::Manager::Log(EnumVal*, RecordVal*, RecordType*). The only caller of this function is logging::Manager. Purely from an API perspective, could we just move the call from one Write() function to the other? Matthias From robin at icir.org Tue Jan 31 20:05:02 2017 From: robin at icir.org (Robin Sommer) Date: Tue, 31 Jan 2017 20:05:02 -0800 Subject: [Bro-Dev] Broker's remote logging (BIT-1784) In-Reply-To: <20170201024450.GM75260@shogun.local> References: <20170131224145.GH77787@icir.org> <20170201024450.GM75260@shogun.local> Message-ID: <20170201040502.GB783@icir.org> On Tue, Jan 31, 2017 at 18:44 -0800, you wrote: > Purely from an API perspective, could we just move the call from one > Write() function to the other? I think the answer is yes, but I've looked a bit more at the code and I think I see where the challenge is: that 2nd Write() method (the one the RemoteSerializer is using to output already filtered logs) takes threading::Values, not Vals. That means switching over from one Write() to the other isn't straightforward because we don't have code that sends threading::Values over the Broker connection. We could convert the Vals into threading::Values once received, but that'd be kind of odd: I'm pretty sure the distinction was due to threading::Values being quite a bit more efficient to send. It should be pretty straightforward to add the necessary threading::Value-to-Broker conversion code (just a bit tedious :). I'll look into that. Robin -- Robin Sommer * ICSI/LBNL * robin at icir.org * www.icir.org/robin
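The two remote-logging semantics discussed in this thread can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the function names, the record layout, and the filters are all hypothetical and do not correspond to Bro or Broker APIs. In the first model the originating node filters and the receiver writes what it gets; in the second the receiver applies its own filter as well.

```python
def worker_side_filtering(records, worker_filter):
    """RemoteSerializer-style model: the worker filters, the logger only writes."""
    sent = [r for r in records if worker_filter(r)]
    written = list(sent)  # the receiving node writes exactly what it was sent
    return sent, written


def logger_side_filtering(records, worker_filter, logger_filter):
    """Broker-style model as described above: the receiving node filters again."""
    sent = [r for r in records if worker_filter(r)]
    written = [r for r in sent if logger_filter(r)]  # extra per-record work on the receiver
    return sent, written


# Hypothetical log records and filters, for illustration only.
records = [
    {"uid": "C1", "service": "dns"},
    {"uid": "C2", "service": "http"},
    {"uid": "C3", "service": "ssl"},
]


def drop_dns(record):
    return record["service"] != "dns"


def only_http(record):
    return record["service"] == "http"


old_sent, old_written = worker_side_filtering(records, drop_dns)
new_sent, new_written = logger_side_filtering(records, drop_dns, only_http)

print([r["uid"] for r in old_written])
print([r["uid"] for r in new_written])
```

Both models send the same records over the wire, but in the second one the output depends on a filter that may differ from the worker's, and the receiving node has to evaluate it for every record, which is the scalability concern raised in the replies.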