[Bro-Dev] [JIRA] (BIT-1353) BroCtl status/top take excessive amount of time
johanna at icir.org
Mon Mar 23 13:49:46 PDT 2015
On Mon, Mar 23, 2015 at 03:33:13PM -0500, Daniel Thayer wrote:
> I'm glad to hear that you're testing broctl on FreeBSD (I always
> test on Linux). Here are my initial ideas:
> How many hosts are in your cluster? (you mentioned "28 physical nodes",
> does that mean 28 computers?!)
It is 28 computers, each running 3 bro worker processes with 2 more
physical machines running the master and proxies.
> Are you running the git master version of broctl?
It is not quite master; it is currently running 5e2defe, i.e. the state as
of March 13th.
> Is every broctl command slow, or just status and top?
All the ones that I tried are slow. I can upgrade to master and test again
- I just wanted to ask if there is some way to debug what is going on
before restarting the cluster, since the problem took a few days to
manifest itself. Hence I probably will not be able to directly reproduce it.
> The broctl status command usually spends most of its time
> waiting for broccoli. I've added a new option that you
> can set in your etc/broctl.cfg file that will skip
> the broccoli code so that broctl status runs much faster.
> To enable this feature, make sure this line is in your
> broctl.cfg file:
> StatusCmdShowAll = 0
> (after you add this, broctl will say that you have to run
> either "install" or "deploy", but you don't actually
> need to for this particular broctl option).
I added this (without running install / deploy) and it is now faster,
but still takes a while. I examined spool/debug.log a bit and it actually
seems that a significant period of time is spent getting the process status.
The timeline currently looks like this:
23 Mar 11:53:05 [broctl] status
23 Mar 11:53:05 [broctl] Getting process status ...
23 Mar 11:53:05 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/check-pid 2513
[...] (many lines like this and many exit code lines)
23 Mar 11:54:07 [execute] blade15: exit code 0
23 Mar 11:54:07 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/cat-file /xa/bro/master/spool/worker-26-0/.startup
23 Mar 11:54:09 [execute] blade15: exit code 0
23 Mar 11:54:09 [events] broccoli: Control::peer_status_request() to node worker-26-0
23 Mar 11:54:29 [events] broccoli: Control::peer_status_response(1427136868.812806 [...]
-> status output
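To find where the time goes without reading debug.log by hand, something like
the following Python sketch could be used. It assumes the timestamp layout seen
in the excerpt above ("23 Mar 11:53:05 [tag] message") and English month
abbreviations; the log carries no year, so one is passed in. The names
parse_line, largest_gaps and SAMPLE are made up for illustration and are not
part of broctl.

```python
from datetime import datetime

def parse_line(line, year=2015):
    """Return (timestamp, message) for a debug.log style line, or None.

    Assumed layout: '<day> <Mon> <HH:MM:SS> <rest>'. The year is not in
    the log, so it is supplied by the caller.
    """
    parts = line.split(None, 3)
    if len(parts) < 4:
        return None
    try:
        ts = datetime.strptime(
            f"{year} {parts[0]} {parts[1]} {parts[2]}", "%Y %d %b %H:%M:%S"
        )
    except ValueError:
        # Not a timestamped line (e.g. an elision marker); skip it.
        return None
    return ts, parts[3].strip()

def largest_gaps(lines, top=3):
    """Return the `top` largest gaps as (seconds, message_before, message_after)."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    gaps = [
        ((t1 - t0).total_seconds(), m0, m1)
        for (t0, m0), (t1, m1) in zip(entries, entries[1:])
    ]
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]

# The excerpt from this message, used as sample input.
SAMPLE = [
    "23 Mar 11:53:05 [broctl] status",
    "23 Mar 11:53:05 [broctl] Getting process status ...",
    "23 Mar 11:53:05 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/check-pid 2513",
    "[...] (many lines like this and many exit code lines)",
    "23 Mar 11:54:07 [execute] blade15: exit code 0",
    "23 Mar 11:54:07 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/cat-file /xa/bro/master/spool/worker-26-0/.startup",
    "23 Mar 11:54:09 [execute] blade15: exit code 0",
    "23 Mar 11:54:09 [events] broccoli: Control::peer_status_request() to node worker-26-0",
    "23 Mar 11:54:29 [events] broccoli: Control::peer_status_response(1427136868.812806 [...]",
]

if __name__ == "__main__":
    for seconds, before, after in largest_gaps(SAMPLE, top=2):
        print(f"{seconds:6.0f}s  between: {before[:50]}")
        print(f"         and:     {after[:50]}")
```

On the excerpt this reports a 62 second span across the check-pid phase (which
covers the lines elided above, so it is elapsed time for the whole phase rather
than for a single command) and a 20 second wait between the broccoli request
and its response. On the full log the same function would point at individual
slow steps.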