[Bro-Dev] Final Broker branch testing

Azoff, Justin S jazoff at illinois.edu
Thu Apr 26 13:25:15 PDT 2018

> On Apr 26, 2018, at 11:16 AM, Jon Siwek <jsiwek at corelight.com> wrote:
> The latest version of the new Broker-ized cluster/communication system 
> for Bro in 'topic/actor-system' branch is wrapping up and, in my 
> opinion, ready to be merged into Bro's 'master' branch.
> Though, for this round of testing, I'd be most interested just in any 
> general stability issues or major feature breakages from a vanilla Bro 
> installation.  Mild performance issues, minor bugs, or other issues w/ 
> porting custom scripts are things I think we can iron out even after 
> merging into 'master'.
> - Jon

I threw this on our test cluster, and whatever the issue was where broken log rotation caused the logger
to buffer and then run out of memory (OOM) is fixed now. Logs have rotated twice without issue.

CPU usage is still higher, but I think it is just busy waiting like you suggested. Running perf top on a proxy shows:

   5.32%  [kernel]                  [k] system_call_after_swapgs
   3.48%  libcaf_core.so.0.15.7     [.] caf::scheduler::worker<caf::policy::work_stealing>::run
   3.12%  libc-2.17.so              [.] __GI___libc_nanosleep
   3.10%  [kernel]                  [k] sysret_check
   3.05%  libcaf_core.so.0.15.7     [.] caf::detail::double_ended_queue<caf::resumable>::take_head
   2.61%  [kernel]                  [k] __schedule
   2.20%  libc-2.17.so              [.] __sleep
   2.19%  [kernel]                  [k] timerqueue_add
   2.06%  [kernel]                  [k] __audit_syscall_exit
   1.89%  [kernel]                  [k] native_write_msr_safe
   1.85%  [kernel]                  [k] cpuacct_charge
   1.84%  [kernel]                  [k] __audit_syscall_entry
   1.74%  [kernel]                  [k] hrtimer_start_range_ns
   1.50%  libstdc++.so.6.0.19       [.] std::this_thread::__sleep_for
   1.40%  libc-2.17.so              [.] __libc_disable_asynccancel
   1.37%  [kernel]                  [k] _raw_spin_unlock_irqrestore
   1.37%  [kernel]                  [k] do_nanosleep
   1.25%  libc-2.17.so              [.] usleep
   1.22%  [kernel]                  [k] rb_insert_color
   1.20%  [kernel]                  [k] update_curr
   1.18%  [kernel]                  [k] idle_cpu
   1.14%  [kernel]                  [k] copy_user_generic_string
   1.09%  [kernel]                  [k] finish_task_switch
   1.07%  [kernel]                  [k] __x86_indirect_thunk_rax
   1.06%  [kernel]                  [k] ktime_get
   0.93%  [kernel]                  [k] native_sched_clock
   0.92%  [kernel]                  [k] sys_nanosleep

which seems almost entirely related to timers and sleeping.

Other than that, things are working great.  Cluster::publish_hrw is distributing data evenly across the proxies:

# for x in 1 2 3; do broctl print Scan::attacks proxy-$x|grep attempts= -c;done

# cat /bro/logs/current/notice.log |bro-cut note peer_descr|grep Scan::|cut -f 2|sort|uniq  -c
    454 proxy-1
    463 proxy-2
    417 proxy-3
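
For anyone not familiar with the idea behind Cluster::publish_hrw: HRW stands for highest random weight (rendezvous) hashing. Each key is scored against every node, and the node with the highest score owns the key. The sketch below is only an illustration of that technique in Python, not Bro's implementation. The SHA-256 hash, the hrw_owner helper, and the generated addresses are all assumptions made up for the example.

```python
import hashlib
from collections import Counter


def hrw_owner(key, nodes):
    """Return the node with the highest hash weight for this key.

    Illustrative only: Bro uses its own hash function and node IDs.
    """
    def weight(node):
        digest = hashlib.sha256(f"{node}|{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(nodes, key=weight)


proxies = ["proxy-1", "proxy-2", "proxy-3"]
# Hypothetical keys standing in for scanner addresses.
keys = [f"10.0.{i // 256}.{i % 256}" for i in range(3000)]

# Each key is assigned independently, so the load spreads roughly evenly.
before = {k: hrw_owner(k, proxies) for k in keys}
print(Counter(before.values()))

# Simulate losing proxy-2: only the keys it owned get a new owner.
survivors = [p for p in proxies if p != "proxy-2"]
after = {k: hrw_owner(k, survivors) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "keys moved, all previously on proxy-2:",
      all(before[k] == "proxy-2" for k in moved))
```

The second half shows the property that matters when a proxy dies: keys owned by the surviving nodes stay where they are, because their highest-scoring node is still in the list.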

Once this is stable for a bit, I'll start trying things like killing a proxy and verifying that things fail over.

Justin Azoff
