[Zeek] Why does my logger keep crashing - bro version 2.6.3

Kayode Enwerem Kayode_Enwerem at ao.uscourts.gov
Fri Sep 27 09:27:19 PDT 2019


Setting up two loggers resolved the logger crashes, but packet loss on my workers is still high. Can someone help me reduce the dropped packets?

cat capture_loss.log
#separator \x09
#set_separator  ,
#empty_field    (empty)
#unset_field    -
#path   capture_loss
#open   2019-09-27-12-05-05
#fields ts      ts_delta        peer    gaps    acks    percent_lost
#types  time    interval        string  count   count   double
1569600304.774215       900.000013      worker-1-1      126463  3246542 3.895314
1569600304.783703       900.000064      worker-1-3      106904  4465333 2.394088
1569600304.785983       900.000212      worker-1-11     123729  3768503 3.28324
1569600304.802244       900.000098      worker-1-14     144154  3584013 4.022139
1569600304.823378       900.000095      worker-1-18     137507  3503583 3.924754
1569600304.892559       900.000470      worker-1-13     148904  3448544 4.31788
1569600305.010986       900.000030      worker-1-8      174213  3409819 5.109157
1569600305.938686       901.043465      worker-1-15     509268  1072199 47.497526
1569600304.806850       900.000047      worker-1-22     591232  1234893 47.877185
1569601204.762382       900.000786      worker-1-16     120086  4491072 2.673883
1569601204.774220       900.000005      worker-1-1      127257  3461349 3.676515
1569601204.802447       900.000203      worker-1-14     125481  3171663 3.956316
1569601204.884438       900.000029      worker-1-19     125037  3566663 3.505714
1569601204.891746       900.000015      worker-1-23     120553  3078889 3.915471
1569601205.110098       900.000139      worker-1-10     108016  3442813 3.137434
1569601205.938906       900.000220      worker-1-15     565536  1156759 48.8897
1569601218.120290       900.000047      worker-1-6      456312  753749  60.538986
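
For anyone reading along, here is a minimal Python sketch of one way to summarize a log like the one above. It assumes the tab-separated layout declared by the #separator and #fields header lines; the function name and the 10% threshold are illustrative and not part of Zeek. The sample rows are copied from the output above.

```python
def worst_loss_by_peer(lines, threshold=10.0):
    """Return {peer: highest percent_lost} for peers above the threshold."""
    fields = []
    worst = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Column names follow the "#fields" tag on the same line.
            fields = line.split("\t")[1:]
            continue
        if not line or line.startswith("#"):
            continue  # skip the other header lines and blank lines
        row = dict(zip(fields, line.split("\t")))
        pct = float(row["percent_lost"])
        if pct > threshold:
            worst[row["peer"]] = max(pct, worst.get(row["peer"], 0.0))
    return worst


# A few rows from the log above, re-joined with tabs.
sample = [
    "#fields\tts\tts_delta\tpeer\tgaps\tacks\tpercent_lost",
    "1569600304.774215\t900.000013\tworker-1-1\t126463\t3246542\t3.895314",
    "1569600305.938686\t901.043465\tworker-1-15\t509268\t1072199\t47.497526",
    "1569600304.806850\t900.000047\tworker-1-22\t591232\t1234893\t47.877185",
    "1569601205.938906\t900.000220\tworker-1-15\t565536\t1156759\t48.8897",
    "1569601218.120290\t900.000047\tworker-1-6\t456312\t753749\t60.538986",
]

for peer, pct in sorted(worst_loss_by_peer(sample).items()):
    print(f"{peer}: {pct:.1f}% lost")
```

On these rows it singles out worker-1-15, worker-1-22 and worker-1-6, while the remaining workers stay in the 2-5% range.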

Below are some of my settings:

I have 23 workers defined, each pinned to its own CPU.
[worker-1]
type=worker
host=localhost
interface=af_packet::ens2f0
lb_method=custom
#lb_method=pf_ring
lb_procs=23
pin_cpus=5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27
af_packet_fanout_id=25
af_packet_fanout_mode=AF_Packet::FANOUT_HASH
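
A side note on the pinning: the pin_cpus list spans all four NUMA nodes shown in the lscpu output quoted further down this thread. Whether that has anything to do with the drops is not established here. The Python sketch below only tabulates how the pinned CPUs are spread across nodes, with the node layout copied by hand from that lscpu output; the helper name is made up for illustration.

```python
# NUMA layout as reported by lscpu in the original message (copied by hand).
numa_nodes = {
    0: range(0, 16, 2),    # 0,2,4,...,14
    1: range(16, 32, 2),   # 16,18,...,30
    2: range(1, 16, 2),    # 1,3,...,15
    3: range(17, 32, 2),   # 17,19,...,31
}

# pin_cpus from the [worker-1] section above: 5 through 27 inclusive.
pin_cpus = list(range(5, 28))


def workers_per_node(cpus, nodes):
    """Count how many pinned CPUs fall on each NUMA node."""
    cpu_to_node = {cpu: node for node, members in nodes.items() for cpu in members}
    counts = {node: 0 for node in nodes}
    for cpu in cpus:
        counts[cpu_to_node[cpu]] += 1
    return counts


print(workers_per_node(pin_cpus, numa_nodes))  # {0: 5, 1: 6, 2: 6, 3: 6}
```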

Can someone assist me with this?

Thanks.


From: william de ping <bill.de.ping at gmail.com>
Sent: Wednesday, September 25, 2019 4:00 AM
To: Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov>
Cc: zeek at zeek.org
Subject: Re: [Zeek] Why does my logger keep crashing - bro version 2.6.3

Hi

Try using the None writer instead of the ASCII one.
In local.bro, add:
redef Log::default_writer=Log::WRITER_NONE;

If the logger instance still crashes, then the issue is not related to an I/O bottleneck.

B

On Tue, Sep 24, 2019 at 7:49 PM Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov> wrote:
Thanks for your response.

I do see the following OOM message in my system logs on the logger process ID:
Sep 23 18:48:00 kernel: Out of memory: Kill process 10439 (bro) score 787 or sacrifice child
Sep 23 18:48:00 kernel: Killed process 10439 (bro), UID 0, total-vm:301983900kB, anon-rss:195261772kB, file-rss:2592kB, shmem-rss:0kB

I wonder why it's taking so much memory. This server has 251G of RAM and 99G of swap.
              total        used        free      shared  buff/cache   available
Mem:           251G         66G        185G        4.2M        488M        184G
Swap:           99G        1.1G         98G
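
To put the OOM line in roughly the same units as the free output, here is a small Python sketch that converts the kB figures to GiB. It assumes the kernel's kB means 1024 bytes and that free is reporting binary units; the function name is illustrative only.

```python
import re

# The "Killed process" line from the system log above.
oom_line = (
    "Killed process 10439 (bro), UID 0, total-vm:301983900kB, "
    "anon-rss:195261772kB, file-rss:2592kB, shmem-rss:0kB"
)


def oom_sizes_gib(line):
    """Extract every 'name:<n>kB' field and convert it to GiB."""
    return {
        name: int(kb) / 1024**2
        for name, kb in re.findall(r"([\w-]+):(\d+)kB", line)
    }


for name, gib in oom_sizes_gib(oom_line).items():
    print(f"{name}: {gib:.1f} GiB")
```

By this arithmetic the logger's resident memory (anon-rss) was about 186 GiB when it was killed, against the 251G of RAM reported by free.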

Below is the output of "broctl diag logger", run after the logger crashed.

 [logger]

No core file found.

Bro 2.6.3
Linux 3.10.0-1062.1.1.el7.x86_64

Bro plugins:
Bro::AF_Packet - Packet acquisition via AF_Packet (dynamic, version 1.4)

==== No reporter.log

==== stderr.log
/usr/local/bro/share/broctl/scripts/run-bro: line 110: 10439 Killed                  nohup "$mybro" "$@"

==== stdout.log
max memory size         (kbytes, -m) unlimited
data seg size           (kbytes, -d) unlimited
virtual memory          (kbytes, -v) unlimited
core file size          (blocks, -c) unlimited

==== .cmdline
-U .status -p broctl -p broctl-live -p local -p logger local.bro broctl base/frameworks/cluster broctl/auto

==== .env_vars
PATH=/usr/local/bro/bin:/usr/local/bro/share/broctl/scripts:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bro/bin
BROPATH=/logs/bro/spool/installed-scripts-do-not-touch/site::/logs/bro/spool/installed-scripts-do-not-touch/auto:/usr/local/bro/share/bro:/usr/local/bro/share/bro/policy:/usr/local/bro/share/bro/site
CLUSTER_NODE=logger

==== .status
RUNNING [net_run]

==== No prof.log

==== No packet_filter.log

==== No loaded_scripts.log

Thoughts? Any suggestions?

-----Original Message-----
From: Vlad Grigorescu <vlad at es.net>
Sent: Monday, September 23, 2019 10:20 AM
To: Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov>
Cc: william de ping <bill.de.ping at gmail.com>; zeek at zeek.org
Subject: Re: [Zeek] Why does my logger keep crashing - bro version 2.6.3

The logger is threaded, so seeing CPU > 100% is not necessarily a problem.

Have you tried running "broctl diag logger" to see why the logger is crashing? Do you have any messages in your system logs about processes being killed for running out of memory (OOM)?

  --Vlad

On Mon, Sep 23, 2019 at 1:32 PM Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov> wrote:
>
> Thanks for your response. The CPU usage for the logger is at 311% (see below).
>
>
>
> broctl top
>
> Name         Type    Host             Pid     VSize  Rss  Cpu   Cmd
>
> logger       logger  localhost        22867    12G     9G 311%  bro
>
>
>
> I wasn't aware that you could set up multiple loggers. I checked the docs to see whether that was an option but could not find it. Does anyone know how to do this?
>
>
>
> From: william de ping <bill.de.ping at gmail.com>
> Sent: Sunday, September 22, 2019 6:42 AM
> To: Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov>
> Cc: zeek at zeek.org
> Subject: Re: [Zeek] Why does my logger keep crashing - bro version 2.6.3
>
>
>
> Hi,
>
>
>
> I would try to monitor the CPU and memory usage of the logger instance.
>
> Try running broctl top; my guess is that the logger process will show very high CPU usage.
>
>
>
> I know of an option to have multiple loggers but I am not sure how to set it up.
>
>
>
> Are you writing to a file?
>
>
>
> B
>
>
>
> On Thu, Sep 19, 2019 at 7:14 PM Kayode Enwerem <Kayode_Enwerem at ao.uscourts.gov> wrote:
>
> Hello,
>
>
>
> Why does my logger keep crashing? Can someone please help me with this. I have provided some system information below:
>
>
>
> I am running bro version 2.6.3
>
>
>
> Here is the output of broctl status. The logger has crashed, but the manager, proxy and workers are still running.
>
> broctl status
>
> Name         Type    Host             Status    Pid    Started
> logger       logger  localhost        crashed
> manager      manager localhost        running   17356  09 Sep 15:42:24
> proxy-1      proxy   localhost        running   17401  09 Sep 15:42:25
> worker-1-1   worker  localhost        running   17573  09 Sep 15:42:27
> worker-1-2   worker  localhost        running   17569  09 Sep 15:42:27
> worker-1-3   worker  localhost        running   17572  09 Sep 15:42:27
> worker-1-4   worker  localhost        running   17587  09 Sep 15:42:27
> worker-1-5   worker  localhost        running   17619  09 Sep 15:42:27
> worker-1-6   worker  localhost        running   17614  09 Sep 15:42:27
> worker-1-7   worker  localhost        running   17625  09 Sep 15:42:27
> worker-1-8   worker  localhost        running   17646  09 Sep 15:42:27
> worker-1-9   worker  localhost        running   17671  09 Sep 15:42:27
> worker-1-10  worker  localhost        running   17663  09 Sep 15:42:27
> worker-1-11  worker  localhost        running   17679  09 Sep 15:42:27
> worker-1-12  worker  localhost        running   17685  09 Sep 15:42:27
> worker-1-13  worker  localhost        running   17698  09 Sep 15:42:27
> worker-1-14  worker  localhost        running   17703  09 Sep 15:42:27
> worker-1-15  worker  localhost        running   17710  09 Sep 15:42:27
> worker-1-16  worker  localhost        running   17717  09 Sep 15:42:27
> worker-1-17  worker  localhost        running   17720  09 Sep 15:42:27
> worker-1-18  worker  localhost        running   17727  09 Sep 15:42:27
> worker-1-19  worker  localhost        running   17728  09 Sep 15:42:27
> worker-1-20  worker  localhost        running   17731  09 Sep 15:42:27
>
>
>
> Here are my node.cfg settings:
>
> [logger]
> type=logger
> host=localhost
>
> [manager]
> type=manager
> host=localhost
>
> [proxy-1]
> type=proxy
> host=localhost
>
> [worker-1]
> type=worker
> host=localhost
> interface=af_packet::ens2f0
> lb_method=custom
> #lb_method=pf_ring
> lb_procs=20
> pin_cpus=6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25
> af_packet_fanout_id=25
> af_packet_fanout_mode=AF_Packet::FANOUT_HASH
>
>
>
> Here's more information on my CPU: 32 CPUs, AMD Opteron 6386 SE, CPU max MHz 2800.0000.
>
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                32
> On-line CPU(s) list:   0-31
> Thread(s) per core:    2
> Core(s) per socket:    8
> Socket(s):             2
> NUMA node(s):          4
> Vendor ID:             AuthenticAMD
> CPU family:            21
> Model:                 2
> Model name:            AMD Opteron(tm) Processor 6386 SE
> Stepping:              0
> CPU MHz:               1960.000
> CPU max MHz:           2800.0000
> CPU min MHz:           1400.0000
> BogoMIPS:              5585.93
> Virtualization:        AMD-V
> L1d cache:             16K
> L1i cache:             64K
> L2 cache:              2048K
> L3 cache:              6144K
> NUMA node0 CPU(s):     0,2,4,6,8,10,12,14
> NUMA node1 CPU(s):     16,18,20,22,24,26,28,30
> NUMA node2 CPU(s):     1,3,5,7,9,11,13,15
> NUMA node3 CPU(s):     17,19,21,23,25,27,29,31
>
>
>
> I would also like to know how I can reduce my packet loss. Below is the output of broctl netstats:
>
> broctl netstats
>
> worker-1-1: 1568908277.861813 recvd=12248845422 dropped=5171188999 link=17420313882
> worker-1-2: 1568908298.313954 recvd=8636707266 dropped=971489 link=8637678939
> worker-1-3: 1568908278.425888 recvd=11684808853 dropped=5617381647 link=17302473791
> worker-1-4: 1568908285.731130 recvd=12567242226 dropped=4339688288 link=16907212802
> worker-1-5: 1568908298.363911 recvd=8620499351 dropped=24595149 link=8645095758
> worker-1-6: 1568908298.372892 recvd=8710565757 dropped=1731022 link=8712297432
> worker-1-7: 1568908298.266010 recvd=9065207444 dropped=53523232 link=9118737229
> worker-1-8: 1568908286.935607 recvd=11377790124 dropped=3680887247 link=15058934491
> worker-1-9: 1568908298.419657 recvd=8931903322 dropped=39696184 link=8971604219
> worker-1-10: 1568908298.478576 recvd=8842874030 dropped=2501252 link=8845376352
> worker-1-11: 1568908298.506649 recvd=8692769329 dropped=2253413 link=8695025626
> worker-1-12: 1568908298.520830 recvd=8749977028 dropped=2314733 link=8752293714
> worker-1-13: 1568908298.544573 recvd=9101243757 dropped=1779460 link=9103025399
> worker-1-14: 1568908291.370011 recvd=10876925726 dropped=775722632 link=11652810353
> worker-1-15: 1568908298.579721 recvd=8503097394 dropped=1420699 link=8504520066
> worker-1-16: 1568908298.594942 recvd=8515164266 dropped=1840977 link=8517006779
> worker-1-17: 1568908298.646966 recvd=10666567717 dropped=466489754 link=11133059283
> worker-1-18: 1568908298.671246 recvd=9023603573 dropped=2037607 link=9025642263
> worker-1-19: 1568908298.704675 recvd=8907784186 dropped=1164594 link=8908950238
> worker-1-20: 1568908298.718084 recvd=9140525444 dropped=2028593 link=9142555259
>
>
>
> Thanks,
>
> _______________________________________________
> Zeek mailing list
> zeek at zeek.org
> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/zeek
>