[Zeek] capture_loss vs. pkts_dropped vs. missed_bytes

Michał Purzyński michalpurzynski1 at gmail.com
Thu May 2 12:49:33 PDT 2019


Hey Mark,

First of all, I really like your setup and I don't see any obvious errors
there. Cool.

Jan (also on this list) might know more about the way drops are calculated
in stats.log. It looks like they are just the af_packet statistics.
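
A raw pkts_dropped count is hard to judge without a denominator, so here is a
rough Python sketch that turns the JSON stats.log into a per-worker drop
ratio. It assumes each record carries "peer", "pkts_dropped" and "pkts_link"
fields, and that pkts_link is the right denominator; whether pkts_link already
includes the dropped packets under af_packet is exactly the part I am not sure
about, so treat the result as an estimate. The function name is made up for
illustration.

```python
import json
from collections import defaultdict


def drop_ratio_by_peer(lines):
    """Sum pkts_dropped and pkts_link per peer; return dropped/link ratios.

    Assumes JSON-lines stats.log records with "peer", "pkts_dropped" and
    "pkts_link" fields (field names are an assumption, check your own log).
    """
    dropped = defaultdict(int)
    link = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Nodes that do not sniff traffic (manager, logger, proxy) carry no
        # packet counters, so skip records where either field is missing.
        if rec.get("pkts_dropped") is None or rec.get("pkts_link") is None:
            continue
        peer = rec.get("peer", "unknown")
        dropped[peer] += rec["pkts_dropped"]
        link[peer] += rec["pkts_link"]
    # Avoid dividing by zero for peers that saw no packets at all.
    return {p: dropped[p] / link[p] for p in link if link[p] > 0}
```

Feed it the decompressed log, for example by calling
drop_ratio_by_peer(sys.stdin) in a small script behind zcat. A ratio near zero
on every worker would line up with the capture_loss numbers; a large ratio on
a few workers would point at those specific queues.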

Can you run Justin's troubleshooting tool and send us the results?
https://github.com/ncsa/bro-doctor


BTW, while monitoring for drops, take a look here, where we describe
several other places drops might happen (and all of them should be
monitored).

https://github.com/pevma/SEPTun/blob/master/SEPTun.rst#packet-drops
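
As one example of a drop counter that lives outside Zeek, the kernel's
per-interface receive counters in /proc/net/dev can be polled. The sketch
below is only an illustration of that one place, not a summary of the linked
document; the function name is made up, and driver-level counters (for
example from ethtool -S) are driver-specific and not covered here.

```python
def rx_drops(proc_net_dev_text):
    """Return {interface: (rx_packets, rx_drop)} from /proc/net/dev text."""
    result = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in proc_net_dev_text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, counters = line.split(":", 1)
        fields = counters.split()
        # Receive columns, in order:
        # bytes packets errs drop fifo frame compressed multicast
        result[name.strip()] = (int(fields[1]), int(fields[3]))
    return result
```

On a sensor you would call it as rx_drops(open("/proc/net/dev").read()) at
regular intervals and alert when the drop counter of the capture interface
grows between two samples.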


On Thu, May 2, 2019 at 10:14 AM Mark Gardner <mkg at vt.edu> wrote:

> I am still tuning our new Zeek cluster: an Arista switch for load
> balancing with 4x10 Gbps links from a Gigamon and 10 Gbps links to the
> sensors, five sensors (16 physical cores with 128 GB RAM each) using
> af_packet, 15 workers per sensor, and a separate management node running
> the manager, logger, proxy, and storage (XFS on RAID-0 with 8 7200 RPM
> spindles, 256 GB RAM). Output is JSON (for feeding into an ElasticStack
> later).
>
> The average capture loss was <1% early on with spikes to 50-70%. We
> increased the af_packet_buffer_size from the default (128MB) to 2GB and
> capture_loss is gone.
> $ zcat capture_loss.10\:00\:00-11\:00\:00.log.gz | jq .percent_lost | statgen
>  Count         Min         Max         Avg      StdDev
>    300      0.0000      0.0000      0.0000      0.0000
>
> Next, I looked at missed_bytes in conn.log, which doesn't look too
> bad:
> $ zcat conn.10\:00\:00-11\:00\:00.log.gz | jq .missed_bytes | statgen
>  Count         Min         Max         Avg      StdDev
>   5488      0.0000   5802.0000      1.7332     92.9547
> Out of the 5488 records, only two were non-zero (5802 and 3710), and for
> both of those the missed_bytes == resp_bytes (service: ssl).
>
> But even with the above, the pkts_dropped in stats.log is extremely high:
> $ zcat stats.10\:00\:00-11\:00\:00.log.gz | jq .pkts_dropped | grep -v null | statgen
>  Count         Min         Max         Avg      StdDev
>    900     3564854    18216752  5762446.99  1591145.34
>
> So even though there was no capture_loss and almost no missed_bytes, the
> pkts_dropped is huge. Is this something to be concerned about? If so, I am
> not sure how to go about figuring out the problem. What should I do next?