[Zeek] Capture packet loss discrepancy

Michał Purzyński michalpurzynski1 at gmail.com
Tue Mar 10 16:14:05 PDT 2020


So it does not look like Zeek is dropping traffic here. Something else
must be going on.

What's your data rate? What's the MTU?
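
If you don't have those numbers handy, something like this gives a rough idea
straight from the sensor (plain sysfs/iproute2, nothing Zeek-specific; adjust
the interface name to whatever you capture on):

cat /sys/class/net/bond0/mtu
ip -s link show bond0
# rough receive rate over 10 seconds, in bits per second
b1=$(cat /sys/class/net/bond0/statistics/rx_bytes); sleep 10
b2=$(cat /sys/class/net/bond0/statistics/rx_bytes); echo $(( (b2 - b1) * 8 / 10 ))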

Something that worries me -

     rx_long_byte_count: 51106679706383

Not sure what that counter means; figuring it out pretty much always requires
reading the network card's driver source (these fields are by no means
standardized, and even if they happen to have the same name across vendors they
can mean something different).
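
One thing that sometimes helps without digging into the driver is to sample the
counters twice and diff them, so you at least see which ones move while traffic
is flowing (plain ethtool):

ethtool -S eno2 > /tmp/ethtool_1; sleep 60; ethtool -S eno2 > /tmp/ethtool_2
diff /tmp/ethtool_1 /tmp/ethtool_2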

Can you share statistics from the switch? Something like show int <span
port here> stats or similar.
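
The exact syntax depends on the vendor, but on a Cisco-style CLI something along
these lines should show whether the SPAN destination port itself is dropping on
egress (the port name below is just a placeholder):

show interface GigabitEthernet1/0/48

Look for the "total output drops" line in that output.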

On Fri, Mar 6, 2020 at 2:53 AM Federico Foschini <undicizeri at gmail.com>
wrote:

> I’m also adding my /etc/network/interfaces:
>
> auto eno2
> iface eno2 inet manual
>   up ip link set $IFACE promisc on arp off up
>   down ip link set $IFACE promisc off down
>   post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
>   post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
>   bond-master bond0
>
> auto bond0
> iface bond0 inet manual
>   bond-mode balance-rr
>   bond-slaves eno2
>   mtu 1500
>   up ip link set bond0 promisc on arp off up
>   down ip link set bond0 promisc off down
>   post-up ethtool -G bond0 rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K bond0 $i off; done
>   post-up echo 1 > /proc/sys/net/ipv6/conf/bond0/disable_ipv6
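>
> To double-check after boot that the offloads really ended up disabled on both
> eno2 and bond0, I believe the current state can be read back with something
> like:
>
>   ethtool -k eno2 | grep -E 'offload|scatter-gather|segmentation'
>   ethtool -k bond0 | grep -E 'offload|scatter-gather|segmentation'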
>
> I’ve configured a bond interface because I’m planning to add another span
> port. I’ve copied this configuration from SecurityOnion github:
> https://github.com/Security-Onion-Solutions/securityonion-saltstack/blob/d3826bc605a22e090bc7cf07309bce76a2f12623/setup/so-functions#L298
>
> On Fri, Mar 6, 2020 at 11:47 AM Federico Foschini <undicizeri at gmail.com>
> wrote:
>
>> Hello,
>> thank you for your help.
>> I've configured a span port on the switch.
>>
>> In Zeek I’ve configured the AF_Packet plugin.
>>
>> This is my node.cfg:
>>
>> [manager]
>> type=manager
>> host=localhost
>>
>> [proxy-1]
>> type=proxy
>> host=localhost
>>
>> [worker-1]
>> type=worker
>> host=localhost
>> interface=bond0
>> lb_method=custom
>> lb_procs=1
>> af_packet_fanout_id=21
>> af_packet_fanout_mode=AF_Packet::FANOUT_HASH
>> af_packet_buffer_size=128*1024*1024
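>>
>> For reference, if I later grow this to more than one worker on the same
>> interface, my understanding is that the worker section would look roughly
>> like this (all lb_procs share the same fanout id; untested on my side):
>>
>> [worker-1]
>> type=worker
>> host=localhost
>> interface=bond0
>> lb_method=af_packet
>> lb_procs=4
>> af_packet_fanout_id=21
>> af_packet_fanout_mode=AF_Packet::FANOUT_HASH
>> af_packet_buffer_size=128*1024*1024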
>>
>> This is the ethtool output:
>>
>> ethtool -c eno2
>> Coalesce parameters for eno2:
>> Adaptive RX: off  TX: off
>> stats-block-usecs: 0
>> sample-interval: 0
>> pkt-rate-low: 0
>> pkt-rate-high: 0
>>
>> rx-usecs: 3
>> rx-frames: 0
>> rx-usecs-irq: 0
>> rx-frames-irq: 0
>>
>> tx-usecs: 0
>> tx-frames: 0
>> tx-usecs-irq: 0
>> tx-frames-irq: 0
>>
>> rx-usecs-low: 0
>> rx-frame-low: 0
>> tx-usecs-low: 0
>> tx-frame-low: 0
>>
>> rx-usecs-high: 0
>> rx-frame-high: 0
>> tx-usecs-high: 0
>> tx-frame-high: 0
>>
>> ethtool -l eno2
>> Channel parameters for eno2:
>> Pre-set maximums:
>> RX:             0
>> TX:             0
>> Other:          1
>> Combined:       8
>> Current hardware settings:
>> RX:             0
>> TX:             0
>> Other:          1
>> Combined:       8
>>
>> ethtool -x eno2
>> RX flow hash indirection table for eno2 with 8 RX ring(s):
>>     0:      0     0     0     0     0     0     0     0
>>     8:      0     0     0     0     0     0     0     0
>>    16:      1     1     1     1     1     1     1     1
>>    24:      1     1     1     1     1     1     1     1
>>    32:      2     2     2     2     2     2     2     2
>>    40:      2     2     2     2     2     2     2     2
>>    48:      3     3     3     3     3     3     3     3
>>    56:      3     3     3     3     3     3     3     3
>>    64:      4     4     4     4     4     4     4     4
>>    72:      4     4     4     4     4     4     4     4
>>    80:      5     5     5     5     5     5     5     5
>>    88:      5     5     5     5     5     5     5     5
>>    96:      6     6     6     6     6     6     6     6
>>   104:      6     6     6     6     6     6     6     6
>>   112:      7     7     7     7     7     7     7     7
>>   120:      7     7     7     7     7     7     7     7
>> RSS hash key:
>> Operation not supported
>>
>> ethtool -a eno2
>> Pause parameters for eno2:
>> Autonegotiate:  on
>> RX:             on
>> TX:             off
>>
>> ethtool -g eno2
>> Ring parameters for eno2:
>> Pre-set maximums:
>> RX:             4096
>> RX Mini:        0
>> RX Jumbo:       0
>> TX:             4096
>> Current hardware settings:
>> RX:             4096
>> RX Mini:        0
>> RX Jumbo:       0
>> TX:             256
>>
>>
>> On Fri, Mar 6, 2020 at 11:39 AM Michał Purzyński <michalpurzynski1 at gmail.com>
>> wrote:
>>
>>> You are either dropping data before it reaches the Zeek sensor, or you have
>>> an offloading misconfiguration on the card, or you have a load-balancing
>>> problem. That might actually be good news.
>>>
>>> The capture_loss.log file is based on a heuristic ("I have seen some ACKs
>>> without matching data packets"), but it won't tell you where the problem is.
>>> Justin calls it a "check engine light".
>>>
>>> You don't seem to have drops in stats.log or in the ethtool stats. That's
>>> great. The meaning of stats.log differs for each capture method, and it's
>>> never enough on its own without some ethtool stats.
>>>
>>> Can you tell us more about your setup?
>>> - how are you feeding your sensor (taps, span ports, etc)
>>> - what's your capture mechanism?
>>> - can you share all the commands used to configure the interface?
>>> - can you share the node.cfg?
>>> - the output of:
>>>
>>> ethtool -c <interface>
>>> ethtool -l <interface>
>>> ethtool -x <interface>
>>> ethtool -a <interface>
>>> ethtool -g <interface>
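>>>
>>> And for a quick first pass over the NIC counters, something like this
>>> usually surfaces the interesting ones (exact counter names vary per driver):
>>>
>>> ethtool -S <interface> | grep -iE 'drop|miss|err|fifo|no_buffer'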
>>>
>>> On Fri, Mar 6, 2020 at 2:19 AM Federico Foschini <undicizeri at gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>> I'm using Zeek 3.0.1 and I'm seeing very high Zeek capture loss even though
>>>> the system load is very low (I'm analyzing 50-100 Mbps of traffic with an
>>>> 8C/16T Xeon and 32 GiB of RAM; the system load barely reaches 1).
>>>>
>>>> This is what
>>>> https://docs.zeek.org/en/stable/scripts/policy/misc/capture-loss.zeek.html
>>>> is reporting:
>>>>
>>>> {"_path":"capture_loss","_write_ts":"2020-03-06T10:00:28.080623Z","ts":"2020-03-06T10:00:28.080623Z","ts_delta":900.0000138282776,"peer":"worker-1-1","gaps":980763,"acks":3451717,"percent_lost":28.4137720444
>>>> 6367}
>>>> {"_path":"capture_loss","_write_ts":"2020-03-06T09:45:28.080609Z","ts":"2020-03-06T09:45:28.080609Z","ts_delta":900.0000011920929,"peer":"worker-1-1","gaps":775832,"acks":3944662,"percent_lost":19.6678955002
>>>> 98886}
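>>>>
>>>> If I read capture-loss.zeek correctly, percent_lost is simply gaps/acks, so
>>>> for the first record above: 980763 / 3451717 * 100 ≈ 28.41%, which matches
>>>> what the log reports.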
>>>>
>>>> But in Zeek stats logs I cannot see any drops:
>>>>
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:25.184307Z","ts":"2020-03-06T10:00:25.184307Z","peer":"manager","mem":99,"pkts_proc":0,"bytes_recv":0,"events_proc":2171,"events_queued":2175,"active_tcp_conns"
>>>> :0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":1097,"active_timers":77,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size
>>>> ":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:26.676726Z","ts":"2020-03-06T10:00:26.676726Z","peer":"proxy-1","mem":95,"pkts_proc":0,"bytes_recv":0,"events_proc":1579,"events_queued":1579,"active_tcp_conns"
>>>> :0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":939,"active_timers":39,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size"
>>>> :0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:28.087807Z","ts":"2020-03-06T10:00:28.087807Z","peer":"worker-1-1","mem":494,"pkts_proc":4342283,"bytes_recv":3395272841,"pkts_dropped":0,"pkts_link":4347989,"p
>>>> kt_lag":0.010818004608154297,"events_proc":894955,"events_queued":894955,"active_tcp_conns":2944,"active_udp_conns":473,"active_icmp_conns":70,"tcp_conns":9036,"udp_conns":7376,"icmp_conns":383,"timers":2168
>>>> 06,"active_timers":11468,"files":10693,"active_files":10,"dns_requests":2,"active_dns_requests":0,"reassem_tcp_size":500680,"reassem_file_size":451064,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:25.185129Z","ts":"2020-03-06T10:05:25.185129Z","peer":"manager","mem":99,"pkts_proc":0,"bytes_recv":0,"events_proc":3573,"events_queued":3569,"active_tcp_conns"
>>>> :0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":1063,"active_timers":77,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size
>>>> ":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:26.677296Z","ts":"2020-03-06T10:05:26.677296Z","peer":"proxy-1","mem":95,"pkts_proc":0,"bytes_recv":0,"events_proc":2101,"events_queued":2101,"active_tcp_conns"
>>>> :0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":933,"active_timers":39,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size"
>>>> :0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:28.087831Z","ts":"2020-03-06T10:05:28.087831Z","peer":"worker-1-1","mem":494,"pkts_proc":4024429,"bytes_recv":3075761558,"pkts_dropped":0,"pkts_link":4029762,"p
>>>> kt_lag":0.012360095977783203,"events_proc":860917,"events_queued":860919,"active_tcp_conns":3043,"active_udp_conns":505,"active_icmp_conns":149,"tcp_conns":10006,"udp_conns":7298,"icmp_conns":483,"timers":23
>>>> 2394,"active_timers":12492,"files":11763,"active_files":13,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":60920,"reassem_file_size":955592,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>>
>>>> My network interface reports 0 drops:
>>>>
>>>> NIC statistics:
>>>>      rx_packets: 57185146109
>>>>      tx_packets: 118236
>>>>      rx_bytes: 51106679706383
>>>>      tx_bytes: 12060072
>>>>      rx_broadcast: 28116614
>>>>      tx_broadcast: 0
>>>>      rx_multicast: 430062675
>>>>      tx_multicast: 0
>>>>      multicast: 430062675
>>>>      collisions: 0
>>>>      rx_crc_errors: 0
>>>>      rx_no_buffer_count: 0
>>>>      rx_missed_errors: 0
>>>>      tx_aborted_errors: 0
>>>>      tx_carrier_errors: 0
>>>>      tx_window_errors: 0
>>>>      tx_abort_late_coll: 0
>>>>      tx_deferred_ok: 0
>>>>      tx_single_coll_ok: 0
>>>>      tx_multi_coll_ok: 0
>>>>      tx_timeout_count: 0
>>>>      rx_long_length_errors: 0
>>>>      rx_short_length_errors: 0
>>>>      rx_align_errors: 0
>>>>      tx_tcp_seg_good: 0
>>>>      tx_tcp_seg_failed: 0
>>>>      rx_flow_control_xon: 0
>>>>      rx_flow_control_xoff: 0
>>>>      tx_flow_control_xon: 0
>>>>      tx_flow_control_xoff: 0
>>>>      rx_long_byte_count: 51106679706383
>>>>      tx_dma_out_of_sync: 0
>>>>      tx_smbus: 0
>>>>      rx_smbus: 0
>>>>      dropped_smbus: 0
>>>>      os2bmc_rx_by_bmc: 0
>>>>      os2bmc_tx_by_bmc: 0
>>>>      os2bmc_tx_by_host: 0
>>>>      os2bmc_rx_by_host: 0
>>>>      tx_hwtstamp_timeouts: 0
>>>>      rx_hwtstamp_cleared: 0
>>>>      rx_errors: 0
>>>>      tx_errors: 0
>>>>      tx_dropped: 0
>>>>      rx_length_errors: 0
>>>>      rx_over_errors: 0
>>>>      rx_frame_errors: 0
>>>>      rx_fifo_errors: 0
>>>>      tx_fifo_errors: 0
>>>>      tx_heartbeat_errors: 0
>>>>      tx_queue_0_packets: 0
>>>>      tx_queue_0_restart: 0
>>>>      tx_queue_1_packets: 118236
>>>>      tx_queue_1_bytes: 11587128
>>>>      tx_queue_1_restart: 0
>>>>      tx_queue_2_packets: 0
>>>>      tx_queue_2_bytes: 0
>>>>      tx_queue_2_restart: 0
>>>>      tx_queue_3_packets: 0
>>>>      tx_queue_3_bytes: 0
>>>>      tx_queue_3_restart: 0
>>>>      tx_queue_4_packets: 0
>>>>      tx_queue_4_bytes: 0
>>>>      tx_queue_4_restart: 0
>>>>      tx_queue_5_packets: 0
>>>>      tx_queue_5_bytes: 0
>>>>      tx_queue_5_restart: 0
>>>>      tx_queue_6_packets: 0
>>>>      tx_queue_6_bytes: 0
>>>>      tx_queue_6_restart: 0
>>>>      tx_queue_7_packets: 0
>>>>      tx_queue_7_bytes: 0
>>>>      tx_queue_7_restart: 0
>>>>      rx_queue_0_packets: 7309311690
>>>>      rx_queue_0_bytes: 6672057827542
>>>>      rx_queue_0_drops: 0
>>>>      rx_queue_0_csum_err: 0
>>>>      rx_queue_0_alloc_failed: 0
>>>>      rx_queue_1_packets: 7067404359
>>>>      rx_queue_1_bytes: 5978548722708
>>>>      rx_queue_1_drops: 0
>>>>      rx_queue_1_csum_err: 0
>>>>      rx_queue_1_alloc_failed: 0
>>>>      rx_queue_2_packets: 6936589456
>>>>      rx_queue_2_bytes: 5816850955623
>>>>      rx_queue_2_drops: 0
>>>>      rx_queue_2_csum_err: 0
>>>>      rx_queue_2_alloc_failed: 0
>>>>      rx_queue_3_packets: 7560820836
>>>>      rx_queue_3_bytes: 7177372551363
>>>>      rx_queue_3_drops: 0
>>>>      rx_queue_3_csum_err: 0
>>>>      rx_queue_3_alloc_failed: 0
>>>>      rx_queue_4_packets: 6665690657
>>>>      rx_queue_4_bytes: 5815406197188
>>>>      rx_queue_4_drops: 0
>>>>      rx_queue_4_csum_err: 0
>>>>      rx_queue_4_alloc_failed: 0
>>>>      rx_queue_5_packets: 7245905157
>>>>      rx_queue_5_bytes: 6640952714842
>>>>      rx_queue_5_drops: 0
>>>>      rx_queue_5_csum_err: 0
>>>>      rx_queue_5_alloc_failed: 0
>>>>      rx_queue_6_packets: 7693400503
>>>>      rx_queue_6_bytes: 6803443308105
>>>>      rx_queue_6_drops: 0
>>>>      rx_queue_6_csum_err: 0
>>>>      rx_queue_6_alloc_failed: 0
>>>>      rx_queue_7_packets: 6706023451
>>>>      rx_queue_7_bytes: 5768995611822
>>>>      rx_queue_7_drops: 0
>>>>      rx_queue_7_csum_err: 0
>>>>      rx_queue_7_alloc_failed: 0
>>>>
>>>> Is there something I am missing? Is there a way to analyze the
>>>> problem further? Looking at the Zeek logs, everything seems fine.
>>>> --
>>>> Federico Foschini.
>>>> _______________________________________________
>>>> Zeek mailing list
>>>> zeek at zeek.org
>>>> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/zeek
>>>
>>>
>>
>> --
>> Federico Foschini.
>>
>
>
> --
> Federico Foschini.
>