[Zeek] Capture packet loss discrepancy

Federico Foschini undicizeri at gmail.com
Fri Mar 6 02:53:42 PST 2020


I’m also adding my /etc/network/interfaces:

auto eno2
iface eno2 inet manual
  up ip link set $IFACE promisc on arp off up
  down ip link set $IFACE promisc off down
  post-up ethtool -G $IFACE rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K $IFACE $i off; done
  post-up echo 1 > /proc/sys/net/ipv6/conf/$IFACE/disable_ipv6
  bond-master bond0

auto bond0
iface bond0 inet manual
  bond-mode balance-rr
  bond-slaves eno2
  mtu 1500
  up ip link set bond0 promisc on arp off up
  down ip link set bond0 promisc off down
  post-up ethtool -G bond0 rx 4096; for i in rx tx sg tso ufo gso gro lro; do ethtool -K bond0 $i off; done
  post-up echo 1 > /proc/sys/net/ipv6/conf/bond0/disable_ipv6

I’ve configured a bond interface because I’m planning to add another span
port. I’ve copied this configuration from the Security Onion GitHub repository:
https://github.com/Security-Onion-Solutions/securityonion-saltstack/blob/d3826bc605a22e090bc7cf07309bce76a2f12623/setup/so-functions#L298
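To double-check that the `ethtool -K` loop above actually took effect, here is a quick sketch. The feature names and values in `features.txt` are a hypothetical sample; in practice you would capture the real output first with `ethtool -k bond0 > features.txt`:

```shell
# Sketch: scan saved `ethtool -k` output for any offload still enabled.
# The heredoc below is hypothetical sample output; replace features.txt
# with your own capture: ethtool -k bond0 > features.txt
cat > features.txt <<'EOF'
rx-checksumming: off
tx-checksumming: off
tcp-segmentation-offload: off
generic-segmentation-offload: off
generic-receive-offload: on
large-receive-offload: off [fixed]
EOF
grep ': on' features.txt && echo "some offloads still enabled" || echo "all offloads disabled"
```

With the sample above this flags `generic-receive-offload: on`; GRO left enabled is a common cause of reassembly confusion on capture boxes.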

On Fri, Mar 6, 2020 at 11:47 AM Federico Foschini <
undicizeri at gmail.com> wrote:

> Hello,
> thank you for your help.
> I’ve configured a span port on the switch.
>
> In Zeek I’ve configured the AF_Packet plugin.
>
> This is my node.cfg:
>
> [manager]
> type=manager
> host=localhost
>
> [proxy-1]
> type=proxy
> host=localhost
>
> [worker-1]
> type=worker
> host=localhost
> interface=bond0
> lb_method=custom
> lb_procs=1
> af_packet_fanout_id=21
> af_packet_fanout_mode=AF_Packet::FANOUT_HASH
> af_packet_buffer_size=128*1024*1024
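As a side note, the af_packet_buffer_size value above is an arithmetic expression; evaluated, it requests 128 MiB of AF_PACKET ring buffer per worker process:

```shell
# af_packet_buffer_size=128*1024*1024 from the node.cfg above, evaluated:
# 128 MiB of AF_PACKET ring buffer per lb_proc.
echo $((128 * 1024 * 1024))
```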
>
> This is the ethtool output:
>
> ethtool -c eno2
> Coalesce parameters for eno2:
> Adaptive RX: off  TX: off
> stats-block-usecs: 0
> sample-interval: 0
> pkt-rate-low: 0
> pkt-rate-high: 0
>
> rx-usecs: 3
> rx-frames: 0
> rx-usecs-irq: 0
> rx-frames-irq: 0
>
> tx-usecs: 0
> tx-frames: 0
> tx-usecs-irq: 0
> tx-frames-irq: 0
>
> rx-usecs-low: 0
> rx-frame-low: 0
> tx-usecs-low: 0
> tx-frame-low: 0
>
> rx-usecs-high: 0
> rx-frame-high: 0
> tx-usecs-high: 0
> tx-frame-high: 0
>
> ethtool -l eno2
> Channel parameters for eno2:
> Pre-set maximums:
> RX:             0
> TX:             0
> Other:          1
> Combined:       8
> Current hardware settings:
> RX:             0
> TX:             0
> Other:          1
> Combined:       8
>
> ethtool -x eno2
> RX flow hash indirection table for eno2 with 8 RX ring(s):
>     0:      0     0     0     0     0     0     0     0
>     8:      0     0     0     0     0     0     0     0
>    16:      1     1     1     1     1     1     1     1
>    24:      1     1     1     1     1     1     1     1
>    32:      2     2     2     2     2     2     2     2
>    40:      2     2     2     2     2     2     2     2
>    48:      3     3     3     3     3     3     3     3
>    56:      3     3     3     3     3     3     3     3
>    64:      4     4     4     4     4     4     4     4
>    72:      4     4     4     4     4     4     4     4
>    80:      5     5     5     5     5     5     5     5
>    88:      5     5     5     5     5     5     5     5
>    96:      6     6     6     6     6     6     6     6
>   104:      6     6     6     6     6     6     6     6
>   112:      7     7     7     7     7     7     7     7
>   120:      7     7     7     7     7     7     7     7
> RSS hash key:
> Operation not supported
>
> ethtool -a eno2
> Pause parameters for eno2:
> Autonegotiate:  on
> RX:             on
> TX:             off
>
> ethtool -g eno2
> Ring parameters for eno2:
> Pre-set maximums:
> RX:             4096
> RX Mini:        0
> RX Jumbo:       0
> TX:             4096
> Current hardware settings:
> RX:             4096
> RX Mini:        0
> RX Jumbo:       0
> TX:             256
>
>
> On Fri, Mar 6, 2020 at 11:39 AM Michał Purzyński <
> michalpurzynski1 at gmail.com> wrote:
>
>> Either you drop data before it reaches the Zeek sensors, you have an
>> offloading misconfiguration on the card, or you have a load-balancing
>> problem. That might be good news.
>>
>> The capture_loss.log file is based on a heuristic - "I have seen some
>> ACKs without matching data packets" - but it won't tell you where the
>> problem is. Justin calls it a "check engine light".
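For reference, the percent_lost figure in capture_loss.log is just gaps divided by acks over the ts_delta interval. A sketch using the numbers from the first capture_loss record quoted in this thread:

```shell
# capture_loss percent_lost is gaps / acks * 100, measured over ts_delta.
# Values are from the capture_loss record quoted in this thread.
gaps=980763
acks=3451717
awk -v g="$gaps" -v a="$acks" 'BEGIN { printf "percent_lost = %.4f\n", g / a * 100 }'
```

This reproduces the 28.41% figure from the log, which is why capture_loss can report heavy loss while stats.log shows pkts_dropped=0: the gaps are in the ACK stream itself, upstream of the capture socket.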
>>
>> You don't seem to have drops in stats.log or in the ethtool stats. That's
>> great. The meaning of stats.log differs with each capture method, and it's
>> never enough on its own without some ethtool stats.
>>
>> Can you tell us more about your setup?
>> - how are you feeding your sensor (taps, span ports, etc)
>> - what's your capture mechanism?
>> - can you share all the commands used for the interface configuration?
>> - can you share the node.cfg?
>> - output of
>>
>> ethtool -c <interface>
>> ethtool -l <interface>
>> ethtool -x <interface>
>> ethtool -a <interface>
>> ethtool -g <interface>
>>
>> On Fri, Mar 6, 2020 at 2:19 AM Federico Foschini <undicizeri at gmail.com>
>> wrote:
>>
>>> Hello,
>>> I’m using Zeek 3.0.1 and I’m seeing very high Zeek capture loss even
>>> though the system load is very low (I’m analyzing 50-100 Mbps of traffic
>>> with a Xeon 8C/16T and 32 GiB of RAM; the system load barely reaches 1).
>>>
>>> This is what
>>> https://docs.zeek.org/en/stable/scripts/policy/misc/capture-loss.zeek.html
>>> is reporting:
>>>
>>> {"_path":"capture_loss","_write_ts":"2020-03-06T10:00:28.080623Z","ts":"2020-03-06T10:00:28.080623Z","ts_delta":900.0000138282776,"peer":"worker-1-1","gaps":980763,"acks":3451717,"percent_lost":28.41377204446367}
>>> {"_path":"capture_loss","_write_ts":"2020-03-06T09:45:28.080609Z","ts":"2020-03-06T09:45:28.080609Z","ts_delta":900.0000011920929,"peer":"worker-1-1","gaps":775832,"acks":3944662,"percent_lost":19.667895500298886}
>>>
>>> But in Zeek's stats.log I cannot see any drops:
>>>
>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:25.184307Z","ts":"2020-03-06T10:00:25.184307Z","peer":"manager","mem":99,"pkts_proc":0,"bytes_recv":0,"events_proc":2171,"events_queued":2175,"active_tcp_conns":0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":1097,"active_timers":77,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:26.676726Z","ts":"2020-03-06T10:00:26.676726Z","peer":"proxy-1","mem":95,"pkts_proc":0,"bytes_recv":0,"events_proc":1579,"events_queued":1579,"active_tcp_conns":0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":939,"active_timers":39,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>> {"_path":"stats","_write_ts":"2020-03-06T10:00:28.087807Z","ts":"2020-03-06T10:00:28.087807Z","peer":"worker-1-1","mem":494,"pkts_proc":4342283,"bytes_recv":3395272841,"pkts_dropped":0,"pkts_link":4347989,"pkt_lag":0.010818004608154297,"events_proc":894955,"events_queued":894955,"active_tcp_conns":2944,"active_udp_conns":473,"active_icmp_conns":70,"tcp_conns":9036,"udp_conns":7376,"icmp_conns":383,"timers":216806,"active_timers":11468,"files":10693,"active_files":10,"dns_requests":2,"active_dns_requests":0,"reassem_tcp_size":500680,"reassem_file_size":451064,"reassem_frag_size":0,"reassem_unknown_size":0}
>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:25.185129Z","ts":"2020-03-06T10:05:25.185129Z","peer":"manager","mem":99,"pkts_proc":0,"bytes_recv":0,"events_proc":3573,"events_queued":3569,"active_tcp_conns":0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":1063,"active_timers":77,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:26.677296Z","ts":"2020-03-06T10:05:26.677296Z","peer":"proxy-1","mem":95,"pkts_proc":0,"bytes_recv":0,"events_proc":2101,"events_queued":2101,"active_tcp_conns":0,"active_udp_conns":0,"active_icmp_conns":0,"tcp_conns":0,"udp_conns":0,"icmp_conns":0,"timers":933,"active_timers":39,"files":0,"active_files":0,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":0,"reassem_file_size":0,"reassem_frag_size":0,"reassem_unknown_size":0}
>>> {"_path":"stats","_write_ts":"2020-03-06T10:05:28.087831Z","ts":"2020-03-06T10:05:28.087831Z","peer":"worker-1-1","mem":494,"pkts_proc":4024429,"bytes_recv":3075761558,"pkts_dropped":0,"pkts_link":4029762,"pkt_lag":0.012360095977783203,"events_proc":860917,"events_queued":860919,"active_tcp_conns":3043,"active_udp_conns":505,"active_icmp_conns":149,"tcp_conns":10006,"udp_conns":7298,"icmp_conns":483,"timers":232394,"active_timers":12492,"files":11763,"active_files":13,"dns_requests":0,"active_dns_requests":0,"reassem_tcp_size":60920,"reassem_file_size":955592,"reassem_frag_size":0,"reassem_unknown_size":0}
>>>
>>> My network interface reports 0 drops:
>>>
>>> NIC statistics:
>>>      rx_packets: 57185146109
>>>      tx_packets: 118236
>>>      rx_bytes: 51106679706383
>>>      tx_bytes: 12060072
>>>      rx_broadcast: 28116614
>>>      tx_broadcast: 0
>>>      rx_multicast: 430062675
>>>      tx_multicast: 0
>>>      multicast: 430062675
>>>      collisions: 0
>>>      rx_crc_errors: 0
>>>      rx_no_buffer_count: 0
>>>      rx_missed_errors: 0
>>>      tx_aborted_errors: 0
>>>      tx_carrier_errors: 0
>>>      tx_window_errors: 0
>>>      tx_abort_late_coll: 0
>>>      tx_deferred_ok: 0
>>>      tx_single_coll_ok: 0
>>>      tx_multi_coll_ok: 0
>>>      tx_timeout_count: 0
>>>      rx_long_length_errors: 0
>>>      rx_short_length_errors: 0
>>>      rx_align_errors: 0
>>>      tx_tcp_seg_good: 0
>>>      tx_tcp_seg_failed: 0
>>>      rx_flow_control_xon: 0
>>>      rx_flow_control_xoff: 0
>>>      tx_flow_control_xon: 0
>>>      tx_flow_control_xoff: 0
>>>      rx_long_byte_count: 51106679706383
>>>      tx_dma_out_of_sync: 0
>>>      tx_smbus: 0
>>>      rx_smbus: 0
>>>      dropped_smbus: 0
>>>      os2bmc_rx_by_bmc: 0
>>>      os2bmc_tx_by_bmc: 0
>>>      os2bmc_tx_by_host: 0
>>>      os2bmc_rx_by_host: 0
>>>      tx_hwtstamp_timeouts: 0
>>>      rx_hwtstamp_cleared: 0
>>>      rx_errors: 0
>>>      tx_errors: 0
>>>      tx_dropped: 0
>>>      rx_length_errors: 0
>>>      rx_over_errors: 0
>>>      rx_frame_errors: 0
>>>      rx_fifo_errors: 0
>>>      tx_fifo_errors: 0
>>>      tx_heartbeat_errors: 0
>>>      tx_queue_0_packets: 0
>>>      tx_queue_0_restart: 0
>>>      tx_queue_1_packets: 118236
>>>      tx_queue_1_bytes: 11587128
>>>      tx_queue_1_restart: 0
>>>      tx_queue_2_packets: 0
>>>      tx_queue_2_bytes: 0
>>>      tx_queue_2_restart: 0
>>>      tx_queue_3_packets: 0
>>>      tx_queue_3_bytes: 0
>>>      tx_queue_3_restart: 0
>>>      tx_queue_4_packets: 0
>>>      tx_queue_4_bytes: 0
>>>      tx_queue_4_restart: 0
>>>      tx_queue_5_packets: 0
>>>      tx_queue_5_bytes: 0
>>>      tx_queue_5_restart: 0
>>>      tx_queue_6_packets: 0
>>>      tx_queue_6_bytes: 0
>>>      tx_queue_6_restart: 0
>>>      tx_queue_7_packets: 0
>>>      tx_queue_7_bytes: 0
>>>      tx_queue_7_restart: 0
>>>      rx_queue_0_packets: 7309311690
>>>      rx_queue_0_bytes: 6672057827542
>>>      rx_queue_0_drops: 0
>>>      rx_queue_0_csum_err: 0
>>>      rx_queue_0_alloc_failed: 0
>>>      rx_queue_1_packets: 7067404359
>>>      rx_queue_1_bytes: 5978548722708
>>>      rx_queue_1_drops: 0
>>>      rx_queue_1_csum_err: 0
>>>      rx_queue_1_alloc_failed: 0
>>>      rx_queue_2_packets: 6936589456
>>>      rx_queue_2_bytes: 5816850955623
>>>      rx_queue_2_drops: 0
>>>      rx_queue_2_csum_err: 0
>>>      rx_queue_2_alloc_failed: 0
>>>      rx_queue_3_packets: 7560820836
>>>      rx_queue_3_bytes: 7177372551363
>>>      rx_queue_3_drops: 0
>>>      rx_queue_3_csum_err: 0
>>>      rx_queue_3_alloc_failed: 0
>>>      rx_queue_4_packets: 6665690657
>>>      rx_queue_4_bytes: 5815406197188
>>>      rx_queue_4_drops: 0
>>>      rx_queue_4_csum_err: 0
>>>      rx_queue_4_alloc_failed: 0
>>>      rx_queue_5_packets: 7245905157
>>>      rx_queue_5_bytes: 6640952714842
>>>      rx_queue_5_drops: 0
>>>      rx_queue_5_csum_err: 0
>>>      rx_queue_5_alloc_failed: 0
>>>      rx_queue_6_packets: 7693400503
>>>      rx_queue_6_bytes: 6803443308105
>>>      rx_queue_6_drops: 0
>>>      rx_queue_6_csum_err: 0
>>>      rx_queue_6_alloc_failed: 0
>>>      rx_queue_7_packets: 6706023451
>>>      rx_queue_7_bytes: 5768995611822
>>>      rx_queue_7_drops: 0
>>>      rx_queue_7_csum_err: 0
>>>      rx_queue_7_alloc_failed: 0
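For long `ethtool -S` dumps like the one above, a small sketch can flag non-zero drop/error counters instead of eyeballing every line. The `nicstats.txt` heredoc is a shortened, hypothetical excerpt; in practice you would use `ethtool -S eno2 > nicstats.txt`:

```shell
# Sketch: flag any non-zero drop/error/miss counter in `ethtool -S` output.
# nicstats.txt stands in for: ethtool -S eno2 > nicstats.txt
cat > nicstats.txt <<'EOF'
     rx_packets: 57185146109
     rx_missed_errors: 0
     rx_fifo_errors: 0
     rx_queue_0_drops: 0
EOF
awk -F': ' '/drop|err|miss/ && $2+0 > 0 { bad = 1; print } END { if (!bad) print "no drops or errors" }' nicstats.txt
```

With the sample excerpt this prints "no drops or errors", matching the all-zero counters in the full dump above.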
>>>
>>> Is there something I'm missing? Is there a way to analyze the
>>> problem further? Looking at the Zeek logs, everything seems fine.
>>> --
>>> Federico Foschini.
>>> _______________________________________________
>>> Zeek mailing list
>>> zeek at zeek.org
>>> http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/zeek
>>
>>
>
> --
> Federico Foschini.
>


-- 
Federico Foschini.