[Zeek] Zeek Myricom port aggregation

Justin Azoff justin at corelight.com
Wed Jun 5 11:39:43 PDT 2019


Oh, I forgot to send you the recommended configuration for 2 cards in one
box.

Most likely you don't need to merge the ports. As long as the Arista (or
whatever sits upstream) is merging the flows for you, each port is already
getting a consistent subset of flows.  At that point the on-card
aggregation doesn't do anything for you.

I would use a configuration like this:

[node-foo-card1]
interface=p1p1
lb_method=myricom
lb_procs=9
# check hwloc (e.g. lstopo) for NUMA/PCI info
pin_cpus=1,3,5,7,9,...

[node-foo-card2]
interface=p2p1
lb_method=myricom
lb_procs=9
# check hwloc (e.g. lstopo) for NUMA/PCI info
pin_cpus=2,4,6,8,...

The hardest part is using the right pin_cpus settings.  It's a little
easier if you disable hyper-threading (HT) and then use hwloc to check
which card is attached to which CPU.  Sometimes it doesn't matter much, but
on some motherboards you can match up PCI slots to physical CPUs to avoid
moving data between the NUMA nodes.
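
As a rough illustration of that check, here is a minimal Python sketch that
reads the same NUMA information from Linux sysfs rather than from hwloc.  It
assumes the interface names p1p1 and p2p1 and lb_procs=9 from the config
above.  The helper names (parse_cpulist, local_cpus, pin_cpus_line) are made
up for this example and are not part of Zeek or ZeekControl.  With HT enabled
the list will include sibling threads, so treat the output as a starting
point and not as a final pin_cpus value.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-3,8,10-11" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def local_cpus(interface, sysfs=Path("/sys")):
    """Return the CPUs on the NUMA node that the interface's PCI device reports."""
    node = int((sysfs / "class/net" / interface / "device/numa_node").read_text())
    if node < 0:
        # The kernel reports -1 when it has no NUMA affinity for the device.
        return []
    cpulist = (sysfs / "devices/system/node" / f"node{node}" / "cpulist").read_text()
    return parse_cpulist(cpulist)


def pin_cpus_line(cpus, lb_procs):
    """Format the first lb_procs CPUs as a pin_cpus= line for node.cfg."""
    return "pin_cpus=" + ",".join(str(c) for c in cpus[:lb_procs])


if __name__ == "__main__":
    # Interface names and worker count are taken from the example config;
    # adjust them for the actual box.
    for iface in ("p1p1", "p2p1"):
        try:
            print(iface, pin_cpus_line(local_cpus(iface), 9))
        except OSError as exc:
            print(iface, "could not read sysfs:", exc)
```

If the two cards report different NUMA nodes, the two lists will not overlap,
and each one can feed the pin_cpus line of the matching node section.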



On Wed, Jun 5, 2019 at 2:32 PM Greg Grasmehr <greg.grasmehr at caltech.edu>
wrote:

> Just an update:
>
> I contacted Myricom support about this issue a while back and haven't
> heard anything from them in a while, so I suspect they are unable to
> duplicate it, as in my experience they generally fix kernel problems
> very quickly.
>
> Fortunately I will be swapping drives in an array and will need to take
> Zeek down, so I will experiment for a bit before bringing it back up and
> see if I can figure out what the issue is.
>
> This kind of experimentation is very difficult when you don't have a dev
> system to test on.  :P
>
> Greg
>
> On 05/15/19 15:04:40, Greg Grasmehr wrote:
> > tcpdump works perfectly with aggregation, no issues
> >
> > On 05/15/19 17:35:56, Justin Azoff wrote:
> > > That looks like a bug in the Myricom driver and not Zeek.  Can you
> > > reproduce the same kernel issue using tcpdump?  You configure
> > > aggregation for that using SNF_FLAGS:
> > >
> > > SNF_FLAGS=0x2 (Port aggregation (or merging))
> > > Flag 0x2 says that the port number that is passed to an application is
> > > actually a mask of ports, not just one port.
> > > For example, when using tcpdump:
> > > export SNF_FLAGS=0x2
> > > env SNF_FLAGS=0x2 /path/to/tcpdump -i snf3
> > >
> > > Without SNF_FLAGS=0x2, you would actually try to open snf port 3 (which
> > > may not exist if you only have one adapter.)
> > >
> > >
> > > It's possible that you don't need to use aggregation in the first
> > > place.  That is generally only needed if you are connecting a fiber
> > > tap directly into a card.  If flows are being load balanced across
> > > multiple ports you can just run two different sets of workers, one for
> > > each port.
> > >
> > > On Wed, May 15, 2019 at 2:17 PM Greg Grasmehr <greg.grasmehr at caltech.edu> wrote:
> > > >
> > > > Hello,
> > > >
> > > > Hoping someone has some insight into whatever I am doing wrong, as try as
> > > > I might, I can't seem to get the Myricom plugin working if configured to
> > > > aggregate port data.  Zeek starts and then crashes in every case,
> > > > regardless of configuration, i.e.:
> > > >
> > > > interface=myricom::3
> > > > interface=myricom::*
> > > >
> > > > and snf_aggregate = T
> > > >
> > > > Here is related dmesg output logged by kdump
> > > >
> > > > [67471.838822] BUG: unable to handle kernel paging request at 00007f0d8459607f
> > > > [67471.838863] IP: [<ffffffffc0bed569>] snf_eop_ioctl+0x609/0xc60 [myri_snf]
> > > > [67471.838897] PGD 8000000a93bb9067 PUD 12d142c067 PMD 12d142d067 PTE 8000001d54829025
> > > > [67471.838927] Oops: 0001 [#1] SMP
> > > > [67471.838942] Modules linked in: binfmt_misc macsec tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag myri_snf(OE) mpt2sas raid_class scsi_transport_sas mptctl mptbase ip6t_rpfilter ipt_REJECT nf_reject_ipv4 nf_log_ipv4 ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common xt_LOG xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dell_rbu sunrpc dcdbas iTCO_wdt iTCO_vendor_support sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul joydev
> > > > [67471.839241]  ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd mxm_wmi ext4 mbcache jbd2 pcspkr ipmi_ssif mei_me lpc_ich mei sg ipmi_si ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel drm_panel_orientation_quirks ahci libahci dca libata tg3 megaraid_sas ptp pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: myri10ge]
> > > > [67471.839450] CPU: 24 PID: 92952 Comm: bro Kdump: loaded Tainted: G           OE  ------------   3.10.0-957.10.1.el7.x86_64 #1
> > > > [67471.839483] Hardware name: Dell Inc. PowerEdge R730xd/072T6D, BIOS 2.9.1 12/04/2018
> > > > [67471.839508] task: ffff95d0e7c41040 ti: ffff95e3197c0000 task.ti: ffff95e3197c0000
> > > > [67471.839531] RIP: 0010:[<ffffffffc0bed569>]  [<ffffffffc0bed569>] snf_eop_ioctl+0x609/0xc60 [myri_snf]
> > > > [67471.839564] RSP: 0018:ffff95e3197c3d38  EFLAGS: 00010006
> > > > [67471.839583] RAX: 0000000000000286 RBX: 0000000000000001 RCX: 0000000000000000
> > > > [67471.839605] RDX: ffff95d0526253d0 RSI: 00007f0d84596000 RDI: ffffb70f589ba7f8
> > > > [67471.839627] RBP: ffff95e3197c3df8 R08: ffffb70f599bb000 R09: 0000000000000003
> > > > [67471.839648] R10: 0000000000000000 R11: 0000000000000000 R12: ffff95d052625000
> > > > [67471.839670] R13: 00007ffeb542d710 R14: 00007ffeb542d710 R15: 0000000000000000
> > > > [67471.839693] FS:  00007f180d6a7900(0000) GS:ffff95eefe900000(0000) knlGS:0000000000000000
> > > > [67471.839717] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [67471.839735] CR2: 00007f0d8459607f CR3: 0000001ff663c000 CR4: 00000000003607e0
> > > > [67471.839757] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > [67471.839778] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > [67471.839800] Call Trace:
> > > > [67471.839818]  [<ffffffff98d67ef2>] ? down_read+0x12/0x40
> > > > [67471.839840]  [<ffffffffc0bdfba0>] mx_common_ioctl+0x40/0x90 [myri_snf]
> > > > [67471.839865]  [<ffffffffc0bd44e2>] mx_ioctl+0x72/0x290 [myri_snf]
> > > > [67471.839888]  [<ffffffff98856880>] do_vfs_ioctl+0x3a0/0x5a0
> > > > [67471.839908]  [<ffffffff98d70608>] ? __do_page_fault+0x228/0x500
> > > > [67471.839928]  [<ffffffff98856b21>] SyS_ioctl+0xa1/0xc0
> > > > [67471.839947]  [<ffffffff98d75ddb>] system_call_fastpath+0x22/0x27
> > > > [67471.839966] Code: d3 e6 44 85 ce 74 e1 48 83 bf b8 00 00 00 00 75 d1 4c 8b 87 c0 00 00 00 4c 63 d9 41 8b 70 04 48 c1 e6 09 4b 03 b4 dc c0 06 00 00 <0f> b6 76 7f 41 39 30 75 b4 4c 89 a7 b8 00 00 00 49 89 bc 24 60
> > > > [67471.840084] RIP  [<ffffffffc0bed569>] snf_eop_ioctl+0x609/0xc60 [myri_snf]
> > > > [67471.840112]  RSP <ffff95e3197c3d38>
> > > > [67471.840125] CR2: 00007f0d8459607f
> > > >
> > > > _______________________________________________
> > > > Zeek mailing list
> > > > zeek at zeek.org
> > > > http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/zeek
> > >
> > >
> > >
> > > --
> > > Justin
>


-- 
Justin