[Bro] So uh...how do you know which pin_cpus to use?
Azoff, Justin S
jazoff at illinois.edu
Tue Oct 18 15:28:03 PDT 2016
> On Oct 18, 2016, at 6:18 PM, Michał Purzyński <michalpurzynski1 at gmail.com> wrote:
> 2.6 kernels on Linux enumerate HT differently than 3.x and 4.x do
> Core 0 thread 0
> Core 0 thread 1
> Core 0-N on CPU 0 first half of threads
> Then CPU 1
> Then CPU 0 second half of threads
> Then CPU 1
> Results for HT vs cross numa are about to be published, soon ;)
> I don't like cache misses when CPU 1 is reaching for data on node 0 though. It is not about cross-NUMA bandwidth; it's the fact that in the worst case you have 67 ns to process the smallest packet on 10 Gbit. And an L3 hit on Ivy Bridge is at least 15 ns.
> Miss is 5x that.
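For reference, the 67 ns figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it assumes a minimum 64-byte Ethernet frame plus the standard 8-byte preamble and 12-byte inter-frame gap on the wire, and it takes the 15 ns hit and 5x miss numbers straight from the message above rather than from any measurement. The function and constant names are made up for illustration.

```python
# Per-packet time budget for minimum-size Ethernet frames at 10 Gbit/s.
# Assumption: 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap.
LINK_BPS = 10_000_000_000
MIN_FRAME_BYTES = 64
PREAMBLE_SFD_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def ns_per_min_frame(link_bps=LINK_BPS):
    """Nanoseconds between back-to-back minimum-size frames at line rate."""
    wire_bits = (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return wire_bits * 1e9 / link_bps


budget_ns = ns_per_min_frame()
l3_hit_ns = 15               # "at least 15ns", as quoted above
l3_miss_ns = 5 * l3_hit_ns   # "Miss is 5x that"

print(f"budget per packet: {budget_ns:.1f} ns")
print(f"L3 hit:  {l3_hit_ns} ns ({l3_hit_ns / budget_ns:.0%} of budget)")
print(f"L3 miss: {l3_miss_ns} ns ({l3_miss_ns / budget_ns:.0%} of budget)")
```

With those numbers a single miss already costs more than the whole per-packet budget, which is the point being made about cache locality.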
Ah! That explains a lot. I wonder if NUMA allocation changed too. We just upgraded some machines from CentOS 6 to 7, and I was wondering why the meticulously written node.cfg we had been using for months now appeared completely wrong.
I wonder if broctl should support hwloc for CPU pinning instead of taskset. I wouldn't mind having an 'auto' mode that just does the right thing.
It looks like on our dual-socket NUMA box we should be using
0,2,4,6,8,10,12,14 for one 10G card and
1,3,5,7,9,11,13,15 for the other 10G card
CPUs 0-19 are the physical cores and 20-39 are the HT siblings, but using 0,1,2,3 alternates between NUMA nodes, which is not what anyone wants.
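One way to check a layout like this without guessing is to read it from sysfs. The sketch below is not anything broctl does today. It assumes a Linux box that exposes /sys/devices/system/node/node*/cpulist and /sys/devices/system/cpu/cpu*/topology/thread_siblings_list, and the function names and the sysfs_root parameter are made up for illustration.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_by_node(sysfs_root="/sys/devices/system"):
    """Map NUMA node id -> list of logical CPU ids on that node."""
    nodes = {}
    pattern = os.path.join(sysfs_root, "node", "node[0-9]*", "cpulist")
    for path in glob.glob(pattern):
        node_id = int(re.search(r"node(\d+)$", os.path.dirname(path)).group(1))
        with open(path) as f:
            nodes[node_id] = parse_cpulist(f.read())
    return nodes


def first_threads(cpus, sysfs_root="/sys/devices/system"):
    """Keep only the lowest-numbered hyperthread of each physical core."""
    keep = []
    for cpu in cpus:
        path = os.path.join(sysfs_root, "cpu", f"cpu{cpu}",
                            "topology", "thread_siblings_list")
        try:
            with open(path) as f:
                siblings = parse_cpulist(f.read())
        except OSError:
            # No topology info for this CPU: treat it as its own core.
            siblings = [cpu]
        if cpu == min(siblings):
            keep.append(cpu)
    return keep


if __name__ == "__main__":
    for node, cpus in sorted(cpus_by_node().items()):
        cores = first_threads(cpus)
        print(f"node {node}: pin_cpus candidates = {','.join(map(str, cores))}")
```

Which node each 10G card hangs off still has to be checked separately (for example via the numa_node file under the NIC's PCI device in sysfs), so treat the output as candidates for pin_cpus rather than a finished node.cfg.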
- Justin Azoff