[Zeek] Workers occasionally using 102% CPU

Pete Nelson petiepooo at gmail.com
Thu Feb 20 14:03:52 PST 2020


I'm seeing an interesting problem on Zeek 3.0.1 (running a stock
Security Onion sensor setup) where a worker's main thread suddenly spikes
to 100% CPU and stays there.

The base setup: Ubuntu 16.04 with kernel 4.15.0-88-generic; Zeek uses
AF_PACKET to distribute packets across 8 worker processes.  Baseline load
for each worker is around 18-22% CPU.  Running strace for 1 second on a
normal worker and summarizing with 'sort strace.out | uniq -c' looks like
this:

     31 futex(0x7f9586384a38, FUTEX_WAKE_PRIVATE, 1) = 1
     33 futex(0x7f9586384a64, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f9586384a60, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      1 futex(0x7f9586d03a0c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f9586d03a08, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
   6671 nanosleep({0, 1000}, NULL)              = 0
      1 read(12, "@", 1)                        = 1
      1 read(14, "@", 1)                        = 1
     26 read(16, "@", 1)                        = 1
   6699 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 1 (out [0], left {0, 0})
    198 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 2 (in [11], out [0], left {0, 0})
      1 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 2 (in [12], out [0], left {0, 0})
      1 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 2 (in [14], out [0], left {0, 0})
     21 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 2 (in [16], out [0], left {0, 0})
      1 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 3 (in [11 16], out [0], left {0, 0})
      1 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0} <detached ...>
    101 select(17, [6 8 10 12 14 16], [], [], {0, 0}) = 0 (Timeout)
     92 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0
Notice there are close to the same number of nanosleep calls as there are
select calls.
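
For context, that healthy pattern is what I'd expect from a zero-timeout
select() poll loop that takes a short nanosleep whenever nothing is
readable, i.e. roughly one sleep per poll.  A rough sketch of that shape
(my simplification, not Zeek's actual I/O loop):

/* Rough sketch of a zero-timeout poll loop (my simplification, not
 * Zeek's actual code) -- the shape that would give roughly equal
 * select() and nanosleep() counts when idle. */
#include <sys/select.h>
#include <time.h>

void poll_loop(int nfds, const fd_set *read_interest)
{
    const struct timespec pause = { 0, 1000 };   /* 1000 ns, as in the trace */

    for (;;) {
        fd_set readable = *read_interest;
        struct timeval zero = { 0, 0 };          /* poll, never block */
        int had_input = 0;

        if (select(nfds, &readable, NULL, NULL, &zero) > 0) {
            for (int fd = 0; fd < nfds; fd++) {
                if (FD_ISSET(fd, &readable)) {
                    had_input = 1;
                    /* ... drain and dispatch this fd ... */
                }
            }
        }

        /* Nothing readable: yield briefly instead of spinning.  If some
         * fd is reported readable on every pass, this sleep is never
         * taken and the loop pegs a core. */
        if (!had_input)
            nanosleep(&pause, NULL);
    }
}

In a loop of that shape, the only way the nanosleep count drops to zero
while select keeps being called is if some fd is reported readable on
every single pass.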

After some time - I've seen it happen anywhere from 4 minutes after
startup to several hours later - the usage suddenly spikes to 100%.
Packets continue to be processed, and the load on the remaining workers
stays about the same, as does the load on the logger, manager, and proxy.
Changing the number of worker processes doesn't seem to prevent it.  There
is no degradation in output logging, since I have enough cores to
compensate for that single hot process.  Running strace for 1 second on
the hot worker and summarizing with 'sort strace.out | uniq -c' looks like
this:

      1 futex(0x7f270ef7ea38, FUTEX_WAKE_PRIVATE, 1) = 0
     30 futex(0x7f270ef7ea38, FUTEX_WAKE_PRIVATE, 1) = 1
     35 futex(0x7f270ef7ea64, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f270ef7ea60, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      1 futex(0x7f270f900a6c, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f270f900a68, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1
      1 poll([{fd=16, events=POLLIN}], 1, -1)   = 1 ([{fd=16, revents=POLLIN}])
      1 read(12, "@", 1)                        = 1
      1 read(14, "@", 1)                        = 1
     28 read(16, "@", 1)                        = 1
      1 select(12, [10 11], [0], [0], {0, 0})   = 1 (out [0], left {0, 0})
  21703 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 2 (in [16], out [0], left {0, 0})
    141 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 3 (in [11 16], out [0], left {0, 0})
      1 select(17, [6 8 10 11 12 14 16], [0], [0], {0, 0}) = 3 (in [12 16], out [0], left {0, 0})
    109 select(17, [6 8 10 12 14 16], [], [], {0, 0}) = 1 (in [16], left {0, 0})
    106 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=127, ...}) = 0

Note the complete lack of nanosleeps and the jump in the number of select
calls.  Also notice that fds 12, 14, and 16 continue to be read, but
select now always returns with data available on fd 16.  procfs shows this
for fd 16 (and for fd 17, since that appears to be the write end of the
same pipe):

root:/proc/32327# ls -l fd/16 fd/17
lr-x------ 1 root root 64 Feb 20 20:43 fd/16 -> pipe:[48070104]
l-wx------ 1 root root 64 Feb 20 20:43 fd/17 -> pipe:[48070104]
root:/proc/32327# cat fdinfo/16
pos: 0
flags: 02004000
mnt_id: 12
root:/proc/32327# cat fdinfo/17
pos: 0
flags: 02000001
mnt_id: 12
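
For what it's worth, decoding those octal flag values (assuming the usual
x86-64 Linux constants) says fd 16 is a non-blocking, close-on-exec read
end and fd 17 is a blocking, close-on-exec write end of the same internal
pipe:

/* Decode the fdinfo flag values above (assumes x86-64 Linux octal
 * constants: O_NONBLOCK = 04000, O_CLOEXEC = 02000000). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    int flags[2] = { 02004000 /* fd 16 */, 02000001 /* fd 17 */ };

    for (int i = 0; i < 2; i++) {
        int f = flags[i];
        printf("fd %d: %s %s %s\n", i ? 17 : 16,
               (f & O_ACCMODE) == O_WRONLY ? "O_WRONLY" : "O_RDONLY",
               (f & O_NONBLOCK) ? "O_NONBLOCK" : "-",
               (f & O_CLOEXEC)  ? "O_CLOEXEC"  : "-");
    }
    return 0;
}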

lsof output shows these file descriptors in use on the main thread (and
thus on all threads):

zeek    32327 sguil    3u  a_inode     0,13        0      12847 [eventpoll]
zeek    32327 sguil    4r     FIFO     0,12      0t0   48059893 pipe
zeek    32327 sguil    5w     FIFO     0,12      0t0   48059893 pipe
zeek    32327 sguil    6r     FIFO     0,12      0t0   48059895 pipe
zeek    32327 sguil    7w     FIFO     0,12      0t0   48059895 pipe
zeek    32327 sguil    8r     FIFO     0,12      0t0   48059896 pipe
zeek    32327 sguil    9w     FIFO     0,12      0t0   48059896 pipe
zeek    32327 sguil   10u     IPv4 48059899      0t0        UDP localhost:55072->localhost:domain
zeek    32327 sguil   11u     pack 48059900      0t0        ALL type=SOCK_RAW
zeek    32327 sguil   12r     FIFO     0,12      0t0   48048835 pipe
zeek    32327 sguil   13w     FIFO     0,12      0t0   48048835 pipe
zeek    32327 sguil   14r     FIFO     0,12      0t0   48048836 pipe
zeek    32327 sguil   15w     FIFO     0,12      0t0   48048836 pipe
zeek    32327 sguil   16r     FIFO     0,12      0t0   48070104 pipe
zeek    32327 sguil   17w     FIFO     0,12      0t0   48070104 pipe
zeek    32327 sguil   18u     IPv6 48038789      0t0        TCP *:47770 (LISTEN)
zeek    32327 sguil   19u     IPv4 48038790      0t0        TCP localhost:47996->localhost:47761 (ESTABLISHED)
zeek    32327 sguil   20u     IPv4 48038791      0t0        TCP localhost:42076->localhost:47763 (ESTABLISHED)
zeek    32327 sguil   21u     IPv4 48038792      0t0        TCP localhost:34920->localhost:47762 (ESTABLISHED)

Of the 7 threads created, the 4th (judging by the thread's pid) is calling
'write(17, "@", 1)' roughly 10 times per second and getting a return value
of 1 each time.
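
That write(17, "@", 1) against the same pipe inode as fd 16 looks like a
self-pipe style wakeup: a child thread writes one flag byte to tell the
main loop it has work queued, and the main loop is supposed to see the
read end become readable, drain the byte, and process the queue.  A
minimal sketch of that pattern as I understand it from the traces (not
Zeek's actual code):

/* Minimal self-pipe wakeup sketch -- my reconstruction of the pattern
 * the traces suggest, not Zeek's implementation.  One thread flags
 * pending work by writing a byte; the main loop drains it. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <pthread.h>
#include <sys/select.h>
#include <unistd.h>

static int wake_pipe[2];   /* [0] = read end (like fd 16), [1] = write end (like fd 17) */

static void *signaller(void *arg)
{
    (void)arg;
    for (;;) {
        write(wake_pipe[1], "@", 1);   /* same as the observed write(17, "@", 1) */
        usleep(100 * 1000);            /* roughly 10 times per second, as observed */
    }
    return NULL;
}

int main(void)
{
    pipe2(wake_pipe, O_CLOEXEC);
    fcntl(wake_pipe[0], F_SETFL, O_NONBLOCK);   /* matches fdinfo flags 02004000 */

    pthread_t tid;
    pthread_create(&tid, NULL, signaller, NULL);

    for (;;) {
        fd_set rfds;
        struct timeval zero = { 0, 0 };
        FD_ZERO(&rfds);
        FD_SET(wake_pipe[0], &rfds);

        if (select(wake_pipe[0] + 1, &rfds, NULL, NULL, &zero) > 0 &&
            FD_ISSET(wake_pipe[0], &rfds)) {
            char c;
            /* If this drain ever stops keeping up with the writes (or the
             * "work pending" state never clears), the pipe stays readable
             * and a level-triggered select() fires on every call. */
            read(wake_pipe[0], &c, 1);
            /* ... handle the flagged work ... */
        }
        /* (a real loop would nanosleep here when nothing was readable) */
    }
}

If that's what's going on, the hot trace reads like the pipe (or whatever
"work pending" state it flags) is never fully drained: select() reports
fd 16 readable on all ~21,700 calls in that second, but the loop only does
28 one-byte reads, so the fd stays readable, the nanosleep path is never
taken, and the loop spins at 100% CPU.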

Any ideas what might be wrong?  Any suggestions for further diagnosis?
These sensors are in production, so I can't do much beyond a restart and
an occasional strace, and I cannot reproduce the problem under lab
conditions.

I've just upgraded 2 of my sensors from 2.6.4 to 3.0.1, and this is
happening on only one of them, but it's the more heavily loaded one.  I'm
hoping to resolve this before upgrading the remaining 10 sensors, as I
don't want to see it on the others...
--
Pete