[Bro] BRO Logger crashing due to large DNS log files

Ron McClellan Ron_McClellan at ao.uscourts.gov
Mon Aug 20 11:04:54 PDT 2018


Update:

	Worked for almost 3 hours, but then started failing again.  I even changed the log rotation to every 15 minutes and it still crashes.  Any other suggestions?  Has anyone ever tried to configure syslog-ng to handle the logging?


Warning: broctl config has changed (run the broctl "deploy" command)
Name         Type    Host             Status    Pid    Started
logger       logger  localhost        terminating 28295  20 Aug 12:30:03
manager      manager localhost        running   28336  20 Aug 12:30:05
proxy-1      proxy   localhost        running   28375  20 Aug 12:30:06
worker-1-1   worker  localhost        running   28565  20 Aug 12:30:08


Thanks,

Ron 


-----Original Message-----
From: bro-bounces at bro.org <bro-bounces at bro.org> On Behalf Of Ron McClellan
Sent: Monday, August 20, 2018 11:48 AM
To: Azoff, Justin S <jazoff at illinois.edu>
Cc: bro at bro.org
Subject: Re: [Bro] BRO Logger crashing due to large DNS log files

Justin,

	Thanks, I turned off compression and so far, for 2+ hours, everything is working well.  I kinda had an idea it was related to the compression, but thought the pigz replacement would take care of that, guess not.  Appreciate the help.  Will let everyone know how it goes over the long term.  I think you and Chris hit the nail on the head about the weird logs.  I haven't really started tuning much, wanted to get the system nice and stable first and then start tuning and looking at the weird stuff, which is heavy DNS.

Thanks Again,

Ron



[root@ current]# cat weird.log | bro-cut name|sort|uniq  -c|sort -rn
34264380 dns_unmatched_msg
16696030 dns_unmatched_reply
 330912 DNS_RR_unknown_type
  62288 possible_split_routing
  59512 data_before_established
  38396 NUL_in_line
  21210 inappropriate_FIN
  21209 line_terminated_with_single_CR
  18978 DNS_RR_length_mismatch
   1852 bad_TCP_checksum
   1060 dnp3_corrupt_header_checksum
    922 truncated_tcp_payload
    326 dnp3_header_lacks_magic
    230 DNS_truncated_RR_rdlength_lt_len
     92 non_ip_packet_in_ethernet
     92 above_hole_data_without_any_acks
     48 SYN_seq_jump
     46 window_recision
     46 dns_unmatched_msg_quantity
     46 DNS_truncated_ans_too_short
     46 DNS_RR_bad_length
     46 DNS_Conn_count_too_large
     46 ayiya_tunnel_non_ip	
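
With tens of millions of weird.log entries, the same tally can also be taken in a single pass, without sorting every line first. A minimal Python sketch, assuming the log is in Bro's default tab-separated format with a "#fields" header line; the function name count_field is made up for illustration and is not part of Bro or bro-cut:

```python
from collections import Counter


def count_field(lines, field="name"):
    """Tally the values of one column of a Bro TSV log in a single pass."""
    counts = Counter()
    index = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # Header row looks like "#fields<TAB>ts<TAB>uid<TAB>...".
            # Drop the "#fields" token and locate the wanted column.
            columns = line.split("\t")[1:]
            index = columns.index(field)
            continue
        if not line or line.startswith("#") or index is None:
            # Skip other metadata lines and anything before the header.
            continue
        parts = line.split("\t")
        if index < len(parts):
            counts[parts[index]] += 1
    return counts
```

Passing an open file object for weird.log and printing count_field(fh).most_common() should give the same ordering as the pipeline above, with memory use proportional to the number of distinct names rather than the number of lines.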






-----Original Message-----
From: Azoff, Justin S <jazoff at illinois.edu>
Sent: Monday, August 20, 2018 10:31 AM
To: Ron McClellan <Ron_McClellan at ao.uscourts.gov>
Cc: bro at bro.org
Subject: Re: [Bro] BRO Logger crashing due to large DNS log files

> On Aug 19, 2018, at 11:12 AM, Ron McClellan <Ron_McClellan at ao.uscourts.gov> wrote:
> 
> All,
>  
>                 Having an issue with the bro logger crashing due to large volumes of DNS log traffic, 20-30GB an hour.

Is it actually crashing?  Are you getting a crash report at all?  From the filenames you listed it looks more like log rotation is failing.

>  This is completely a local configuration, on a system with super-fast flash storage, 64 cores, 256GB RAM running BRO 2.5.4.  If I disable DNS logging, everything works fine without issue.  When I enable it, I get the results below.  I thought it might be an issue with gzipping the old logs, so I replaced the standard gzip with pigz and I can manually compress the 30+ gig files in seconds, so don’t think that is the issue.

It could be related to the gzipping.  The way log rotation works is not great.. all log files get compressed at the same time which can cause some thrashing.

If you set

    compresslogs = 0

in broctl.cfg so that broctl does not gzip the logs at all, does the problem go away?

You could do something like that, and then run a script like:

while true; do
    for f in /usr/local/bro/logs/201*/*.log ; do
        # Skip the literal pattern when the glob matches nothing
        [ -e "$f" ] || continue
        gzip "$f"
    done
    sleep 60
done

to compress the logs in the background serially.
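
The same one-file-at-a-time idea can be sketched in Python. This is only an illustration: compress_serially is a hypothetical helper name, and it uses Python's built-in gzip module rather than the gzip or pigz binary, so expect it to be slower than pigz on 30GB files. It writes to a temporary name first so a half-written archive is never mistaken for a finished one.

```python
import glob
import gzip
import os
import shutil


def compress_serially(pattern):
    """Gzip each file matching `pattern`, one at a time; return the new paths."""
    done = []
    for path in sorted(glob.glob(pattern)):
        target = path + ".gz"
        tmp = target + ".tmp"
        # Stream the file through gzip in chunks instead of loading it whole.
        with open(path, "rb") as src, gzip.open(tmp, "wb") as dst:
            shutil.copyfileobj(src, dst)
        os.rename(tmp, target)  # publish the finished archive
        os.remove(path)         # then drop the uncompressed original
        done.append(target)
    return done
```

Calling compress_serially("/usr/local/bro/logs/201*/*.log") once a minute would mirror the shell loop above.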

Another thing to keep an eye on is if your logger is able to keep up with the volume of data.  This script is a plugin for munin, but you can run it directly:

#!/usr/bin/env python
import os
import sys
import time

DEFAULT_LOG = "/usr/local/bro/logs/current/dns.log"

def config():
    print """
graph_category network

graph_title Bro log lag
graph_vlabel lag
graph_args --base 1000 --vertical-label seconds --lower-limit 0
graph_info The bro log lag

lag.label lag
lag.info log message lag in seconds
lag.min 0
lag.warning 0:15
lag.critical 0:60
""".strip()

    return 0

def get_latest_time(fn):
    f = open(fn)

    f.seek(-4096, os.SEEK_END)
    end = f.read().splitlines()[1:-1] #ignore possibly incomplete first and last lines
    times = [line.split()[0] for line in end]
    timestamps = map(float, times)
    latest = max(timestamps)
    return latest

def lag(fn):
    lag = 500
    for x in range(3):
        try :
            latest = get_latest_time(fn)
            now = time.time()
            lag = now - latest
            break
        except (IOError, ValueError):
            #File could be rotating, wait and try again
            time.sleep(5)
    print "lag.value %f" % lag

if __name__ == "__main__":

    filename = os.getenv("BRO_LAG_FILENAME", DEFAULT_LOG)

    if sys.argv[1:] and sys.argv[1] == 'config':
        config()
    else:
        lag(filename)

It will output something like

lag.value 2.919352

A normal value should be about 5, anything under 20 is probably ok.  If it's 500 and climbing, that's a problem.
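
For anyone running this under Python 3, where the print statements and the text-mode seek from the end of the file will not work as written, here is a rough sketch of the same tail-read idea. The names latest_timestamp and classify_lag are invented for this example. The default thresholds are taken from the munin config in the script (warning above 15, critical above 60) and are a starting point, not a recommendation. Unlike the script, this version also skips "#" metadata lines.

```python
import os


def latest_timestamp(path, tail_bytes=4096):
    """Return the newest leading timestamp in the last `tail_bytes` of a log.

    Raises ValueError if no complete data line is found in the tail.
    """
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        size = fh.tell()
        # Binary mode allows arbitrary seeks; clamp for files under tail_bytes.
        fh.seek(max(0, size - tail_bytes))
        lines = fh.read().splitlines()
    stamps = []
    # Ignore the first and last lines, which may be incomplete.
    for line in lines[1:-1]:
        if not line or line.startswith(b"#"):
            continue
        stamps.append(float(line.split()[0].decode("ascii")))
    return max(stamps)


def classify_lag(lag, warning=15.0, critical=60.0):
    """Map a lag in seconds onto ok / warning / critical."""
    if lag > critical:
        return "critical"
    if lag > warning:
        return "warning"
    return "ok"
```

The lag itself is then time.time() - latest_timestamp(path), wrapped in the same retry-on-error loop as the original to cope with rotation.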

Also..

> -rw-r--r--. 1 root root 6.8G Aug 18 12:00 weird-18-08-18_11.00.00.log 
> -rw-r--r--. 1 root root 2.5G Aug 18 12:18 weird-18-08-18_12.00.00.log

That's a LOT of weird.log, what's going on there?

—
Justin Azoff



_______________________________________________
Bro mailing list
bro at bro-ids.org
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro


