[Bro] BRO Logger crashing due to large DNS log files
Ron McClellan
Ron_McClellan at ao.uscourts.gov
Mon Aug 20 11:04:54 PDT 2018
Update:
Worked for almost 3 hours, but then started failing again. I even changed the log rotation to every 15 minutes and it still crashes. Any other suggestions? Has anyone ever tried configuring syslog-ng to handle the logging?
Warning: broctl config has changed (run the broctl "deploy" command)
Name         Type     Host        Status        Pid     Started
logger       logger   localhost   terminating   28295   20 Aug 12:30:03
manager      manager  localhost   running       28336   20 Aug 12:30:05
proxy-1      proxy    localhost   running       28375   20 Aug 12:30:06
worker-1-1   worker   localhost   running       28565   20 Aug 12:30:08
Thanks,
Ron
-----Original Message-----
From: bro-bounces at bro.org <bro-bounces at bro.org> On Behalf Of Ron McClellan
Sent: Monday, August 20, 2018 11:48 AM
To: Azoff, Justin S <jazoff at illinois.edu>
Cc: bro at bro.org
Subject: Re: [Bro] BRO Logger crashing due to large DNS log files
Justin,
Thanks, I turned off compression, and for 2+ hours everything has been working well. I had an idea it was related to the compression, but thought the pigz replacement would take care of that; guess not. Appreciate the help. Will let everyone know how it goes over the long term. I think you and Chris hit the nail on the head about the weird logs. I haven't really started tuning much; I wanted to get the system nice and stable first, then start tuning and looking at the weird stuff, which is heavy DNS.
Thanks Again,
Ron
[root@ current]# cat weird.log | bro-cut name | sort | uniq -c | sort -rn
34264380 dns_unmatched_msg
16696030 dns_unmatched_reply
330912 DNS_RR_unknown_type
62288 possible_split_routing
59512 data_before_established
38396 NUL_in_line
21210 inappropriate_FIN
21209 line_terminated_with_single_CR
18978 DNS_RR_length_mismatch
1852 bad_TCP_checksum
1060 dnp3_corrupt_header_checksum
922 truncated_tcp_payload
326 dnp3_header_lacks_magic
230 DNS_truncated_RR_rdlength_lt_len
92 non_ip_packet_in_ethernet
92 above_hole_data_without_any_acks
48 SYN_seq_jump
46 window_recision
46 dns_unmatched_msg_quantity
46 DNS_truncated_ans_too_short
46 DNS_RR_bad_length
46 DNS_Conn_count_too_large
46 ayiya_tunnel_non_ip
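For anyone who wants the same breakdown without bro-cut on the box, here is a rough Python equivalent of the pipeline above. It reads the #fields header of the Bro TSV log to locate the name column; the log path and column layout in the demo are assumptions, not from this thread.

```python
#!/usr/bin/env python
"""Count occurrences of the `name` field in a Bro weird.log.

A sketch of the `bro-cut name | sort | uniq -c | sort -rn` pipeline.
Assumes the standard Bro TSV format with a "#fields" header line.
"""
import os
import sys
from collections import Counter


def count_field(path, field="name"):
    counts = Counter()
    idx = None
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line.startswith("#fields"):
                # Header looks like: "#fields<TAB>ts<TAB>uid<TAB>...<TAB>name..."
                cols = line.split("\t")[1:]
                idx = cols.index(field)
            elif line.startswith("#") or not line:
                continue  # skip other header/footer lines (#separator, #close, ...)
            elif idx is not None:
                parts = line.split("\t")
                if idx < len(parts):
                    counts[parts[idx]] += 1
    return counts


if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "weird.log"
    if os.path.exists(path):
        for name, n in count_field(path).most_common():
            print("%d %s" % (n, name))
```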
-----Original Message-----
From: Azoff, Justin S <jazoff at illinois.edu>
Sent: Monday, August 20, 2018 10:31 AM
To: Ron McClellan <Ron_McClellan at ao.uscourts.gov>
Cc: bro at bro.org
Subject: Re: [Bro] BRO Logger crashing due to large DNS log files
> On Aug 19, 2018, at 11:12 AM, Ron McClellan <Ron_McClellan at ao.uscourts.gov> wrote:
>
> All,
>
> Having an issue with the bro logger crashing due to large volumes of DNS log traffic, 20-30GB an hour.
Is it actually crashing? Are you getting a crash report at all? From the filenames you listed it looks more like log rotation is failing.
> This is completely a local configuration, on a system with super-fast flash storage, 64 cores, 256GB RAM running BRO 2.5.4. If I disable DNS logging, everything works fine without issue. When I enable it, I get the results below. I thought it might be an issue with gzipping the old logs, so I replaced the standard gzip with pigz and I can manually compress the 30+ gig files in seconds, so don’t think that is the issue.
It could be related to the gzipping. The way log rotation works is not great: all log files get compressed at the same time, which can cause some thrashing.
If you set
compresslogs = 0
in broctl.cfg so that broctl does not gzip the logs at all, does the problem go away?
You could do something like that, and then run a script like:
while true; do
    for f in /usr/local/bro/logs/201*/*.log ; do
        gzip "$f"
    done
    sleep 60
done
to compress the logs in the background serially.
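The same idea can be sketched in Python if you'd rather not shell out: compress the archived logs one at a time so the disk never sees a burst of parallel gzips. The glob pattern matching the dated archive directories is an assumption based on the default broctl layout, not something confirmed in this thread.

```python
#!/usr/bin/env python
"""Serially gzip rotated Bro logs, one file at a time.

Compressing archived logs serially avoids the thrashing caused by
broctl gzipping everything at once at rotation time. The glob below
(dated directories like /usr/local/bro/logs/2018-08-20/) is an
assumption; adjust to your layout.
"""
import glob
import gzip
import os
import shutil
import time

LOG_GLOB = "/usr/local/bro/logs/201*/*.log"


def compress_one(path):
    # Stream the file into a .gz, then remove the original,
    # mimicking what `gzip path` does.
    with open(path, "rb") as src:
        with gzip.open(path + ".gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
    os.remove(path)


def compress_all(pattern=LOG_GLOB):
    for path in sorted(glob.glob(pattern)):
        compress_one(path)


if __name__ == "__main__":
    # One pass; in practice you would loop forever, e.g.:
    #   while True: compress_all(); time.sleep(60)
    compress_all()
```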
Another thing to keep an eye on is if your logger is able to keep up with the volume of data. This script is a plugin for munin, but you can run it directly:
#!/usr/bin/env python
import os
import sys
import time

DEFAULT_LOG = "/usr/local/bro/logs/current/dns.log"

def config():
    print """
graph_category network
graph_title Bro log lag
graph_vlabel lag
graph_args --base 1000 --vertical-label seconds --lower-limit 0
graph_info The bro log lag
lag.label lag
lag.info log message lag in seconds
lag.min 0
lag.warning 0:15
lag.critical 0:60
""".strip()
    return 0

def get_latest_time(fn):
    f = open(fn)
    f.seek(-4096, os.SEEK_END)
    end = f.read().splitlines()[1:-1]  # ignore possibly incomplete first and last lines
    times = [line.split()[0] for line in end]
    timestamps = map(float, times)
    latest = max(timestamps)
    return latest

def lag(fn):
    lag = 500
    for x in range(3):
        try:
            latest = get_latest_time(fn)
            now = time.time()
            lag = now - latest
            break
        except (IOError, ValueError):
            # File could be rotating, wait and try again
            time.sleep(5)
    print "lag.value %f" % lag

if __name__ == "__main__":
    filename = os.getenv("BRO_LAG_FILENAME", DEFAULT_LOG)
    if sys.argv[1:] and sys.argv[1] == 'config':
        config()
    else:
        lag(filename)
It will output something like
lag.value 2.919352
A normal value should be about 5, anything under 20 is probably ok. If it's 500 and climbing, that's a problem.
Also..
> -rw-r--r--. 1 root root 6.8G Aug 18 12:00 weird-18-08-18_11.00.00.log
> -rw-r--r--. 1 root root 2.5G Aug 18 12:18 weird-18-08-18_12.00.00.log
That's a LOT of weird.log, what's going on there?
—
Justin Azoff
_______________________________________________
Bro mailing list
bro at bro-ids.org
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro