[Bro] Help with searching logs

Schoenefeld, Keith P. Keith_Schoenefeld at baylor.edu
Wed Apr 3 14:58:26 PDT 2013


This may be unrelated (or even something I'm doing terribly wrong), but I noticed what I think is a similar issue when working with the capture-loss script that led me to believe that the underlying code in Bro may not be properly dealing with it when the 32 bit sequence number in a TCP packet rolls over.  I spoke briefly to Seth about it, and he asked me to provide the .bro code I was using as well as a pcap demonstrating my concerns -- I failed to do either.  The modified capture_loss script I'm running results in:

#separator \x09
#set_separator  ,
#empty_field    (empty)
#unset_field    -
#path   capture_loss
#open   2013-04-03-21-00-08
#fields ts      ts_delta        peer    gaps    acks    percent_lost    gap_bytes       ack_bytes       percent_bytes_lost
#types  time    interval        string  count   count   string  count   count   string
1365022809.033023       60.000027       worker-4-4      3       170598  0.002   4239898009      4652853950      91.125
1365022809.059350       60.000047       worker-1-1      0       129786  0.000   0       301416506       0.000
1365022808.638408       60.000054       worker-2-1      1       143234  0.001   1611842887      1956581780      82.381
1365022808.885606       60.000046       worker-3-4      1       144122  0.001   1350348978      1680367485      80.360
1365022809.328350       60.000060       worker-3-3      0       148226  0.000   0       364927843       0.000
1365022809.292290       60.000081       worker-4-2      1       118706  0.001   1609627515      1900369040      84.701

I'll try to throw something together and get it submitted.

-- KS

Keith Schoenefeld
Information Security Analyst
Baylor University
254-710-6667


-----Original Message-----
From: bro-bounces at bro.org [mailto:bro-bounces at bro.org] On Behalf Of Castle, Shane
Sent: Wednesday, April 03, 2013 3:48 PM
To: 'Michael Bower'
Cc: 'Bro Mailing List'
Subject: Re: [Bro] Help with searching logs

No, not really. I discovered that Bro is not reporting the bytes correctly. That makes these numbers totally suspect. I wonder what's going on there.

scastle at nsm:~/scripts$ zcat /nsm/bro/logs/2013-04-03/conn.19:00:00-20:00:00.log.gz | bro-cut -d ts id.orig_h id.resp_h orig_bytes resp_bytes | grep 192.168.56.166.23.7.65.224
2013-04-03T19:21:28+0000        192.168.56.166  23.7.65.224     0       1200242775
2013-04-03T19:21:28+0000        192.168.56.166  23.7.65.224     0       1214734459
2013-04-03T19:21:33+0000        192.168.56.166  23.7.65.224     540     3920
2013-04-03T19:21:51+0000        192.168.56.166  23.7.65.224     0       1214734460
2013-04-03T19:21:40+0000        192.168.56.166  23.7.65.224     0       1200242776
2013-04-03T19:21:28+0000        192.168.56.166  23.7.65.224     1112    1200442670
2013-04-03T19:21:33+0000        192.168.56.166  23.7.65.224     1270    4325
2013-04-03T19:24:42+0000        192.168.56.166  23.7.65.224     0       0
2013-04-03T19:24:41+0000        192.168.56.166  23.7.65.224     0       0
2013-04-03T19:24:46+0000        192.168.56.166  23.7.65.224     572     3920
2013-04-03T19:24:47+0000        192.168.56.166  23.7.65.224     572     3920
2013-04-03T19:24:42+0000        192.168.56.166  23.7.65.224     1105    405
2013-04-03T19:24:41+0000        192.168.56.166  23.7.65.224     504     26430
2013-04-03T19:21:28+0000        192.168.56.166  23.7.65.224     0       3067067778

Looks like resp_bytes is not being properly shown sometimes. Hmm, missed_bytes seems to be large here, too. Sigh - I still don't know what's going on. If missed_bytes is nonzero, the orig and resp bytes can't be trusted. More work and research.

Here's the unfiltered output:
scastle at nsm:~/scripts$ zcat /nsm/bro/logs/2013-04-03/conn.19:00:00-20:00:00.log.gz | bro-cut -d | grep 192.168.56.166.\*23.7.65.224
2013-04-03T19:21:28+0000        YTOQlbbx1Pg     192.168.56.166  64878   23.7.65.224     80      tcp     -       6.655300        0       1200242775      SF      T       1200242775      ShhhhAFf        14      788     8       472     (empty)
2013-04-03T19:21:28+0000        bLfZjXnY29h     192.168.56.166  64888   23.7.65.224     80      tcp     -       6.284761        0       1214734459      SF      T       1214734459      ShhhAhFfr       15      840     10      572     (empty)
2013-04-03T19:21:33+0000        zs1xf4L3iZ6     192.168.56.166  64906   23.7.65.224     443     tcp     ssl     5.754741        540     3920    SF      T       0       ShADadFfRR      16      1372    12      4568    (empty)
2013-04-03T19:21:51+0000        ramah8M2Oc1     192.168.56.166  64888   23.7.65.224     80      tcp     -       24.008354       0       1214734460      RSTR    T       0       hArR    2       92      3       160     (empty)
2013-04-03T19:21:40+0000        JObupHqiDjl     192.168.56.166  64878   23.7.65.224     80      tcp     -       36.214781       0       1200242776      RSTR    T       0       hArR    3       144     5       260     (empty)
2013-04-03T19:21:28+0000        XkrirkTDGzk     192.168.56.166  64876   23.7.65.224     80      tcp     http    113.531073      1112    1200442670      SF      T       1200442265      ShhhADadhfF     20      2212    15      1273    (empty)
2013-04-03T19:21:33+0000        PtKQpijzTff     192.168.56.166  64905   23.7.65.224     443     tcp     ssl     116.851227      1270    4325    SF      T       0       ShADadFfRR      18      2206    13      5025    (empty)
2013-04-03T19:24:42+0000        tAwhU1nQ8e8     192.168.56.166  65044   23.7.65.224     80      tcp     -       7.372079        0       0       SF      T       0       ShAFfar 6       324     5       268     (empty)
2013-04-03T19:24:41+0000        wWH8h7RtYg6     192.168.56.166  65003   23.7.65.224     80      tcp     -       8.491360        0       0       SF      T       0       ShAFfar 6       324     5       268     (empty)
2013-04-03T19:24:46+0000        lsIG9TsJ0Ti     192.168.56.166  65081   23.7.65.224     443     tcp     ssl     6.057423        572     3920    SF      T       0       ShADadFfR       18      1532    12      4584    (empty)
2013-04-03T19:24:47+0000        TXZBimJtLN6     192.168.56.166  65083   23.7.65.224     443     tcp     ssl     5.903914        572     3920    SF      T       0       ShADadFfR       18      1532    12      4584    (empty)
2013-04-03T19:24:42+0000        qtKgUBFlRNa     192.168.56.166  65041   23.7.65.224     80      tcp     http    96.772052       1105    405     SF      T       0       ShADadfF        8       1533    7       1587    (empty)
2013-04-03T19:24:41+0000        KmKMWDZ7Kk2     192.168.56.166  65001   23.7.65.224     80      tcp     http    104.511714      504     26430   SF      T       0       ShADadfF        18      1452    24      29134   (empty)
2013-04-03T19:21:28+0000        R4eSKC00lqe     192.168.56.166  64879   23.7.65.224     80      tcp     -       46.186418       0       3067067778      SF      T       0       ShhAFfarR       17      932     15      872     (empty)

-- 
Shane Castle
Data Security Mgr, Boulder County IT


-----Original Message-----
From: bro-bounces at bro.org [mailto:bro-bounces at bro.org] On Behalf Of Castle, Shane
Sent: Wednesday, April 03, 2013 14:15
To: 'Michael Bower'
Cc: 'Bro Mailing List'
Subject: Re: [Bro] Help with searching logs

Just for grins, I worked on the conversation angle a bit. Here's what I came up with as a first hack.

$ bro-cut id.orig_h id.resp_h orig_bytes resp_bytes </nsm/bro/logs/current/conn.log | awk -f convlist.awk | sort -rnk 2 | head -20

convlist.awk:
{
tot=0;
if ($3 != "-") {tot=$3};
if ($4 != "-") {tot+=$4};
print $1".."$2, tot;
}

Output:
166.213.168.9..192.168.11.30 6369575986
166.213.168.8..192.168.11.30 4304514612
192.168.61.21..193.120.199.16 4294967294
70.208.5.91..192.168.13.45 3532074861
192.168.56.166..23.7.65.224 3067067778
166.213.168.22..192.168.11.30 1981760165
174.232.193.204..172.31.251.32 1934543198
166.213.168.22..192.168.11.30 1853700164
166.213.168.22..192.168.11.30 1848753206
166.213.168.22..192.168.11.30 1757683563
166.213.168.22..192.168.11.30 1706006024
166.213.168.22..192.168.11.30 1657117082
166.213.168.22..192.168.11.30 1612028622
166.213.168.22..192.168.11.30 1607600859
166.213.168.22..192.168.11.30 1584767941
166.213.168.22..192.168.11.30 1543389093
166.213.168.22..192.168.11.30 1533513278
166.213.168.22..192.168.11.30 1467372828
166.213.168.22..192.168.11.30 1254028676
192.168.56.166..23.7.65.224 1214734460

I used ".." to join the two IP addresses together so if I wanted to I could extract and sort without changing the original sort command very much. Essentially, this shows the top 20 separate conversations since the current conn.log was started. Notice that several of them have the same endpoints. If I start summarizing using "sort -t '.' <etc>", then I get

$ bro-cut id.orig_h id.resp_h orig_bytes resp_bytes </nsm/bro/logs/current/conn.log | awk -f convlist.awk | sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n -k 6,6n -k 7,7n -k 8,8n -k 9,9n | awk -f toptalk.awk | sort -rnk 2 | head -20
166.213.168.22..192.168.11.30 45782918899
166.213.168.9..192.168.11.30 14376535154
192.168.56.166..23.7.65.224 9097513513
172.31.251.13..218.30.26.68 5826064097
172.31.251.171..192.168.14.81 4321601956
166.213.168.8..192.168.11.30 4304514612
192.168.61.21..193.120.199.16 4294967294
70.208.5.91..192.168.13.45 3532111771
192.168.21.103..23.7.65.224 3059490250
166.213.168.30..192.168.11.30 2285013919
166.137.181.202..192.168.13.45 2083142285
174.232.193.204..172.31.251.32 1934845607
172.31.251.13..218.30.26.70 1198383801
192.168.226.96..72.21.81.253 1175467069
161.97.154.10..172.31.251.32 1137167682
174.239.96.134..192.168.13.45 920925068
192.168.171.25..204.107.64.58 488202352
172.31.251.71..192.168.12.116 477085260
192.168.6.97..74.125.225.161 459930229
172.31.251.13..192.168.12.73 412551956

So, it can be done. Interesting exercise. Note that the "toptalk.awk" script referenced in the command is simply the awk script from before (see below in this email thread), created as a separate file.

Just so you know, the third line from the top in the output is to a Twitter address. I need to find out wtf is going on there. 9GB from/to Twitter? Really?

-- 
Shane Castle
Data Security Mgr, Boulder County IT

-----Original Message-----
From: Michael Bower [mailto:mbower2 at gmail.com] 
Sent: Wednesday, April 03, 2013 09:35
To: Castle, Shane
Cc: Bro Mailing List
Subject: RE: [Bro] Help with searching logs

Thanks! This will give me something to go on.

On Apr 3, 2013 11:30 AM, "Castle, Shane" <scastle at bouldercounty.org> wrote:


	Seems as though an awk update tightened some of the syntax restrictions. This script works:
	
	bro-cut id.orig_h orig_bytes < conn.log             \
	    | sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n   \
	    | awk 'BEGIN { size=0;host="" }                 \
	           { if (host != $1) {                      \
	                 if (size != 0)                     \
	                     print host, size;              \
	                  host=$1;                          \
	                  if ($2 != "-") {                  \
	                     size=$2 }                      \
	                  else {                            \
	                     size=0; }                      \
	              } else {                              \
	                  if ($2 != "-")                    \
	                     size += $2 }                   \
	            }                                       \
	            END {                                   \
	                if (size != 0)                      \
	                     print host, size               \
	                }'                                  \
	    | sort -rnk 2                                   \
	    | head -n 10
	
	The conversations could be done by a script that takes conn.log as input, merges orig_h and resp_h into one field while adding together their bytes, sorting on that, then putting the result through the same awk script. The output might look something like this:
	
	1.2.3.4|5.6.7.8 123456789
	...
	
	depending on what you choose for your conjoining character. Also, note that if there are multiple conversations between two IP addresses then you will be adding up all those conversations and presenting the result as one line - that might not be exactly what you are looking for.
	
	Welcome to the world of IDS data mining. Sharpen your awk/sort/sql/perl/bash skills - they will come in very handy.
	
	--
	Shane Castle
	Data Security Mgr, Boulder County IT
	
	
	-----Original Message-----
	From: bro-bounces at bro.org [mailto:bro-bounces at bro.org] On Behalf Of Castle, Shane
	Sent: Wednesday, April 03, 2013 08:59
	To: 'Michael Bower'; 'bro at bro.org'
	Subject: Re: [Bro] Help with searching logs
	
	Hm, I get a syntax error in that script now. Let me figure this out...
	
	--
	Shane Castle
	Data Security Mgr, Boulder County IT
	
	
	-----Original Message-----
	From: bro-bounces at bro.org [mailto:bro-bounces at bro.org] On Behalf Of Castle, Shane
	Sent: Wednesday, April 03, 2013 08:31
	To: 'Michael Bower'; 'bro at bro.org'
	Subject: Re: [Bro] Help with searching logs
	
	The script is lying to you. Here's the correct script:
	
	bro-cut id.orig_h orig_bytes < conn.log             \
	    | sort -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n   \
	    | awk 'BEGIN { size=0;host="" }                 \
	           { if (host != $1) {                      \
	                 if (size != 0)                     \
	                     print host, size;              \
	                  host=$1;                          \
	                  if ($2 != "-")                    \
	                     size=$2                        \
	                  else                              \
	                     size=0                         \
	              } else                                \
	                  if ($2 != "-")                    \
	                     size += $2                     \
	            }                                       \
	            END {                                   \
	                if (size != 0)                      \
	                     print host, size               \
	                }'                                  \
	    | sort -rnk 2                                   \
	    | head -n 10
	
	Since this script summarizes, having timestamps in there would not be useful. And, if you want to change the logic to responder, change "orig" to "resp" in the first line. Yes, it would be nice to have the top conversations, not just the top talkers, which would combine both orig and resp, but I'm not sure the result would justify the work.
	
	A few months ago I went through this and we were hoping that the doc would have been changed to show a correct script, but it has not been, apparently.
	
	--
	Shane Castle
	Data Security Mgr, Boulder County IT
	
	-----Original Message-----
	From: bro-bounces at bro.org [mailto:bro-bounces at bro.org] On Behalf Of Michael Bower
	Sent: Tuesday, April 02, 2013 18:19
	To: bro at bro.org
	Subject: [Bro] Help with searching logs
	
	-----BEGIN PGP SIGNED MESSAGE-----
	Hash: SHA512
	
	
	Im still learning, so bare with me.  I ran the following command:
	
	bro-cut id.orig_h orig_bytes < conn.log             \
	    | sort                                          \
	    | awk '{ if (host != $1) {                      \
	                 if (size != 0)                     \
	                     print $1, size;                \
	                  host=$1;                          \
	                  size=0                            \
	              } else                                \
	                  size += $2                        \
	            }                                       \
	            END {                                   \
	                if (size != 0)                      \
	                     print $1, size                 \
	                }'                                  \
	    | sort -rnk 2                                     \
	    | head -n 10
	
	
	This worked well to show me the top 10 hosts (originators).  What Im
	trying to do is show the top 10 hosts and the time (ts).  Maybe show
	the resp_bytes field too, if that is possible.  Any help would be
	greatly appreciated.
	
	Thanks!
	- --
	
	Mike
	
	
	-----BEGIN PGP SIGNATURE-----
	Comment: GPGTools - http://gpgtools.org
	
	iQEcBAEBCgAGBQJRW3WBAAoJEIAKCPjZh/yXUF4H/RhFuVQy6bT3Z8Z1k2oMDBGt
	TYFAfsyeXcnf9dOl3NFGEIlifjDMZ/gK5kBVWo/FYSHGWHrCT0+ICcsjwLroRP/E
	rn1StPS7ozlSiY2ZJSG0UAUCZX0HZ0ujvmNo8UvnoAR75cORq8Y08cU4XpLjqhxc
	d4xu3G+HnhyzjKAiG6xtqDpK2Z3bnjJzyWEqZCoYDzNqtcYnrxYjcKa0kX9rBhUr
	uV6upZ9OHIdf25EYCVfjDrKPSUaRhSAnTVtYE0+OQRA0OPpnG3rLWFSK2yjkTbNG
	AzKXfhJZ0PWmUWkeD6Bzf2TCNqfyLchNSScm2atA/dhTRBV3JhHIhwIcejXr6sk=
	=23Kd
	-----END PGP SIGNATURE-----
	_______________________________________________
	Bro mailing list
	bro at bro-ids.org
	http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
	
	_______________________________________________
	Bro mailing list
	bro at bro-ids.org
	http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
	
	_______________________________________________
	Bro mailing list
	bro at bro-ids.org
	http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro
	


_______________________________________________
Bro mailing list
bro at bro-ids.org
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro

_______________________________________________
Bro mailing list
bro at bro-ids.org
http://mailman.ICSI.Berkeley.EDU/mailman/listinfo/bro




More information about the Bro mailing list