[Bro-Dev] #408: Unique connection ID for bro
Bro Tracker
bro at tracker.icir.org
Mon Mar 7 15:53:21 PST 2011
#408: Unique connection ID for bro
-----------------------------+------------------------
Reporter: gregor | Owner:
Type: Feature Request | Status: new
Priority: Normal | Milestone:
Component: Bro | Version: git/master
Keywords: |
-----------------------------+------------------------
{{{
#!rst
This is a summary of a mail thread on bro-dev and may be worth considering
as part of the work on the new logging framework.
I was wondering whether it would make sense to assign each connection an
ID that's unique for this bro run. This ID can just be a 64-bit counter
that gets incremented on every new connection.
Why: If we add this ID to log outputs, it would be much easier to
correlate activity across logs (e.g., find the connection in http.log,
alarm.log, and conn.log, without having to match 5-tuples and timestamps).
Other use cases:
* I want to count the number of HTTP requests per connection
* I do per connection stats (e.g., number of packets, number of
bytes, retransmissions, RTTs), store them in their own log files
and then want to correlate with the conn.log or the http.log
* Easier debugging / analysis:
I can just grep for the connectionID, instead of
having to map between different connection formattings (e.g.,
notices have origIP:origPort -> respIP:respPort but when I want
to grep for them in conn.log, I have to do some awk to get there)
I think this would be a rather nice (and very easy to implement) feature.
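To illustrate the correlation use cases above, here is a minimal Python sketch. The record layout and the ``conn_id`` field name are assumptions made for illustration only; they are not part of any existing Bro log format.

```python
from collections import Counter

# Hypothetical log records. The "conn_id" field stands for the proposed
# per-connection ID and does not exist in current logs.
conn_log = [
    {"conn_id": 1, "orig": "10.0.0.1:40000", "resp": "192.0.2.7:80"},
    {"conn_id": 2, "orig": "10.0.0.2:40001", "resp": "192.0.2.7:80"},
]
http_log = [
    {"conn_id": 1, "method": "GET", "uri": "/"},
    {"conn_id": 1, "method": "GET", "uri": "/style.css"},
    {"conn_id": 2, "method": "POST", "uri": "/login"},
]


def http_requests_per_connection(http_records):
    """Count HTTP requests per connection by grouping on the shared ID."""
    return Counter(rec["conn_id"] for rec in http_records)


def join_on_conn_id(conn_records, http_records):
    """Attach the HTTP request count to each conn.log record."""
    counts = http_requests_per_connection(http_records)
    return [
        dict(rec, http_requests=counts.get(rec["conn_id"], 0))
        for rec in conn_records
    ]
```

With a shared ID the join is a plain dictionary lookup; no 5-tuple or timestamp matching is involved.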
Ideally these IDs should be **unique across bro runs**. Otherwise crunching
information from a big log archive wouldn't be much better than it
is today. But that might mean we'd need to go beyond
64-bit integers, perhaps to a string prefixed with something likely
to be unique.
Ideas on how to achieve uniqueness across Bro runs:
We can probably keep a 64 bit counter internally and also add a
bro_instance_ID, that's globally unique across Bro runs. For logging, we
can then log the 64 bit counter and the instance_ID, or concatenate the
two (I would guess that the instance_ID will be handy in other situations
too). Doesn't the cluster already have/need something like that?
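A minimal Python sketch of the counter-plus-instance-ID idea follows. The class name, method names, and the ``-`` separator are all hypothetical, and the instance ID is treated as an opaque string supplied by the caller.

```python
import itertools


class ConnectionIdAllocator:
    """Hands out per-run connection counters and formats them for logging."""

    MAX_COUNTER = 2**64 - 1  # the proposal keeps the counter within 64 bits

    def __init__(self, instance_id):
        # instance_id: opaque string assumed to be unique across runs.
        self.instance_id = instance_id
        self._counter = itertools.count(1)

    def next_counter(self):
        """Return the counter value for a newly seen connection."""
        value = next(self._counter)
        if value > self.MAX_COUNTER:
            raise OverflowError("64-bit connection counter exhausted")
        return value

    def log_form(self, counter):
        """Concatenate instance ID and counter into a single log field."""
        return f"{self.instance_id}-{counter}"
```

Keeping the two parts separate internally means only the 64-bit counter has to be stored per connection; the instance ID is added when a record is written.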
In order to generate such an instance_ID, we could:
* make sure it's truly globally unique, e.g., by using a
cryptographically secure, long (128 bit, maybe even 160 or more)
random number. Possibly from an entropy pool (can we use OpenSSL
for that?)
* the user supplies a "hostID", we can then add time and PID
and hash all that together to get the instance ID, e.g.,
md5(hostID + PID + gettimeofday())
(this should probably be fairly tolerant even if the hostID gets
reused across machines).
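Both options can be sketched in a few lines of Python. This is only an illustration: the function names are made up, ``secrets`` stands in for whatever entropy source Bro would use, and ``time.time()`` stands in for ``gettimeofday()``.

```python
import hashlib
import os
import secrets
import time


def random_instance_id(bits=128):
    """Option 1: a long random number from the OS entropy source, as hex."""
    return secrets.token_hex(bits // 8)


def hashed_instance_id(host_id, pid=None, now=None):
    """Option 2: md5(hostID + PID + current time), as hex.

    pid and now can be passed explicitly to make the result reproducible;
    by default they are taken from the running process and the clock.
    """
    pid = os.getpid() if pid is None else pid
    now = time.time() if now is None else now
    material = f"{host_id}{pid}{now!r}".encode("utf-8")
    return hashlib.md5(material).hexdigest()
```

MD5 is used here only because the text above names it; it serves to mix the inputs, not as a security measure. The second variant is only as unique as the (hostID, PID, time) combination it hashes.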
Alternatively, to save memory, we could hash the run_ID and the connection
counter into a single 64-bit number.
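A sketch of the hashing variant in Python. The choice of BLAKE2b with an 8-byte digest is an assumption made for this example (the proposal does not name a hash function), and the function name is hypothetical.

```python
import hashlib


def hashed_conn_id(run_id, counter):
    """Hash a run ID string and a connection counter into one 64-bit integer."""
    # Fixed-width counter encoding keeps the hash input unambiguous.
    material = run_id.encode("utf-8") + counter.to_bytes(8, "big")
    digest = hashlib.blake2b(material, digest_size=8).digest()
    return int.from_bytes(digest, "big")
```

The trade-off is that the result is only probabilistically unique, and neither the run ID nor the counter can be recovered from it.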
It might be nice to be able to keep the run_ID part and the counter
separate. E.g., assign 32 bits to the run_ID and the other 32 bits to the
connection counter and then concatenate them to make up the 64-bit unique
connection ID.
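The 32/32 split amounts to simple bit packing, sketched here in Python. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def pack_conn_id(run_id, counter):
    """Place a 32-bit run ID in the high half and a 32-bit counter in the low half."""
    if not 0 <= run_id <= MASK32:
        raise ValueError("run_id does not fit in 32 bits")
    if not 0 <= counter <= MASK32:
        raise ValueError("counter does not fit in 32 bits")
    return (run_id << 32) | counter


def unpack_conn_id(conn_id):
    """Split a packed 64-bit connection ID back into (run_id, counter)."""
    return (conn_id >> 32) & MASK32, conn_id & MASK32


# Example: hex(pack_conn_id(1, 2)) == '0x100000002'
```

Unlike the hashed variant, both parts stay recoverable. The cost is that a 32-bit counter holds only 2**32 (about 4.3 billion) connections per run, and a 32-bit run ID is far shorter than the 128-bit random instance ID discussed above.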
}}}
--
Ticket URL: <http://tracker.icir.org/bro/ticket/408>
Bro Tracker <http://tracker.icir.org/bro>