[Bro] Questions about Bro Capabilities
Nicholas Weaver
nweaver at ICSI.Berkeley.EDU
Wed Oct 3 08:26:19 PDT 2007
On Wed, Oct 03, 2007 at 11:14:45AM -0400, Reed Porada composed:
>
> On Oct 3, 2007, at 10:36 AM, Nicholas Weaver wrote:
>
> >
> >On Wed, Oct 03, 2007 at 10:04:33AM -0400, Reed Porada composed:
> >
> >>I am working on a Traffic Generator (TG) project. Our TG has static
> >>content for webpages and fileshares. In addition, we know when our
> >>TG hosts attempt to access that data. Given those two things, I want
> >>to be able to take a network capture, run it through a system and
> >>separate out traffic that we know our TG generated, by correlating
> >>intent and traffic content, and other traffic on the network. The
> >>end goal being smaller and more relevant network captures for an
> >>analyst. In order to do this I want to try and leverage others'
> >>protocol analyzers and parsers. Bro seems to be a good choice as I
> >>believe through a policy and some pregenerated variables (based on
> >>the content and host intent) I can validate given traffic to be from
> >>our TG system, and leave the rest for others to analyze. I believe
> >>that in order to do this I need to get out of Bro the relevant
> >>packets, either packet number or timestamp. Given that information,
> >>I would be able to run it through a script that would split the pcap
> >>based on the output. The added benefit of Bro is that it does some
> >>additional analysis that could be useful for capture analysis.
> >
> >What exactly are the defining characteristics of your synthetic
> >traffic?
>
> Our synthetic traffic is not any different than if a normal user was
> on a machine generating the traffic. Meaning that we use IE to
> navigate to a page, and we use Windows File Browsing to look at
> network file shares. Our TG is designed to be run on an isolated
> network, a la DETER, thus we set up a simulated internet, and other
> simulated networks. Since we are creating these networks, we control
> server content, IP addresses, and host-names. The belief that we
> have is that since we know what our content is (i.e. what is at a
> given website, or on a given file share) and we know when we tried to
> access the given data (we have our host agents log intent), that we
> can separate out our TG traffic. In theory there is no defining
> characteristic of our synthetic traffic in the packet captures that
> we could make Bro or really any other packet analyzer look for,
> basically we do not set the evil bit. However, with the additional
> knowledge of what the content is, and what a synthetic user was
> doing, we believe we can find our traffic. After looking at the
> variables and other things that Bro policy language has, I believe I
> can construct the lookup tables for host_agent_events and
> web_content. Therefore, I believe that I can create a policy script
> to "find" our traffic. What I am not sure of is whether, from the
> policy, I can provide the information necessary to get our traffic out
> of the capture, i.e., make a smaller capture with just the non-TG traffic.
One thought:
For offline processing, do a two-pass approach. In the first pass,
you use Bro to find the TG flows based on the higher-level attributes,
and write out the flow IDs. For the second pass, keep only the
flows whose IDs are not in that list.
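
A rough sketch of what the second pass could look like, in Python with
only the standard library. Everything here is an assumption for
illustration, not something Bro provides: the flow-ID file format (one
"proto src sport dst dport" line per TG flow, written by whatever policy
script you use in the first pass), the function names, and the
restriction to classic microsecond pcap files carrying Ethernet/IPv4
with TCP or UDP. Packets it cannot parse (ARP, IPv6, truncated frames)
are kept, so the analyst still sees them.

```python
import struct

def flow_key(src, sport, dst, dport, proto):
    """Direction-independent flow ID: both directions map to one key."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)

def load_flow_ids(lines):
    """Parse first-pass output. Hypothetical format, one flow per line:
    'proto src sport dst dport', e.g. '6 10.0.0.1 1234 10.0.0.2 80'."""
    flows = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        proto, src, sport, dst, dport = line.split()
        flows.add(flow_key(src, int(sport), dst, int(dport), int(proto)))
    return flows

def packet_flow(frame):
    """Flow key of an Ethernet/IPv4 TCP or UDP frame, or None otherwise."""
    if len(frame) < 34 or frame[12:14] != b"\x08\x00":  # EtherType IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4          # IP header length in bytes
    proto = frame[23]                     # 6 = TCP, 17 = UDP
    if proto not in (6, 17) or len(frame) < 14 + ihl + 4:
        return None
    src = ".".join(map(str, frame[26:30]))
    dst = ".".join(map(str, frame[30:34]))
    sport, dport = struct.unpack("!HH", frame[14 + ihl:14 + ihl + 4])
    return flow_key(src, sport, dst, dport, proto)

def strip_flows(pcap_bytes, tg_flows):
    """Second pass: copy a classic pcap, dropping packets in tg_flows."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("not a classic microsecond pcap file")
    out = bytearray(pcap_bytes[:24])      # keep the global header as-is
    pos = 24
    while pos + 16 <= len(pcap_bytes):
        incl_len = struct.unpack(endian + "IIII", pcap_bytes[pos:pos + 16])[2]
        end = pos + 16 + incl_len
        if end > len(pcap_bytes):         # truncated final record
            break
        if packet_flow(pcap_bytes[pos + 16:end]) not in tg_flows:
            out += pcap_bytes[pos:end]    # record header + packet data
        pos = end
    return bytes(out)
```

Usage would be along the lines of reading the flow-ID file with
load_flow_ids(open("tg_flows.txt")), passing the capture's bytes to
strip_flows(), and writing the result to a new file. The key is
direction-independent so that one line per connection removes both
directions. Later IP fragments, which lack the port fields, are not
handled by this sketch and would need reassembly or a match on
addresses alone.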
--
Nicholas C. Weaver nweaver at icsi.berkeley.edu
This message has been ROT-13 encrypted twice for higher security.