[Bro] File extraction after checking hash.

fatema bannatwala fatema.bannatwala at gmail.com
Tue Oct 4 07:42:59 PDT 2016

I think the following could be used, to some extent, for crude analysis of a
file on the wire (please correct me if I'm wrong):

event: file_extraction_limit
Type: event (f: fa_file, args: any, limit: count, len: count)
Desc: This event is generated when a file extraction analyzer is about to
exceed the maximum permitted file size allowed by the extract_limit field
of Files::AnalyzerArgs. The analyzer is automatically removed from file f.

Type: function (f: fa_file, tag: Files::Tag, args: Files::AnalyzerArgs
&default=[chunk_event=<uninitialized>,
extract_limit=104857600] &optional) : bool

Type: function (f: fa_file) : bool
Desc: Stops/ignores any further analysis of a given file.
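
To make the trade-off concrete, here is a rough Python sketch (not Bro script, and not how Bro implements extraction) of the "extract up front, decide afterwards" approach Seth describes in the quoted message below, combined with a size cap in the spirit of extract_limit. The function name, return values, and the use of SHA-1 are assumptions made only for illustration.

```python
import hashlib
import os
import tempfile


def extract_then_decide(chunks, wanted_hashes, extract_limit, out_dir):
    """Write chunks to disk while hashing them incrementally.

    Returns ("kept", path) if the final SHA-1 is in wanted_hashes,
    ("deleted", None) if it is not, and ("limit", None) if the data
    would exceed extract_limit bytes (hypothetical status names).
    """
    digest = hashlib.sha1()
    written = 0
    over_limit = False

    fd, path = tempfile.mkstemp(dir=out_dir)
    with os.fdopen(fd, "wb") as out:
        for chunk in chunks:
            # Comparable in spirit to file_extraction_limit firing:
            # stop before the cap would be exceeded.
            if written + len(chunk) > extract_limit:
                over_limit = True
                break
            out.write(chunk)
            digest.update(chunk)  # only the running hash state stays in memory
            written += len(chunk)

    if over_limit:
        os.remove(path)
        return "limit", None

    # The hash is only known here, after the last byte was seen.
    if digest.hexdigest() in wanted_hashes:
        return "kept", path
    os.remove(path)
    return "deleted", None
```

The point of the sketch is that the keep-or-delete decision can only happen after the last chunk, so the bytes have to be on disk already by then; the cap just bounds how much gets written for very large transfers.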

On Tue, Oct 4, 2016 at 10:33 AM, erik clark <philosnef at gmail.com> wrote:

> Hm, good point. Is there somewhere in the analysis framework where you can
> say, if a file is above x bytes, kill the analysis process? I ask because
> I see this as somewhat related to the GridFTP problem at LBL. If we have
> large tarballs or zip files or whatever crossing the wire, killing those
> off at, say, a 5 GB point or so seems reasonable. As you mentioned, that is
> quite a lot of memory being consumed by extraction. :/
> On Tue, Oct 4, 2016 at 10:21 AM, Seth Hall <seth at icir.org> wrote:
>> > On Oct 4, 2016, at 8:47 AM, erik clark <philosnef at gmail.com> wrote:
>> >
>> > Can't you simply write a script that calls file extract at a later
>> > date? I would think to hook it into file intel, which runs after the file
>> > analysis (it's comparing hashes), and extract at that point, not before...
>>
>> I've been thinking about some potential directions we could go that might
>> open the door to doing this in some cases for the next release. For now,
>> though, imagine that your file is 10 GB.  We can't keep that much data in
>> memory, but you don't know the file hash until you've seen every byte of
>> that file.  You can't choose to extract the file at the end because all of
>> the content for that file is already gone.  You'd have to extract it up
>> front and make the decision to keep it or delete it after the fact.
>>   .Seth
>> --
>> Seth Hall
>> International Computer Science Institute
>> (Bro) because everyone has a network
>> http://www.bro.org/