[Zeek] File extraction package

Justin Azoff justin at corelight.com
Wed Apr 29 10:02:53 PDT 2020

On Mon, Apr 27, 2020 at 5:10 PM Kayode Enwerem
<Kayode_Enwerem at ao.uscourts.gov> wrote:
> Hello,
> We are trying to do some customization to the file extraction package https://github.com/hosom/file-extraction
> Does anyone have any suggestions on how I can get any of these done?
> Is there a way to define which networks you want the “file extracting package” to extract files from, instead of extracting files from all the networks defined in networks.cfg? Example: if I have 7 subnets defined in networks.cfg but I only want the file extracting package to extract files from 2 out of the 7.

Yes, just make a set[subnet] and add the networks you want to it. The
networks.cfg file just auto-generates one for you called Site::local_nets.
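
A rough sketch of that idea, not tested against the package: extract_nets
and the two subnets in it are made-up examples, and it assumes that a break
in the package's ignore hook means "skip this file" and that the package is
already loaded (for example via "@load packages").

```zeek
# Hypothetical set of networks to extract from; redef it in local.zeek.
const extract_nets: set[subnet] = { 10.1.0.0/16, 192.168.5.0/24 } &redef;

hook FileExtraction::ignore(f: fa_file, meta: fa_metadata)
    {
    # Without connection info there is nothing to match against.
    if ( ! f?$conns )
        return;

    # Use a flag: a break inside the for loop would only leave the loop,
    # not the hook.
    local wanted = F;

    for ( cid in f$conns )
        {
        if ( cid$orig_h in extract_nets || cid$resp_h in extract_nets )
            wanted = T;
        }

    # Assumption: breaking out of the ignore hook skips the file.
    if ( ! wanted )
        break;
    }
```

This matches either endpoint of the connection; check only cid$orig_h or
cid$resp_h if direction matters.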

> Is there a way to dedup the extracted files? Example: if a file was sent to 20 people, I only want to see the file 1 time instead of 20 times.

The easiest way to do this part is to just name the file after its hash, so
that identical copies end up under one name, but you could also track
recent files with a set[string].
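
A minimal sketch of the set[string] approach, under a few assumptions:
seen_hashes and the one-day expiry are made up for illustration, hashing is
enabled via hash-all-files, and the extracted path is relative to
FileExtract::prefix. The hash is only known once the file is complete, so
this removes the duplicate after it has been written. In a cluster each
worker keeps its own set, so this only dedups per worker.

```zeek
@load base/files/extract
@load base/utils/paths
@load frameworks/files/hash-all-files

# Hypothetical tracking set; entries expire a day after being added.
global seen_hashes: set[string] &create_expire=1day;

event file_state_remove(f: fa_file)
    {
    # Only look at files that were both extracted and hashed.
    if ( ! f?$info || ! f$info?$extracted || ! f$info?$sha1 )
        return;

    if ( f$info$sha1 !in seen_hashes )
        {
        # First time this content is seen: remember it and keep the file.
        add seen_hashes[f$info$sha1];
        return;
        }

    # Duplicate content: remove the copy that was just written.
    local path = build_path_compressed(FileExtract::prefix, f$info$extracted);

    if ( ! unlink(path) )
        Reporter::warning(fmt("could not remove duplicate %s", path));
    }
```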

> We would also like to exclude certain file types based on whether they come via SMB. Example: rather than excluding all .pdf files, I just want to exclude .pdf files coming via SMB.

If you look at how the plugins in that package are written, they are
just small scripts containing an if statement, so you would just need
something like:

const pdf_types: set[string] = { "application/pdf" };

hook FileExtraction::extract(f: fa_file, meta: fa_metadata) &priority=5
    {
    if ( f$source != "SMB" && meta$mime_type in pdf_types )
        break;
    }

Or keep extracting all PDFs and ignore the ones that come from SMB:

hook FileExtraction::ignore(f: fa_file, meta: fa_metadata)
    {
    if ( f$source == "SMB" && meta$mime_type in pdf_types )
        break;
    }

