[Zeek] File extraction package

Justin Azoff justin at corelight.com
Wed Apr 29 10:02:53 PDT 2020


On Mon, Apr 27, 2020 at 5:10 PM Kayode Enwerem
<Kayode_Enwerem at ao.uscourts.gov> wrote:
>
> Hello,
>
> We are trying to do some customization to the file extraction package https://github.com/hosom/file-extraction
>
> Does anyone have any suggestions on how I can get any of these done?
>
> Is there a way to define which networks the “file extracting package” extracts files from, instead of extracting files from all the networks defined in networks.cfg? Example: if I have 7 subnets defined in networks.cfg but I only want the file extracting package to extract files from 2 out of the 7.

Yes, just define a set[subnet] and add the networks you want to it.
networks.cfg just auto-generates one for you called Site::local_nets.
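
As a rough, untested sketch of that idea: the set name extract_nets and
the two subnets below are made up for illustration, and this assumes the
package is already loaded and that breaking out of its ignore hook
suppresses extraction (as in the SMB example further down). A file is
kept only if one of its connections has an endpoint in the set.

```zeek
module FileExtraction;

# Hypothetical set of networks to extract files from; replace the
# placeholder subnets with your own.
const extract_nets: set[subnet] = {
    10.1.0.0/16,
    192.168.20.0/24,
} &redef;

hook FileExtraction::ignore(f: fa_file, meta: fa_metadata)
    {
    if ( f?$conns )
        {
        for ( cid in f$conns )
            {
            # Leave this handler without breaking, so the file stays
            # eligible for extraction.
            if ( cid$orig_h in extract_nets || cid$resp_h in extract_nets )
                return;
            }
        }

    # No endpoint matched (or no connection is known): ignore the file.
    break;
    }
```

Checking both orig_h and resp_h means a file counts if either side of
the transfer is in one of the chosen networks; drop one of the two
checks if you only care about one direction.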

> Is there a way to dedup the extracted files? Example: If a file was sent to 20 people, I only want to see the file 1 time instead of 20 times.

The easiest way to do this is to name each extracted file after its
hash, but you could also track recently seen files with a set[string].
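
A rough, untested sketch of the set[string] variant follows. The hash is
only known once the whole file has been seen, so this checks at
file_state_remove and deletes the extracted copy if the same SHA1 was
already seen. The names seen_hashes and the one-day expiry are made up
for illustration, and it assumes files are being hashed (here via the
hash-all-files policy script) and that the extracted file lives under
FileExtract::prefix. On a cluster the set is per worker, so duplicates
seen by different workers would not be caught.

```zeek
@load frameworks/files/hash-all-files

# Hypothetical table of recently seen hashes; entries expire after a day.
global seen_hashes: set[string] &create_expire=1 day;

event file_state_remove(f: fa_file)
    {
    # Only consider files that were both extracted and hashed.
    if ( ! f?$info || ! f$info?$extracted || ! f$info?$sha1 )
        return;

    if ( f$info$sha1 in seen_hashes )
        {
        # Duplicate content: remove the copy that was just extracted.
        local path = build_path_compressed(FileExtract::prefix,
                                           f$info$extracted);
        unlink(path);
        }
    else
        add seen_hashes[f$info$sha1];
    }
```

Note that files.log will still list the extracted name for the removed
duplicates, since the log entry is written independently of this
handler.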

> We would also like to exclude certain file types when they come via SMB. Example: I don't want to exclude all .pdf files, just the .pdf files coming via SMB.

If you look at how the plugins in that package are written, they are
just small scripts containing an if statement:

https://github.com/hosom/file-extraction/blob/master/scripts/plugins/extract-pdf.zeek

so you would just need something like

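# MIME types treated as PDFs.
const pdf_types: set[string] = { "application/pdf" };

hook FileExtraction::extract(f: fa_file, meta: fa_metadata) &priority=5
{
    # Breaking out of the extract hook selects the file for extraction,
    # so this extracts PDFs from every source except SMB.
    if ( f$source != "SMB" && meta$mime_type in pdf_types )
        break;
}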

Or keep extracting all PDFs and ignore the ones that come from SMB:

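hook FileExtraction::ignore(f: fa_file, meta: fa_metadata)
{
    # Breaking out of the ignore hook skips extraction for this file.
    # pdf_types is the same set defined in the snippet above.
    if ( f$source == "SMB" && meta$mime_type in pdf_types )
        break;
}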

-- 
Justin


