[Bro-Dev] [JIRA] (BIT-1140) Bloomfilter hashing problem

Aashish Sharma (JIRA) jira at bro-tracker.atlassian.net
Tue Apr 1 13:13:08 PDT 2014


    [ https://bro-tracker.atlassian.net/browse/BIT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010#comment-16010 ] 

Aashish Sharma commented on BIT-1140:
-------------------------------------

Matthias, 

I have created two simple test files. Both of them add a set of URLs to a bloomfilter.

The scripts then call bloomfilter_lookup on a *different* set of URLs.

You should notice two problems:
1) URLs that were never added to the filter show up as being in the filter (bloomfilter_lookup returns 1).
2) The result is inconsistent across runs: for the same URL, bloomfilter_lookup sometimes returns 0 and sometimes 1.

The URLs added are the ones extracted from SMTP traffic, while the URLs looked up come from the HTTP stream. Basically, I am building a bloomfilter of all the URLs extracted from emails and then testing HTTP requests against it to see whether any of the SMTP URLs "has been clicked". (Currently I use a table, which gives me correct results but has a much bigger memory footprint.)
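
As a point of comparison, the expected behaviour can be sketched in Python rather than Bro script. This is not Bro's implementation: the class, the sizing formulas, the SHA-256 based double hashing, and the example.com / example.org URLs are all assumptions made for illustration. The sketch only shows that, with a fixed hash, looking up the same URLs against the same added set gives the same answers on every run, and that added URLs are always found.

```python
import hashlib
import math


class BloomFilter:
    """Toy textbook Bloom filter (an assumption; NOT Bro's implementation)."""

    def __init__(self, capacity, fp_rate):
        # Standard sizing: m cells and k hash functions for the target rate.
        self.m = max(1, math.ceil(-capacity * math.log(fp_rate) / (math.log(2) ** 2)))
        self.k = max(1, round(self.m / capacity * math.log(2)))
        self.bits = bytearray(self.m)  # one byte per cell, for simplicity

    def _positions(self, item):
        # hashlib digests are stable across processes (unlike Python's
        # per-process salted hash()), so an item always maps to the same cells.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def lookup(self, item):
        # Mirrors the 0 = miss, 1 = hit convention used in this report.
        return 1 if all(self.bits[p] for p in self._positions(item)) else 0


def build(urls, capacity=1000, fp_rate=0.001):
    bf = BloomFilter(capacity, fp_rate)
    for u in urls:
        bf.add(u)
    return bf


def run(added, looked_up):
    bf = build(added)
    return [bf.lookup(u) for u in looked_up]


# Hypothetical stand-ins for the SMTP-extracted and HTTP-observed URLs.
added = ["http://example.com/mail/%d" % i for i in range(100)]
looked_up = ["http://example.org/web/%d" % i for i in range(100)]

first = run(added, looked_up)
second = run(added, looked_up)
print("repeatable:", first == second)
print("hits on never-added URLs:", sum(first))
```

A table (here, a Python set) built from the same added URLs can serve as ground truth: any URL for which lookup returns 1 but which is not in the set is a false positive.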

With the bloomfilter, we see quite a few false positives.
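
For comparison, some false positives are expected from any bloomfilter. The standard estimate, for a filter with m cells, k hash functions, and n added URLs (symbols introduced here, not taken from the test scripts), is

    p \approx \left(1 - e^{-kn/m}\right)^{k}

A rate well above the value the filter was configured for, or one that changes between runs over identical input, would suggest a problem beyond normal bloomfilter behaviour.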

Here are two examples: 

1) bloom-test-short.bro - only looks up 4 URLs. On repeated runs (bro ./bloom-test-short.bro) you should see different outputs for the lookups (0 = miss, 1 = hit), even though none of the URLs being looked up were added to the filter.
2) bloom-test2.bro - has a much more extensive lookup set. On each run the lookup results are 0 or 1, and they vary from run to run. Again, all the lookup URLs are different from the ones added.

Please let me know if you have problems reproducing this. I can send you the actual smtp-embedded-url.bro scripts as well. 




> Bloomfilter hashing problem
> ---------------------------
>
>                 Key: BIT-1140
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1140
>             Project: Bro Issue Tracker
>          Issue Type: Problem
>          Components: Bro
>            Reporter: Robin Sommer
>            Assignee: Matthias Vallentin
>             Fix For: 2.3
>
>         Attachments: bloom-test2.bro, bloom-test-short.bro
>
>
> It seems bloomfilter hashing isn't working correctly. Has that been confirmed? Is there a fix?



--
This message was sent by Atlassian JIRA
(v6.3-OD-01-067#6307)

