[Bro-Dev] [JIRA] (BIT-1140) Bloomfilter hashing problem
Aashish Sharma (JIRA)
jira at bro-tracker.atlassian.net
Tue Apr 1 13:13:08 PDT 2014
[ https://bro-tracker.atlassian.net/browse/BIT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010#comment-16010 ]
Aashish Sharma commented on BIT-1140:
-------------------------------------
Matthias,
I have created two simple test files. Both of them add a set of URLs to a bloomfilter.
The scripts then do a bloomfilter_lookup on a *different* set of URLs.
You should notice two problems:
1) URLs which were never added to the filter show up as being in the filter (bloomfilter_lookup returns 1)
2) The return value is inconsistent across multiple runs (sometimes it is 0, sometimes 1)
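For readers who want to see the two symptoms in isolation, here is a minimal sketch of a generic Bloom filter in Python. It is not Bro's implementation: the class, the `seed` parameter, the filter dimensions, and the example.com/example.org URLs are all hypothetical. It only illustrates two general properties: items that were added always look up as 1, while items that were never added can also return 1, and which of them do so depends on how the hash functions are seeded. Whether seeding plays any role in the behaviour reported here is not established by this message.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; `seed` is a hypothetical stand-in for a hash seed."""

    def __init__(self, num_bits, num_hashes, seed=""):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.seed = seed
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            data = f"{self.seed}:{i}:{item}".encode("utf-8")
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def lookup(self, item):
        # 1 means "possibly present", 0 means "definitely absent".
        return int(all(self.bits[pos] for pos in self._positions(item)))


def run(seed):
    """Add one made-up URL set, then look up both it and a disjoint set."""
    added = [f"http://mail.example.com/link/{i}" for i in range(200)]
    probes = [f"http://web.example.org/page/{i}" for i in range(200)]
    bf = BloomFilter(num_bits=1024, num_hashes=3, seed=seed)
    for url in added:
        bf.add(url)
    return [bf.lookup(u) for u in added], [bf.lookup(u) for u in probes]


if __name__ == "__main__":
    for seed in ("seed-a", "seed-b"):
        hits_added, hits_probes = run(seed)
        print(seed, "added hits:", sum(hits_added),
              "false positives:", sum(hits_probes))
```

With a fixed seed the lookup results are identical on every run; with a different seed the added URLs still all return 1, but the subset of never-added URLs that return 1 typically changes.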
The URLs added are the ones extracted from SMTP traffic, while the URLs looked up come from the HTTP stream. Basically, I am building a bloomfilter of all the URLs extracted from emails and then testing it against HTTP to see if any of the SMTP URLs "has been clicked". (Currently I use a table, which gives me correct results but with a much bigger memory footprint.)
With the bloomfilter, we see quite a few false positives.
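As a rough yardstick for how many false positives are expected by design, the textbook Bloom filter sizing formulas can be sketched as below. The function names are hypothetical and the numbers are illustrative only; nothing here is taken from the attached scripts or from Bro's internals. A filter sized for a given capacity and target rate should stay near that rate until more items than the capacity are added, after which the rate climbs quickly.

```python
import math


def bloom_parameters(capacity, fp_rate):
    """Return (bits, hashes) from the standard sizing formulas."""
    # m = -n * ln(p) / (ln 2)^2, rounded up to whole bits.
    bits = math.ceil(-capacity * math.log(fp_rate) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to the nearest whole hash function.
    hashes = max(1, round(bits / capacity * math.log(2)))
    return bits, hashes


def expected_fp_rate(bits, hashes, inserted):
    """Approximate false-positive rate after `inserted` additions."""
    return (1.0 - math.exp(-hashes * inserted / bits)) ** hashes


if __name__ == "__main__":
    bits, hashes = bloom_parameters(1000, 0.01)
    print("bits:", bits, "hashes:", hashes)
    for n in (500, 1000, 5000):
        print(n, "items ->", expected_fp_rate(bits, hashes, n))
```

If the observed false-positive rate is far above what these formulas predict for the configured capacity, that points at something other than ordinary Bloom filter behaviour.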
Here are two examples:
1) bloom-test-short.bro - only does lookups for 4 URLs. On repeated runs (bro ./bloom-test-short.bro) you should see different outputs for the hits (0 = miss, 1 = hit), even though the URLs we are looking up were never added to the filter.
2) bloom-test2.bro - has a much more extensive lookup set. On each run you should see lookup results of 0 or 1, and they vary between runs. Again, all the lookup URLs are different from the ones added.
Please let me know if you have problems reproducing this. I can send you the actual smtp-embedded-url.bro scripts as well.
> Bloomfilter hashing problem
> ---------------------------
>
> Key: BIT-1140
> URL: https://bro-tracker.atlassian.net/browse/BIT-1140
> Project: Bro Issue Tracker
> Issue Type: Problem
> Components: Bro
> Reporter: Robin Sommer
> Assignee: Matthias Vallentin
> Fix For: 2.3
>
> Attachments: bloom-test2.bro, bloom-test-short.bro
>
>
> It seems bloomfilter hashing isn't working correctly. Has that been confirmed? Is there a fix?
--
This message was sent by Atlassian JIRA
(v6.3-OD-01-067#6307)