[Bro-Dev] [JIRA] (BIT-1143) Investigate replacing libmagic w/ signatures for file identification

Jon Siwek (JIRA) jira at bro-tracker.atlassian.net
Fri Feb 21 07:57:37 PST 2014


    [ https://bro-tracker.atlassian.net/browse/BIT-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15577#comment-15577 ] 

Jon Siwek commented on BIT-1143:
--------------------------------

{quote}
Can this be (semi-)automated, i.e., converting the magic mime db into
Bro regular expressions?
{quote}

That's the first approach I'd like to try, either with a script that parses the mime db files or by instrumenting libmagic itself to dump Bro sigs.
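
As a rough illustration of the script route, here is a minimal sketch that handles only top-level `string` tests at offset 0 that are followed by a `!:mime` annotation. The function names (`magic_to_sigs`, `render_sig`) are hypothetical, and the `file-mime` / `file-magic` keywords in the output are placeholders assumed for this sketch rather than syntax settled in this thread. Magic-file escape sequences and nested (`>`) tests are deliberately ignored.

```python
import re


def render_sig(mime, literal):
    """Render one signature block (output syntax is an assumed placeholder)."""
    # Derive a signature name from the MIME type, e.g. "file-application-pdf".
    name = "file-" + re.sub(r"[^a-z0-9]+", "-", mime.lower()).strip("-")
    # Backslash-escape every non-alphanumeric character so the literal
    # cannot be read as a regex operator or as the "/" delimiter.
    pattern = "".join(c if c.isalnum() else "\\" + c for c in literal)
    return 'signature %s {\n    file-mime "%s"\n    file-magic /^%s/\n}' % (
        name, mime, pattern)


def magic_to_sigs(magic_text):
    """Convert offset-0 `string` tests that carry a `!:mime` line."""
    sigs = []
    pending = None  # literal of the most recent offset-0 string test
    for line in magic_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("!:mime"):
            parts = line.split(None, 1)
            if pending is not None and len(parts) == 2:
                sigs.append(render_sig(parts[1].strip(), pending))
            pending = None
            continue
        if line.startswith(">"):
            continue  # nested test: not handled in this sketch
        fields = line.split(None, 3)
        if len(fields) >= 3 and fields[0] == "0" and fields[1] == "string":
            pending = fields[2]
        else:
            pending = None  # some other test type; skip it
    return sigs
```

Anything the sketch skips (numeric tests, non-zero offsets, nested tests) would still need manual conversion or a more complete parser.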

{quote}
Also, we should investigate performance: Bro's signature engine
doesn't have a reputation for being the fastest in the world.  Hard
to predict how it performs compared to libmagic; but then I also don't
know if it mattered much if the file type detection got slower.
{quote}

Yeah; I planned to measure.  Hopefully it's better: at a glance, libmagic's matching process looked iterative, so I think its performance will degrade with the number of rules; Bro's signature engine differs in that regard, right?

If it's worse, then I think at least it will be worse by a predictable/consistent amount rather than being bound to libmagic's performance characteristics (which, it seems, can vary between libmagic releases as well as with the magic db content).
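
To make the scaling question concrete, here is a toy comparison of per-rule iteration against a single pass over the input. It reflects neither libmagic's nor Bro's actual internals: the rules are plain byte prefixes, a trie stands in for a compiled matcher, and all names are hypothetical. Both functions return the label of the longest matching prefix.

```python
def match_iterative(rules, data):
    """Check every (prefix, label) rule; work grows with the rule count."""
    best = None
    for prefix, label in rules:
        if data.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, label)
    return best[1] if best else None


def build_trie(rules):
    """Merge all prefixes into one trie keyed by byte value."""
    root = {}
    for prefix, label in rules:
        node = root
        for byte in prefix:
            node = node.setdefault(byte, {})
        # The None key marks the end of a rule; the first rule wins on duplicates.
        node.setdefault(None, label)
    return root


def match_trie(trie, data):
    """Walk the input once; work is bounded by len(data), not the rule count."""
    node, label = trie, trie.get(None)
    for byte in data:
        node = node.get(byte)
        if node is None:
            break
        label = node.get(None, label)  # remember the longest match so far
    return label
```

Timing the two functions over a growing rule set would be one simple way to see whether the expected difference shows up in practice.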

{quote}
One more caveat, something I actually didn't think about so far: the
signature engine has some dependencies on connection state, not sure
if using files as the analysis units goes without pain.
{quote}

Yeah, the signatures are coupled with connections/analyzers (I think that's why we punted on this idea last year).  Currently, I'm looking into how much pain it is to wedge a different form of file signature into the engine.  I think I understand how it might be done, though it's a bit hacky (in the sense that the signature engine wasn't originally designed to accommodate this type of usage), and it might do extra work that's not really necessary for "disconnected" matching.

Rather than wedging it into the existing "signature" engine, another idea would be to create a new "magic" engine that parallels it.  Though, I'd expect there are some aspects of the signature engine where re-using the existing code would be better than re-inventing/copying it.

> Investigate replacing libmagic w/ signatures for file identification
> --------------------------------------------------------------------
>
>                 Key: BIT-1143
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1143
>             Project: Bro Issue Tracker
>          Issue Type: New Feature
>          Components: Bro
>    Affects Versions: git/master
>            Reporter: Jon Siwek
>            Assignee: Jon Siwek
>             Fix For: 2.3
>
>
> I think it makes sense to try to make the switch from libmagic to using Bro's own signature engine for file identification before the next release.  I don't want people getting used to the magic file format for their own custom file identification rules.




