[Bro-Dev] [JIRA] (BIT-1215) bro-cut should be rewritten in C for speed and to not depend on gawk

Justin Azoff (JIRA) jira at bro-tracker.atlassian.net
Thu Jul 10 15:27:07 PDT 2014

    [ https://bro-tracker.atlassian.net/browse/BIT-1215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107#comment-17107 ] 

Justin Azoff commented on BIT-1215:

I think starting with a 1M buffer and doubling it with realloc as needed is the way to go after all.  We need (and already have) the check to see whether fgets truncated the line.

The only other thing to do would be to add an absolute maximum line length of 64M or so, to handle the case where someone accidentally runs bro-cut against a binary file (like a compressed bro log) that just doesn't contain any newlines.
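
For illustration, here is a rough sketch of that approach. It is not the actual bro-cut code: the helper name read_full_line, the INITIAL_CAP and MAX_CAP constants, and the return-value convention are all assumptions made up for this example. It starts at 1M, detects truncation by checking whether fgets filled the buffer without reaching a newline, doubles with realloc, and gives up once the buffer is full at the maximum size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INITIAL_CAP ((size_t)1 << 20)   /* 1M starting buffer */
#define MAX_CAP     ((size_t)64 << 20)  /* 64M hard limit (assumed value) */

/* Hypothetical helper, not taken from bro-cut.
 * Reads one full line into *buf, doubling the buffer whenever fgets
 * truncates. The caller starts with *buf == NULL and *cap == 0 and
 * frees *buf when done.
 * Returns 1 if a line was read, 0 at EOF, and -1 on allocation failure
 * or when the buffer is full at max_cap and no newline has been seen.
 * Caveat: embedded NUL bytes make strlen under-count, so binary input
 * may be split into short "lines" rather than rejected. */
static int read_full_line(FILE *fp, char **buf, size_t *cap, size_t max_cap)
{
    size_t len = 0;

    if (*buf == NULL) {
        *cap = INITIAL_CAP < max_cap ? INITIAL_CAP : max_cap;
        *buf = malloc(*cap);
        if (*buf == NULL)
            return -1;
    }

    for (;;) {
        /* Append the next chunk after whatever has been read so far. */
        if (fgets(*buf + len, (int)(*cap - len), fp) == NULL)
            return len > 0 ? 1 : 0;  /* EOF: keep a final unterminated line */

        len += strlen(*buf + len);

        /* A trailing newline means fgets did not truncate the line. */
        if (len > 0 && (*buf)[len - 1] == '\n')
            return 1;

        /* Buffer not full: fgets stopped at EOF, not at the size limit. */
        if (len + 1 < *cap)
            return 1;

        /* Truncated: double the buffer, unless already at the limit. */
        if (*cap >= max_cap)
            return -1;
        size_t new_cap = *cap * 2 > max_cap ? max_cap : *cap * 2;
        char *tmp = realloc(*buf, new_cap);
        if (tmp == NULL)
            return -1;
        *buf = tmp;
        *cap = new_cap;
    }
}
```

A caller would loop on read_full_line(fp, &buf, &cap, MAX_CAP) until it returns 0, and treat -1 as "this is probably not a text log". Reusing the same buffer across calls means the realloc cost is only paid for the longest line seen so far.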

> bro-cut should be rewritten in C for speed and to not depend on gawk
> --------------------------------------------------------------------
>                 Key: BIT-1215
>                 URL: https://bro-tracker.atlassian.net/browse/BIT-1215
>             Project: Bro Issue Tracker
>          Issue Type: Improvement
>          Components: Bro, bro-aux
>            Reporter: Daniel Thayer
>             Fix For: 2.4
> The current implementation of bro-cut is too slow when processing large log files (takes more than a minute to process a single log file a few hundred MB in size).  Justin Azoff rewrote bro-cut in C and found that it runs an order of magnitude faster.  Another benefit of a C version of bro-cut is that we will no longer depend on gawk for anything (and some of Bro's supported platforms do not include gawk by default).

This message was sent by Atlassian JIRA
