Scott Campbell scampbell at lbl.gov
Mon Jan 22 21:31:28 PST 2018

I have been using the input framework with great success as a tool to 
read and parse structured text logs.  Unfortunately I have reached a 
performance impasse and am looking for a little advice.

The data source is a log file that grows at ~7-9k records/sec and 
consists of small text lines of < 512 bytes, newline delimited.

The primary symptom is a steadily growing memory footprint, even 
though the back-end analyzer seems to be processing the events in near 
real time, i.e. there is obviously some buffering going on but the data 
is being consumed.  The footprint of script-side variables is not to 
blame, as it is always << 1% of the total.

I tried modifying Raw::block_size to better fit the line size, but that 
made the memory growth worse.  Increasing it to 16k seemed to be the 
sweet spot, but the problem is still there.
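
For a sense of scale, here is a rough back-of-the-envelope sketch in Python using only the figures quoted above. It assumes every line is at the 512-byte ceiling (so the real byte rate is lower) and reads "16k" as 16 KiB; both are assumptions for illustration, and the function names are hypothetical, not measurements or part of any API.

```python
# Upper-bound arithmetic from the figures in this post.
MAX_LINE_BYTES = 512        # lines are stated to be < 512 bytes
BLOCK_BYTES = 16 * 1024     # assumption: "16k" means 16 KiB
RATES = (7_000, 9_000)      # records/sec, the two ends of "~7-9k"


def max_bytes_per_sec(records_per_sec, max_line_bytes=MAX_LINE_BYTES):
    """Upper bound on input rate if every record were at the size ceiling."""
    return records_per_sec * max_line_bytes


def min_lines_per_block(block_bytes=BLOCK_BYTES, max_line_bytes=MAX_LINE_BYTES):
    """Lower bound on how many lines' worth of data fit in one block."""
    return block_bytes // max_line_bytes


for rate in RATES:
    mib = max_bytes_per_sec(rate) / 2**20
    print(f"{rate} rec/s -> at most {mib:.2f} MiB/s")
# 7000 rec/s -> at most 3.42 MiB/s
# 9000 rec/s -> at most 4.39 MiB/s

print(f"16 KiB block -> at least {min_lines_per_block()} lines")
# 16 KiB block -> at least 32 lines
```

So the raw input is under roughly 4.4 MiB/s even in the worst case, which gives a yardstick for comparing against the observed rate of memory growth.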

Any thoughts on what might help here (besides lower data rates)?


