<div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I got the following idea while perusing non_cluster.bro SumStats::process_epoch_result</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">i=1;</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">while (i &lt;= 1000 &amp;&amp; |bar| &gt; 0)</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    {</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    for (foo in bar)</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">        {</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">        break;</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">        }</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    ...</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    process bar[foo]</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    ...</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    optional: baz[foo] = bar[foo] #If we need to preserve original data<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    delete bar[foo];<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    ++i;</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">    }</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">This will allow iteration thru the table as I originally desired, although destroying the original table.<br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div 
class="gmail_default" style="font-family:arial,helvetica,sans-serif">SumStats::process_epoch_result deletes the current item inside the for loop, so it relies on undefined behavior, per the documentation: &quot;Currently, modifying a container’s membership while iterating over it may
result in undefined behavior, so do not add or remove elements
inside the loop.&quot;  The above example avoids that.  Does anyone use sumstats outside of a cluster context?</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 5, 2018 at 6:04 PM, Jim Mellander <span dir="ltr">&lt;<a href="mailto:jmellander@lbl.gov" target="_blank">jmellander@lbl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Thanks, Jon:</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I&#39;ve decided to split the data (a table of IP addresses with statistics captured over a time period) based on a modulo calculation against the IP address (the important characteristic being that it can be done on the fly without an additional pass thru the table), which with an average distribution of traffic gives relatively equal size buckets, each of which can be processed during a single event, as I described.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I like the idea of co-routines - it would help to address issues like these in a more natural manner.</div><span class="HOEnZb"><font color="#888888"><div class="gmail_default" 
style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Jim</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jan 5, 2018 at 5:28 PM, Jon Siwek <span dir="ltr">&lt;<a href="mailto:jsiwek@corelight.com" target="_blank">jsiwek@corelight.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On Fri, Jan 5, 2018 at 2:19 PM, Jim Mellander &lt;<a href="mailto:jmellander@lbl.gov" target="_blank">jmellander@lbl.gov</a>&gt; wrote:<br>
<br>
&gt; I haven&#39;t checked whether my desired behavior works, but since it&#39;s not<br>
&gt; documented, I wouldn&#39;t want to rely on it in any event.<br>
<br>
</span>Yeah, I doubt the example you gave currently works -- it would just<br>
change the local value in the frame without modifying the internal<br>
iterator.<br>
<span><br>
&gt; I would be interested in hearing comments or suggestions on this issue.<br>
<br>
</span>What you want, the ability to split the processing of large data<br>
tables/sets over time, makes sense.  I&#39;ve probably also run into at<br>
least a couple of cases where I&#39;ve been concerned about how long it would<br>
take to iterate over a set/table and process all keys in one go.  The<br>
approach that comes to mind for doing that would be adding coroutines.<br>
Robin has some ongoing work on adding better support for async<br>
function calls, and I wonder if the way that&#39;s done would make it<br>
pretty simple to add general coroutine support as well.  E.g. stuff<br>
could look like:<br>
<br>
event process_stuff()<br>
    {<br>
    local num_processed = 0;<br>
<br>
    for ( local item in foo )<br>
        {<br>
        process_item(item);<br>
<br>
        if ( ++num_processed % 1000 == 0 )<br>
            yield;  # resume next time events get drained (e.g. next packet)<br>
        }<br>
    }<br>
<br>
There could also be other types of yield instructions, like &quot;yield 1<br>
second&quot; or &quot;yield wait_for_my_signal()&quot;, which would, respectively,<br>
resume after an arbitrary amount of time or when a custom function says it<br>
should.<br>
<span class="m_8372942921974891960HOEnZb"><font color="#888888"><br>
- Jon<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>