Hi, I am a Bro user and recently started using BinPAC, which is a subcomponent of Bro. After learning the BinPAC syntax, I wrote some
simple BinPAC programs and tested them. I ran into a problem during testing
that really confuses me, so I am writing in the hope of getting your
help. I describe the problem below. <br>
<p>
        <br>
</p>
<p>
        My BinPAC version is 0.47.
</p>
<p>
        <br>
</p>
<p>
        In the test I have two machines, A and B. A process on machine A
sends a message to a process on machine B once per second. The
message has this format:
</p>
<p>
        la (uint32) + lb (uint32) + s (a random string
whose length is not fixed)
</p>
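<p>
        To make the layout concrete, here is a minimal C++ sketch of how one such message could be built. It assumes big-endian length fields (matching the "&byteorder=bigendian" attribute in my .pac file); the helper names "put_u32_be" and "encode_message" are hypothetical and are not taken from my actual sender program.
</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Append a 32-bit value in big-endian (network) byte order.
static void put_u32_be(std::vector<uint8_t>& out, uint32_t v) {
    out.push_back(static_cast<uint8_t>(v >> 24));
    out.push_back(static_cast<uint8_t>(v >> 16));
    out.push_back(static_cast<uint8_t>(v >> 8));
    out.push_back(static_cast<uint8_t>(v));
}

// Hypothetical helper: build one message as la + lb + s,
// where la and lb both carry the length of s.
std::vector<uint8_t> encode_message(const std::string& s) {
    std::vector<uint8_t> out;
    const uint32_t len = static_cast<uint32_t>(s.size());
    put_u32_be(out, len);                       // la
    put_u32_be(out, len);                       // lb
    out.insert(out.end(), s.begin(), s.end());  // s
    assert(out.size() == 8 + s.size());         // 8 header bytes + payload
    return out;
}
```

<p>
        With this sketch, encode_message("helloworld") yields 18 bytes: two 4-byte lengths with value 10, followed by the 10 bytes of the string.
</p>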
<p>
        In the message, "la" and "lb" both hold the
length of the string "s". For example, la = 26, lb = 26, s =
"abcdefghijklmnopqrstuvwxyz". Another example: la = 10, lb = 10, s =
"helloworld". I wrote a BinPAC program (<span class="ke-content-forecolor" style="color:#E53333;">see file_1.pac in the
attachment</span>) and it worked as expected. But when I made a small change
to the BinPAC program (<span class="ke-content-forecolor" style="color:#E53333;">see file_2.pac in the attachment</span>), it stopped working.
</p>
<br>
In the first case, I defined a type "header" that contains "la :
uint32" and "lb : uint32", and another type "body" that
contains only "s : bytestring". I then defined a high-level type that
contains "header" and "body", and printed "la" and the length of "s". The program worked
properly; the output looks like this:<br>
<br>
238 238<br>
309 309<br>
311 311<br>
339 339<br>
344 344<br>
252 252<br>
290 290<br>
312 312<br>
298 298<br>
300 300<br>
281 281<br>
...<br>
<br>
That is what I want. The first number on each line is "la" and the second is the length of "s".<br>
<br>
<p>
        But when I did not define the "header" type and instead wrote "la : uint32 ; lb :
uint32" directly in the high-level type, it failed to work: nothing was printed.
</p>
<p>
        <br>
</p>
<p>
        In "file_2.pac", I wrote this:
</p>
<p>
        type trans_pdu(is_orig: bool) = record {<br>
la: uint32 &byteorder=bigendian;<br>
lb: uint32 &byteorder=bigendian;<br>
body: trans_body(x) &requires(x);<br>
} &let{<br>
x = la;<br>
} &length = x + 8;
</p>
<br>
I read the file generated by BinPAC ("file_2.cc") and added two lines to it for debugging
(the additional code just prints the buffering state,
the address of "t_begin_of_data", the address of "t_end_of_data", and
"throw"). The following is part of "file_2.cc". Only the two "printf" statements were added by me. <br>
<br>
<em>bool trans_pdu::ParseBuffer(flow_buffer_t t_flow_buffer, Contexttrans * t_context)</em><br>
<em> {</em><br>
<em> bool t_val_parsing_complete;</em><br>
<em> t_val_parsing_complete = false;</em><br>
<em> const_byteptr t_begin_of_data = t_flow_buffer->begin();</em><br>
<em> const_byteptr t_end_of_data = t_flow_buffer->end();</em><br>
<em> </em><span class="ke-content-forecolor" style="color:#E53333;"><em>
printf("buffering state: %d t_begin_of_data: %d t_end_of_data: %d \n"
, buffering_state_ , (void*)t_begin_of_data , (void*)</em></span><span class="ke-content-forecolor" style="color:#E53333;"><em>t_end_of_data);</em></span><br>
<em> switch ( buffering_state_ )</em><br>
<em> {</em><br>
<em> <span class="ke-content-forecolor" style="color:#E53333;">case 0:</span></em><br>
<em> if ( buffering_state_ == 0 )</em><br>
<em> {</em><br>
<em> <span class="ke-content-forecolor" style="color:#E53333;">t_flow_buffer-></span><span class="ke-content-forecolor" style="color:#E53333;">NewFrame(4, false)</span>;</em><br>
<em> buffering_state_ = 1;</em><br>
<em> }</em><br>
<em> buffering_state_ = 1;</em><br>
<em> break;</em><br>
<em> <span class="ke-content-forecolor" style="color:#E53333;">case 1:</span></em><br>
<em> {</em><br>
<em> buffering_state_ = 2;</em><br>
<em> <span class="ke-content-forecolor" style="color:#E53333;"> // Checking out-of-bound for "trans_pdu:lb"</span></em><br>
<em> if ( (t_begin_of_data + 4) + (4) > t_end_of_data || (t_begin_of_data + 4) + (4) < (t_begin_of_data + 4) )</em><br>
<em> {</em><br>
<em> </em><span class="ke-content-forecolor" style="color:#E53333;"> <em>printf("throw</em><em>\n");</em></span><br>
<em> // Handle out-of-bound condition</em><br>
<em> throw binpac::ExceptionOutOfBound("trans_pdu:lb",</em><br>
<em> (4) + (4), </em><br>
<em> (t_end_of_data) - (t_begin_of_data));</em><br>
<em> }</em><br>
<em> // Parse "la"</em><br>
<em> la_ = FixByteOrder(bigendian, *((uint32 const *) (t_begin_of_data)));</em><br>
<em> // Evaluate 'let' and 'withinput' fields</em><br>
<em> x_ = la();</em><br>
<em> t_flow_buffer->GrowFrame(x() + 8);</em><br>
<em> }</em><br>
<em> break;</em><br>
<em> case 2:</em><br>
<em> BINPAC_ASSERT(t_flow_buffer->ready());</em><br>
<em> if ( t_flow_buffer->ready() )</em><br>
<em> {</em><br>
<em> </em><br>
<em> // Parse "lb"</em><br>
<em> lb_ = FixByteOrder(bigendian, *((uint32 const *) ((t_begin_of_data + 4))));</em><br>
<em> // Evaluate 'let' and 'withinput' fields</em><br>
<em> </em><br>
<em> // Parse "body"</em><br>
<em> body_ = new trans_body(x());</em><br>
<em> int t_body__size;</em><br>
<em> t_body__size = body_->Parse((t_begin_of_data + 8), t_end_of_data);</em><br>
<em> // Evaluate 'let' and 'withinput' fields</em><br>
<em> </em><br>
<em> t_val_parsing_complete = true;</em><br>
<em> if ( t_val_parsing_complete )</em><br>
<em> {</em><br>
<em> // Evaluate 'let' and 'withinput' fields</em><br>
<em> proc_ = t_context->flow()->proc_sample_message(this);</em><br>
<em> }</em><br>
<em> BINPAC_ASSERT(t_val_parsing_complete);</em><br>
<em> buffering_state_ = 0;</em><br>
<em> }</em><br>
<em> break;</em><br>
<em> default:</em><br>
<em> BINPAC_ASSERT(buffering_state_ <= 2);</em><br>
<em> break;</em><br>
<em> }</em><br>
<em> return t_val_parsing_complete;</em><br>
<p>
        <em> }</em>
</p>
<p>
        <em><br>
</em>
</p>
<p>
        <em>void trans_flow::NewData(const_byteptr t_begin_of_data, const_byteptr t_end_of_data)</em>
</p>
<p>
        <em> {</em>
</p>
<p>
        <em> ......</em>
</p>
<p>
        <em> ......<br>
</em>
</p>
<p>
         while ( ! t_dataunit_parsing_complete && flow_buffer_->ready() )<br>
{<br>
const_byteptr t_begin_of_data = flow_buffer()->begin();<br>
const_byteptr t_end_of_data = flow_buffer()->end();<br>
t_dataunit_parsing_complete = dataunit_-><span class="ke-content-forecolor" style="color:#E53333;">ParseBuffer</span>(flow_buffer(), context_);<br>
if ( t_dataunit_parsing_complete )<br>
{<br>
// Evaluate 'let' and 'withinput' fields<br>
}<br>
}<br>
<em></em>
</p>
<p>
        <em> ......</em>
</p>
<p>
        <em> ......<br>
</em>
</p>
<p>
        <em> }<br>
</em>
</p>
<br>
The following is the output: <br>
<br>
buffering state: 0 t_begin_of_data: 44665856 t_end_of_data: 44665856 <br>
buffering state: 1 t_begin_of_data: 44665856 t_end_of_data: 44665860 <br>
throw<br>
buffering state: 0 t_begin_of_data: 44669328 t_end_of_data: 44669328 <br>
buffering state: 1 t_begin_of_data: 44669328 t_end_of_data: 44669332 <br>
throw<br>
buffering state: 0 t_begin_of_data: 44687568 t_end_of_data: 44687568 <br>
buffering state: 1 t_begin_of_data: 44687568 t_end_of_data: 44687572 <br>
throw<br>
buffering state: 0 t_begin_of_data: 44688176 t_end_of_data: 44688176 <br>
buffering state: 1 t_begin_of_data: 44688176 t_end_of_data: 44688180 <br>
throw<br>
...<br>
<p>
        <br>
</p>
<p>
        <br>
</p>
<p>
        I guess that in "case 0", the statement "<span class="ke-content-forecolor" style="color:#E53333;">t_flow_buffer->NewFrame(4, false)</span>" creates a 4-byte frame to parse "la", since "la" occupies
the first 4 bytes of the message and BinPAC needs it to evaluate the length
of the message? After this operation, "t_begin_of_data" points to
the beginning of the message and "t_end_of_data" points to the end of "la".
Then "case 1" runs and checks whether "lb" is
out of bounds. "t_begin_of_data + 4" is the beginning of "lb" and "4" is
the length of "lb", but "t_end_of_data" still points to the end of "la".
So the condition <span class="ke-content-forecolor" style="color:#E53333;">"(t_begin_of_data + 4) + (4) >
t_end_of_data"</span>
is true, the exception is thrown, and parsing does not continue. This matches the output above, where "t_end_of_data" is exactly 4 bytes past "t_begin_of_data" in state 1. However, that is only
the immediate cause, not the root cause. I guess the root cause
is either that I wrote the BinPAC code incorrectly somewhere or
that there is a small bug in BinPAC.
</p>
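<p>
        To double-check my reading of the bounds check, I reduced it to a small standalone C++ sketch. It is only my simplified model, not BinPAC code: the function name "lb_out_of_bound" is hypothetical, and it uses byte offsets instead of raw pointers, so the pointer-overflow half of the generated condition is not modelled.
</p>

```cpp
#include <cstddef>

// Simplified model of the generated check for "trans_pdu:lb":
//   (t_begin_of_data + 4) + (4) > t_end_of_data
// frame_len stands for t_end_of_data - t_begin_of_data.
// Returns true when the generated code would throw ExceptionOutOfBound.
bool lb_out_of_bound(std::size_t frame_len) {
    const std::size_t lb_offset = 4;  // "lb" starts right after "la"
    const std::size_t lb_size = 4;    // "lb" is a uint32
    return lb_offset + lb_size > frame_len;
}
```

<p>
        With the 4-byte frame seen in state 1, lb_out_of_bound(4) is true, which corresponds to the "throw" lines in the output. With a frame of at least 8 bytes the check would pass.
</p>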
<p>
        <br>
</p>
<p>
        I would really like to use BinPAC
for protocol analysis. I tried to
read the BinPAC source code but could not understand it. I do not know
what to do next and would greatly appreciate your help!
</p>
<p>
        I look forward to your reply!
</p>
<p>
        <br>
</p>
<p>
        Good luck!
</p>
<p>
        Yunchao Chen
</p>
<p>
        2018.2.27
</p>