[Zeek-Dev] Proposal: Improve Zeek's log-writing system with batch support and better status reporting

Bob Murphy bob.murphy at corelight.com
Thu Jul 16 17:15:38 PDT 2020


>> For batching, I was thinking of having a way to send back a std::vector of structs that would be something like this:
>> 
>> struct failure_info {
>>    uint32_t index_in_batch;
>>    uint16_t failure_type;
>>    uint16_t recovery_suggestion;
>> };
> 
> This is almost starting to sound a bit more complicated than is worth it.  We may need to discuss this a bit more to figure out something simpler.  The immediate problem that springs to mind is that as a developer, I don't think I'd have any clue what failure_types and recovery_suggestions could be common among export destinations.

Seth and I were talking today, and came up with something like this:
struct failure_info {
    uint32_t first_index;
    uint16_t index_count;
    uint16_t failure_type;
};

Here’s how it would work:

1. The batch writing function would return a std::vector of these. If the entire batch wrote successfully, the vector would be empty.

2. The failure_type value would still indicate generally what happened, with predefined values indicating things like “network failure”, “protocol error”, “unable to write to disk”, or “unspecified failure”. Seth thought we’d be likely to start out with about ten values like this. Using a 16-bit value for this provides lots of room for expansion :-) and maintains reasonable alignment within the struct.

3. first_index and index_count would specify a range. That way, if several successive log records aren’t sent for the same reason, that could be represented by a single struct, instead of a different struct for each one.
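
To make the range idea concrete, here is a rough sketch of how a writer might build up the returned vector. The enum names and values and the note_failure helper are made up for illustration; nothing here is an agreed-upon API. It assumes the writer reports failed indices in increasing order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical failure categories. The proposal only says there would be
// roughly ten predefined values; these names and numbers are placeholders.
enum failure_type_code : uint16_t {
    FAILURE_UNSPECIFIED = 0,
    FAILURE_NETWORK = 1,
    FAILURE_PROTOCOL = 2,
    FAILURE_DISK_WRITE = 3,
};

struct failure_info {
    uint32_t first_index;
    uint16_t index_count;
    uint16_t failure_type;
};

// Hypothetical helper: record that the log record at `index` failed with
// `type`. If it directly follows the last recorded range and failed for the
// same reason, that range is extended instead of adding a new struct.
void note_failure(std::vector<failure_info>& failures, uint32_t index,
                  uint16_t type) {
    if (!failures.empty()) {
        failure_info& last = failures.back();
        // Widen to 64 bits so the sum cannot wrap around.
        const uint64_t next = uint64_t{last.first_index} + last.index_count;
        assert(index >= next);  // indices must arrive in increasing order
        if (last.failure_type == type && next == index &&
            last.index_count < UINT16_MAX) {
            ++last.index_count;
            return;
        }
    }
    failures.push_back({index, 1, type});
}
```

With this helper, reporting records 3, 4, and 5 as network failures and record 9 as a disk-write failure yields two structs: one covering indices 3 through 5 and one covering index 9 alone. A batch with no failures leaves the vector empty.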

This drops the recovery suggestion.

The sizes of the struct fields are currently set to pack nicely into eight bytes, with no wasted space either within the struct or between structs in an array. We could make the fields different sizes, though.
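
If we want the compiler to enforce the packing claim, something like the following compile-time checks could sit next to the struct. This is only a sketch: it assumes a mainstream ABI where uint32_t is 4-byte aligned, and the checks would need revisiting if the field sizes change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct failure_info {
    uint32_t first_index;
    uint16_t index_count;
    uint16_t failure_type;
};

// 4 + 2 + 2 bytes, with no padding inside the struct.
static_assert(sizeof(failure_info) == 8,
              "failure_info should pack into eight bytes");
static_assert(offsetof(failure_info, index_count) == 4,
              "no padding expected after first_index");
static_assert(offsetof(failure_info, failure_type) == 6,
              "no padding expected after index_count");

// Array elements are laid out sizeof(failure_info) apart, so an array of
// four takes exactly 32 bytes when the struct itself is 8 bytes.
static_assert(sizeof(failure_info[4]) == 4 * 8,
              "no padding expected between array elements");
```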

