[Zeek-Dev] Hi + LL Analyzer

Jan Grashöfer jan.grashoefer at gmail.com
Wed Feb 27 07:07:36 PST 2019


On 26/02/2019 02:36, Robin Sommer wrote:
> I see three pieces here overall that I think can be tackled
> independently:
> 
> (1) Link-layer: Currently hardcoded in Packet::ProcessLayer2()
> 
> (2) IP-Layer: Currently hardcoded in NetSessions::NextPacket()
> 
> (3) Transport-layer: Currently hardcoded in NetSessions::DoNextPacket().

At first glance it looks like IP-layer multiplexing is done in
NetSessions::{NextPacket, DoNextPacket}, while the Transport-layer is
handled in Manager::BuildInitialAnalyzerTree in the context of
initializing a connection.

> Case (1) is all about skipping the header to get to IP. There's some
> redundancy across cases, though, and MPLS makes it all more messy.

One thing that comes to mind here is whether it might be possible to
pass information such as VLAN tags, MPLS labels or link-layer addresses
to upper layers in a generic way, without hardcoding each field.
However, that might be out of scope for now.
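
A rough sketch of what "generic" could mean here: lower layers attach
named attributes to a packet and upper layers look them up by name. All
names (PacketMeta, Set, Get, the "vlan" key) are hypothetical and not
part of the existing code base.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical container for per-packet link-layer metadata.
// Not an existing Bro/Zeek class; for illustration only.
class PacketMeta {
public:
    // Called by a link-layer analyzer, e.g. Set("vlan", 42).
    void Set(const std::string& key, std::uint64_t value) {
        attrs_[key] = value;
    }

    // Called by upper layers; returns std::nullopt if the attribute
    // was never set for this packet.
    std::optional<std::uint64_t> Get(const std::string& key) const {
        auto it = attrs_.find(key);
        if (it == attrs_.end())
            return std::nullopt;
        return it->second;
    }

private:
    std::map<std::string, std::uint64_t> attrs_;
};
```

A string-keyed map is easy to extend from plugins but costs a lookup
per access, so a real design would likely want something cheaper on the
per-packet path.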

> With (2), a plugin would be able to add support for non-IP protocols.
> However, due to Bro generally assuming that it is analyzing IP, the
> plugin would either need to take care of such packets completely (like
> ARP does), or eventually get to an IP packet that it can then feed
> back for further analysis (like if it some kind of a tunnel).

The non-IP packet might also contain a Transport-layer PDU. I guess it 
should be possible to pass these on as well.

> There's also a more general version of (2) and (3) where we'd remove
> Bro's assumption of analyzing TCP/IP protocols. But that's a separate,
> large effort by itself.

That is the central point. So a first step would be to rely on TCP/IP in 
the "middle" of the stack but allow pluggable Link-layer protocols. 
Those might feed their data to the TCP/IP pipeline or handle them on 
their own. The next step would be the IP-layer.

> On a technical level, plugging in such low-level analyzers needs to be
> very efficient, in particular if we move the currently hardcoded cases
> into the plugins as well (as I think we should; similar to how
> application-layer analyzers have all moved into internal plugins).
> Then the lookup-the-analyzer-and-dispatch operation will happen
> multiple times for every packet.

One question here would be whether it makes sense to assume that the
set of LL-analyzers that should be available is known at compile-time.
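
To illustrate why that matters for efficiency: if the set of analyzers
is fixed before packet processing starts, the per-packet lookup can be
a plain array index rather than a hash or string lookup. The sketch
below assumes a small integer link type as the key; LLDispatcher,
LLHandler, PacketView and SkipEthernet are made-up names, not existing
code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical view of the raw packet bytes.
struct PacketView {
    const std::uint8_t* data;
    std::size_t len;
};

// A handler returns the number of link-layer header bytes to skip,
// or 0 if it cannot handle the packet.
using LLHandler = std::size_t (*)(const PacketView&);

constexpr std::size_t kMaxLinkTypes = 256;  // assumed upper bound

class LLDispatcher {
public:
    // Done once at startup (or from plugin initialization).
    bool Register(std::size_t link_type, LLHandler handler) {
        if (link_type >= kMaxLinkTypes || handler == nullptr)
            return false;
        table_[link_type] = handler;
        return true;
    }

    // Done for every packet: one bounds check and one array index.
    std::size_t Dispatch(std::size_t link_type, const PacketView& pkt) const {
        if (link_type >= kMaxLinkTypes)
            return 0;
        LLHandler handler = table_[link_type];
        return handler ? handler(pkt) : 0;
    }

private:
    std::array<LLHandler, kMaxLinkTypes> table_{};  // all nullptr initially
};

// Example handler: an untagged Ethernet header is 14 bytes long.
inline std::size_t SkipEthernet(const PacketView& pkt) {
    return pkt.len >= 14 ? 14 : 0;
}
```

With libpcap's DLT_EN10MB (value 1) as the key, registration would be
Register(1, SkipEthernet). Whether the table is filled at compile-time
or at startup makes little difference for the per-packet cost, as long
as it is fixed before processing begins.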

>> - What about the concept of connections? For some LL protocols the
>> concept might be counterintuitive.
> 
> Couple cases there:
> 
> - If there's really no sense of a connection, then the plugin will
>    need to take complete care of the packets, as the rest of Bro
>    assumes connection-semantics.

Maybe there is another general abstraction that is worth supporting as
well. I was thinking of request-reply pairs that can be correlated.
However, I haven't put much thought into this yet.

> - If it's just the definition of what defines a connection that is
>    different, then I think we could make that more flexible. I've been
>    hoping for a while that we can make Bro's notion of connection IDs
>    dynamic, so that it's not necessarily just the 5-tuple. There are
>    use cases outside of new protocols for this, too. For example, one
>    could include the VLAN ID to deal with overlapping IP ranges in
>    independent VLANs.

I think this would be part of the larger effort to re-think Zeek's
notion of connections. This could be addressed together with
implementing a flexible mechanism to make metadata like LL-addresses
available in the context of a connection.
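
As a minimal sketch of the VLAN example from the quoted text: a
connection key that carries an optional extra component next to the
5-tuple, so that identical 5-tuples in different VLANs map to different
connections. ConnKey and ConnKeyHash are hypothetical names, and the
key is IPv4-only for brevity; this is not how connection IDs are
currently implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <tuple>
#include <unordered_map>

// Hypothetical connection key: 5-tuple plus an optional VLAN ID.
struct ConnKey {
    std::uint32_t src_ip;
    std::uint32_t dst_ip;
    std::uint16_t src_port;
    std::uint16_t dst_port;
    std::uint8_t proto;
    std::optional<std::uint16_t> vlan;  // unset means "not part of the key"

    bool operator==(const ConnKey& o) const {
        return std::tie(src_ip, dst_ip, src_port, dst_port, proto, vlan) ==
               std::tie(o.src_ip, o.dst_ip, o.src_port, o.dst_port, o.proto,
                        o.vlan);
    }
};

// Simple hash combiner; good enough for a sketch, not tuned.
struct ConnKeyHash {
    std::size_t operator()(const ConnKey& k) const {
        std::size_t h = 0;
        auto mix = [&h](std::size_t v) {
            h ^= v + 0x9e3779b9 + (h << 6) + (h >> 2);
        };
        mix(k.src_ip);
        mix(k.dst_ip);
        mix(k.src_port);
        mix(k.dst_port);
        mix(k.proto);
        // Offset set values so "no VLAN" and "VLAN 0" hash differently.
        mix(k.vlan ? 0x10000u + *k.vlan : 0u);
        return h;
    }
};

// Connection table keyed by the extended ID (value type is a placeholder).
using ConnTable = std::unordered_map<ConnKey, int, ConnKeyHash>;
```

A fully dynamic ID would need a variable set of components rather than
one fixed optional field, but the lookup structure could stay the same.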

>> - The interface should support to pass payload to other analyzers. Does
>> it make sense to come up with a generalized DPD-mechanism?
> 
> Not quite sure what you're thinking here, but I believe that fully
> solving this would require addressing Bro's overall assumption of
> analyzing TCP/IP. For now, maybe the best way would be just having the
> analyzer call back into entry points corresponding to the various
> layers where analysis would then proceed as normal. I.e., some
> variation of: ProcessLinkLayer(...), ProcessIP(...),
> ProcessTransport(data), ProcessAppLayer(...). The caller would be
> responsible for providing all the right (meta-)data, like IP headers.
> Were you thinking something different / more general?

While I haven't looked into it in detail, I noticed that there are
distinct PIA implementations for TCP and UDP. If we allow plugging in
new transport protocols, they might need their own PIA to support the
analysis of known protocols like HTTP. However, if we keep the focus on
TCP/IP as suggested, that would be out of scope for now.

Jan

