[Zeek-Dev] support for event handlers using a subset of parameters

Jon Siwek jsiwek at corelight.com
Wed Feb 20 09:56:20 PST 2019


On Tue, Feb 19, 2019 at 1:27 PM Vern Paxson <vern at corelight.com> wrote:

> > Which particular user errors were a concern?
>
> The main one being when there's an API change that's *not* backward
> compatible, and the user doesn't update their scripts to be in conformance
> with it as is required.  Clearly something we'll in general strive to avoid.

That's more of a developer error than a user error -- we can make
changes that are not backward compatible regardless of what mechanism
we go with, and if we do, that's on us (and we do want to always try
to make changes that are backward compatible for at least one release
cycle).

> > What about changing
> > semantics of a parameter though?
>
> That's indeed trickier.  Here's a thought for the example you provide:
>
>         event foo%(is_server: bool%) &deprecated;
>         event foo%(is_client: bool%);
>
> This would mean "if the handler is defined using the name is_server as the
> parameter, that's the deprecated version".  Any other declaration (that's
> for a parameter of type bool, of course) would refer to the new version.

Now we are back to matching parameters on name, so again feels more
natural to me to instead just support that generally in the first
place.

Take another concrete example we've done in the past: change parameter order.

    event foo(a: bool, b: string);

Changes to:

    event foo(b: string, a: bool);

If we generally support matching params by name, then all existing
event handlers continue working, no user action needed.  Otherwise we
need to &deprecate and force users to do work to update their code.

> By the way, I didn't follow whether in Vlad's example, the semantics of
> the parameter changed, or if his fix was just to give it the correct name
> to match its actual semantics.  For the latter, existing handlers are
> presumably flat-out broken, and there's no benefit in trying to continue
> to support them.

In Vlad's example, the fix was to make "is_server" actually mean "is
server", but people already wrote code based on the real meaning being
"is client", so we broke code.  (It's a useful example both to learn
from and consider how we might do things differently if we had better
deprecation procedures/mechanisms).

> One concern I have with leaning on is-the-name-a-match is not-too-implausible
> user coding along the lines of:
>
>         event foo(is_server_orig: bool)
>                 {
>                 ...
>                 local is_server = my_twisty_logic(is_server_orig, x, y, z);
>                 ...
>                 }
>
> i.e., the parameter being renamed by the user anyway because they want to
> use the original name for a local variable.  These users will be bitten
> by changes in parameter semantics regardless of which approach we use;
> name-based matching isn't a panacea.

They wouldn't be bitten if we just don't ever change parameter
semantics like that, but use a different deprecation process like I
mentioned.  E.g. we start with:

    event foo(is_server: bool);

They use:

    event foo(is_server_orig: bool);

We change to:

    event foo(is_server: bool &deprecated, is_client: bool);

Now they get a deprecation warning because we still fall back to
matching params by order+type if there's no valid name mappings.

The case where they *could* get silently bitten by changes in
parameter semantics is if we are actually swapping parameter order,
but they happen to swap params of the same type and the user defined
custom parameter names.

The fix there is either we just don't ever do such ambiguous parameter
order swapping or we say that the user is opting into that potential
pain for the purely cosmetic reason of custom param names.  (Or we
rely purely on name-based param matching without falling back to
traditional order+type matching.  I added the fallback just because it
seems generally worthwhile to still support that if people *really*
want that, or if they are using custom names already).

> > The other problem is that if a user is skipping a version, they may
> > end up with a handler that treats "is_server" in the original way, but
> > the meaning has been changed and we only match events based on type +
> > number of parameters.  With name-based parameter matching, we can
> > catch this scenario as well.
>
> With the tweak I outline above, this would only bite them once the &deprecated
> version is removed.  That could span several releases, depending on the
> particular situation.  I don't have a lot of sympathy for users who upgrade
> across a number of releases, for which the release notes flag backward
> compatibility issues, and they don't take heed to address those.

I don't sympathize much either, but still though -- if we've got a
solution that additionally makes this a non-issue, then that's a Good.

> I find an approach that changes the fundamental type-checking used for
> event handlers to be more heavy-handed than one that provides control to
> the developer to explicitly decide when they've made a change for which
> it makes sense to support a deprecated version.

There's two points here that I don't think were necessarily related,
so taking them separately:

* My proposed patch has a more fundamental change to how event handler
parameters get type-checked.

True, but not sure that is implicitly a Bad.  It's maybe a Good if it
fits the requirements better (and I still think it does :))

* The two proposals (yours vs. mine) have a difference in the amount
of control they grant a developer.

Need some clarification here.  In my proposal, the dev has just as
much control as they always had, plus they get to choose how to
deprecate individual event parameters.  So that's maybe more control
than deprecating entire event signatures?

Or what specifically was the difference in developer control that's a concern?

Generally I was thinking the mission statement should first emphasize
"don't annoy users", then "give developers enough options to make
decisions that will annoy users the least".

> If Zeek had originally used name-based parameters, then I'd be fine with
> this being the solution.  However, switching from positional to name-based
> strikes me as a conceptually deep change for a relatively modest problem.

Slight disagreement on scale of problem -- having solid deprecation
procedures/mechanisms is something that's desirable for any software
expecting a large user-base.  Optimizing for least time-wastage and
deprecation periods that give users flexibility in choosing when to
address problems is a Good.


- Jon


More information about the zeek-dev mailing list