[Bro-Dev] Package manager meta data

Siwek, Jon jsiwek at illinois.edu
Sat Oct 29 10:01:31 PDT 2016


> On Oct 28, 2016, at 5:52 PM, Jan Grashöfer <jan.grashoefer at gmail.com> wrote:
> 
> Correct me if I am wrong
> but bro-pkg.meta contains stuff like script_dir and dependencies (so
> rather technically), whereas bro-pkg.index contains the descriptive
> information like info text and tags (which is metadata, too, one could
> even argue it's "more meta" than script_dir etc.).

That’s right.  The way I was thinking about how it’s split up is: if the metadata is related to how users will search for and discover new packages, then put it in bro-pkg.index.  Otherwise it’s likely related to how the package will interoperate with bro, bro-pkg, other packages, etc., and that goes in bro-pkg.meta.
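
A minimal sketch of that rule of thumb, assuming the metadata is available as a flat dict. The field names (`description`, `tags`, `script_dir`, `depends`) are illustrative only, based on the examples mentioned in this thread, and `split_metadata` is a hypothetical helper, not part of bro-pkg:

```python
# Hypothetical set of fields that exist to help users find a package.
# Everything else is assumed to describe how the package interoperates
# with bro, bro-pkg, or other packages.
DISCOVERY_FIELDS = {"description", "tags"}


def split_metadata(fields):
    """Split a flat dict of package metadata into (index, meta) dicts."""
    index = {k: v for k, v in fields.items() if k in DISCOVERY_FIELDS}
    meta = {k: v for k, v in fields.items() if k not in DISCOVERY_FIELDS}
    return index, meta
```

With this split, anything a search would match on ends up in the first dict and everything else in the second.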

> I think the most desirable solution would be to have a
> single file to put the meta data in, so that a package is completely
> self-describing. This would also allow to provide different descriptions
> for different versions.

Yes, I also think each package maintaining just its own, single metadata file is better.  It also means that if the package author ever registered their package with multiple sources, they don’t have to maintain the same bro-pkg.index in multiple places.

I don’t remember if we just settled on the current implementation because it was quick/easy or because there were objections to other, more complicated technical solutions.

> Regarding the technical solution, I'll try to sum up: Using a
> distributed structure implies that important information is distributed,
> too. I think the first question is, where to aggregate the information?
> One could either maintain a cache in every client or integrate it into
> the list of packages aka the public repository

Aggregating it into the package source is a better solution than having every client do it.  The latter isn’t going to scale well:  the client will take longer and longer over time as more and more packages get registered to a source.  It also takes longer as a function of the total number of release versions a package has, because we are collecting metadata for each version.  I’d rather not ask users to just get used to developing more patience over time.

> The second question would be, whether and how to synchronize the
> information? If the info is part of the repository this can be either
> done manually (more or less the overriding solution of the current
> implementation, assuming that the developers keep meta data in sync) or
> automatically (e.g., by a script that fetches meta data of packages once
> a day).

I’d opt for a daily cron job to aggregate metadata into package sources.
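
A rough sketch of what such an aggregation step might look like, under a few assumptions: `fetch_metadata` is a hypothetical callable that returns a dict of metadata for one package URL, and JSON is used only as a stand-in for whatever format the aggregated file would really have.

```python
import json


def aggregate_metadata(package_urls, fetch_metadata):
    """Collect the metadata of every registered package into one dict.

    fetch_metadata(url) is a hypothetical callable supplied by the caller;
    how it obtains the metadata (e.g. by cloning the package) is not shown.
    """
    aggregated = {}
    for url in package_urls:
        aggregated[url] = fetch_metadata(url)
    return aggregated


def render(aggregated):
    """Serialize with sorted keys so the output is stable between runs."""
    return json.dumps(aggregated, indent=2, sort_keys=True) + "\n"
```

Keeping the output deterministic means a daily run would only produce a diff in the package source when some package's metadata actually changed.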

> If the cache is part of the client, this could be done based on
> an expiration threshold or intentionally by the user (similar to dnf).
> Finally one could drop the requirement of synced package and repository
> meta data, risking to confuse the users. In that case the information
> contained in the package should be used whenever possible (e.g., the
> info command for a not installed package could obtain the most recent
> information from the package's git repo).

It’s not a problem for the metadata to be out of sync for a day since only the “search” command is going to be using the aggregated data.  Other commands would have direct access to accurate metadata since they’ve already cloned the package locally.

It would also be trivial to give users access to the aggregation tool if they have a problem with potentially using day-old metadata in their searches and are prepared to wait however long the aggregation process takes.

E.g. we add this command/flag: `bro-pkg refresh --aggregate-metadata`

Then the only difference between the daily aggregation process and a user running that command is that the daily process also does a `git commit && git push` in the locally cloned package source that bro-pkg is using internally.
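
To illustrate that difference, here is a small sketch. The names `refresh_source`, `aggregate`, and `publish` are hypothetical and not existing bro-pkg API. The `run` parameter defaults to `subprocess.check_call` and is only there so the git steps can be swapped out when trying the function without a real repository.

```python
import subprocess


def refresh_source(source_dir, aggregate, publish=False, run=subprocess.check_call):
    """Re-aggregate metadata in a locally cloned package source.

    aggregate(source_dir) is a hypothetical callable that rewrites the
    aggregated metadata inside the clone.  A user would call this with
    publish=False; the daily job would call it with publish=True.
    """
    aggregate(source_dir)
    if publish:
        # Stage the regenerated file(s) before committing.
        run(["git", "add", "-A"], cwd=source_dir)
        # Note: git commit exits non-zero when there is nothing to commit,
        # so a real job would need to handle that case.
        run(["git", "commit", "-m", "Update aggregated metadata"], cwd=source_dir)
        run(["git", "push"], cwd=source_dir)
```

Both callers share the same aggregation code path, so a user's local result matches what the daily job would publish.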

> Another question: Now that repositories only contain bro-pkg.index files
> with links instead of submodules, how are deleted/unavailable packages
> detected/removed?

At the moment, they’d have to be removed manually whenever someone notices or reports it.

If we switch to automated metadata aggregation, removal of nonexistent packages could naturally be a part of that.
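
As a sketch of how that could fall out of the aggregation run, assuming a hypothetical `is_reachable(url)` check (for example, whether the package's git repo can still be cloned):

```python
def prune_unreachable(package_urls, is_reachable):
    """Partition registered package URLs into (kept, removed) lists.

    is_reachable(url) is a hypothetical callable returning True when the
    package can still be fetched.  Input order is preserved in both lists.
    """
    kept, removed = [], []
    for url in package_urls:
        if is_reachable(url):
            kept.append(url)
        else:
            removed.append(url)
    return kept, removed
```

A real job would probably not want to drop a package after a single failed fetch, since the failure could be a temporary network problem; that policy is left out of this sketch.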

- Jon


