Gentoo Archives: gentoo-dev

From: Aisha Tammy <gentoo.dev@×××××.cc>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] New project: binhost
Date: Wed, 10 Feb 2021 19:16:21
Message-Id: e8888b0f-4884-e4e1-e00a-9f8e02b0f33e@aisha.cc
In Reply to: Re: [gentoo-dev] New project: binhost by Rich Freeman
1 On 2/10/21 2:11 PM, Rich Freeman wrote:
2 > On Wed, Feb 10, 2021 at 12:57 PM Andreas K. Hüttel <dilfridge@g.o> wrote:
3 >> * what portage features are still needed or need improvements (e.g. binpkg
4 >> signing and verification)
5 >> * what should hosting look like
6 > Some ideas for portage enhancements:
7 >
8 > 1. Ability to fetch binary packages from some kind of repo.
9 > 2. Ability to have multiple binary packages co-exist in a repo (local
10 > or remote) with different build attributes (arch, USE, CFLAGS,
11 > DEPENDS, whatever).
12 > 3. Ability to pick the most appropriate binary packages to use based
13 > on user preferences (with a mix of hard and soft preferences).
14 >
15 > One idea I've had around how #2-3 might be implemented is:
16 > 1. Binary packages already contain data on how they were built (USE
17 > flags, dependencies, etc). Place this in a file using a deterministic
18 > sorting/etc order so that two builds with the same settings will have
19 > the same results.
20 This is provided by FEATURES="binpkg-multi-instance"
21 (or maybe I misread).
22 > 2. Generate a hash of the file contents - this can go in the filename
23 > so that the file can co-exist with other files, and be located
24 > assuming you have a full matching set of metadata.
25 > 3. Start dropping attributes from the file based on a list of
26 > priorities and generate additional hashes. Create symlinks to the
27 > original file named by these hashes (overwriting existing symlinks or
28 > not, depending on policy). This allows the binary package to be found
29 > using either an exact set of attributes or a subset of higher-priority
30 > attributes. This is analogous to shared object symlinking.
31 > 4. The package manager will look for a binary package first using the
32 > user's full config, and then by dropping optional elements of the
33 > config (so maybe it does the search without CFLAGS, then without USE
34 > flags). Eventually it aborts based on user prefs (maybe the user only
35 > wants an exact match, or is willing to accept alternate CFLAGS but not
36 > USE flags, or maybe accepts anything for the arch).
37 > 5. As always the final selected binary package still gets evaluated
38 > like any other binary package to ensure it is usable.
39 >
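To make steps 1-4 above concrete, here is a rough sketch of how the hash names could be derived. Everything in it is an assumption for illustration: the attribute names, the DROP_ORDER priority list, the JSON canonical form, and the truncated SHA-256 are placeholders, not anything portage does today.

```python
import hashlib
import json

# Hypothetical priority list: attributes are dropped in this order,
# least important first. ARCH is deliberately never listed here.
DROP_ORDER = ["CFLAGS", "USE"]


def canonicalize(meta):
    """Return a copy of meta with list-like values sorted, so that
    ordering differences between builds do not change the hash."""
    return {
        key: sorted(value) if isinstance(value, (list, set, tuple)) else value
        for key, value in meta.items()
    }


def metadata_hash(meta):
    """Hash a deterministic serialization of the build metadata."""
    canonical = json.dumps(meta, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def lookup_keys(meta):
    """Return hashes from most specific to least specific.

    keys[0] covers the full metadata; each later key drops one more
    attribute from DROP_ORDER.
    """
    current = canonicalize(meta)
    keys = [metadata_hash(current)]
    for attr in DROP_ORDER:
        if attr in current:
            current = {k: v for k, v in current.items() if k != attr}
            keys.append(metadata_hash(current))
    return keys


build = {"ARCH": "amd64", "USE": ["ssl", "ipv6"], "CFLAGS": "-O2 -pipe"}
keys = lookup_keys(build)
print("package file name hash:", keys[0])
print("symlink name hashes:   ", keys[1:])
```

On the build side, keys[0] would name the package file and keys[1:] would name the symlinks. On the client side, the same function applied to the user's config gives the names to try, in order.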
40 > Such a system can identify whether a potentially usable file exists
41 > using only filename, cutting down on fetching. In the interests of
42 > avoiding useless fetches we would only carry step 3 reasonably far -
43 > packages would have to match based on architecture and any dynamic
44 > linking requirements. So we wouldn't generate hashes that didn't
45 > include at least those minimums, and the package manager wouldn't
46 > search for them.
47 >
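The filename-only lookup with a cutoff could look roughly like the sketch below. It assumes the client already has a list of candidate hashes ordered from exact match to loosest acceptable match, plus the set of hashes present in a directory listing. The function name and the max_fallback policy knob are made up for illustration.

```python
def find_binpkg(candidate_keys, available, max_fallback):
    """Return (key, level) for the first candidate hash that exists.

    candidate_keys: hashes ordered from exact match to loosest match.
    available:      set of hashes seen in the repo's file listing.
    max_fallback:   how many attributes the user allows to be dropped
                    (0 means exact match only).
    Returns None when nothing within the policy is available.
    """
    allowed = candidate_keys[: max_fallback + 1]
    for level, key in enumerate(allowed):
        if key in available:
            return key, level
    return None


# Hypothetical hashes: exact, without CFLAGS, without CFLAGS and USE.
candidates = ["aaa111", "bbb222", "ccc333"]
listing = {"bbb222", "ccc333", "zzz999"}

print(find_binpkg(candidates, listing, max_fallback=0))  # exact only
print(find_binpkg(candidates, listing, max_fallback=1))  # alternate CFLAGS ok
```

Nothing is fetched until a name matches, and the selected package would still go through the normal usability checks afterwards.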
48 > Obviously you could do more (if you have 5 combinations of use flags,
49 > look for the set that matches most closely). That couldn't be done
50 > using hashes alone in an efficient way. You could have a small
51 > manifest file alongside each binary package that can be fetched
52 > separately, so the package manager can narrow things down by hash
53 > and then fetch a few manifests to narrow it down further.
54 >
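For the "closest set of USE flags" case, a sketch of what the package manager might do once it has fetched a few manifests. The manifest shape (package name mapped to its enabled USE flags) and the distance metric (number of flags that differ) are both assumptions. A real implementation would presumably weight flags by user preference.

```python
def closest_use_match(wanted, manifests):
    """Pick the build whose enabled USE flags differ least from wanted.

    wanted:    iterable of USE flags the user has enabled.
    manifests: dict mapping a build name to its enabled USE flags.
    Ties are broken by build name so the result is deterministic.
    Returns None if there are no manifests.
    """
    if not manifests:
        return None
    wanted = frozenset(wanted)

    def distance(item):
        name, flags = item
        # Symmetric difference counts flags enabled on one side only.
        return (len(wanted ^ frozenset(flags)), name)

    return min(manifests.items(), key=distance)[0]


# Hypothetical manifests for three builds of the same package.
manifests = {
    "build-a": ["ssl", "ipv6", "X"],
    "build-b": ["ssl"],
    "build-c": ["ipv6", "gtk", "qt5"],
}
print(closest_use_match(["ssl", "ipv6"], manifests))
```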
55 > Or you could skip the hash searching and just fetch all the manifests
56 > for a particular package/arch and just search all of those, but that
57 > is more data to transfer just to do a query. A metadata cache of some
58 > kind might be another solution. Content hashes would probably
59 > still be useful just to allow co-existence of alternate builds.
60 >