Gentoo Archives: gentoo-dev

From: Rich Freeman <rich0@g.o>
To: Zac Medico <zmedico@g.o>
Cc: gentoo-dev <gentoo-dev@l.g.o>, binhost@g.o
Subject: Re: [gentoo-dev] New project: binhost
Date: Sun, 14 Feb 2021 13:30:57
Message-Id: CAGfcS_==dFEcSSc_3ZE78uH+GRmkSZVQHpgHp2BJmM1Hqc6v9A@mail.gmail.com
In Reply to: Re: [gentoo-dev] New project: binhost by Zac Medico
1 On Sat, Feb 13, 2021 at 8:51 PM Zac Medico <zmedico@g.o> wrote:
2 >
3 > > 2. Generate a hash of the file contents - this can go in the filename
4 > > so that the file can co-exist with other files, and be located
5 > > assuming you have a full matching set of metadata.
6 >
7 > For FEATURES=binpkg-multi-instance we currently use an integer BUILD_ID
8 > to ensure that file names are unique.
9 >
10 > > 3. Start dropping attributes from the file based on a list of
11 > > priorities and generate additional hashes. Create symlinked files to
12 > > the original file using these hashes (overwriting or not existing
13 > > symlinks based on policy). This allows the binary package to be found
14 > > using either an exact set of attributes or a subset of higher-priority
15 > > attributes. This is analogous to shared object symlinking.
16 > > 4. The package manager will look for a binary package first using the
17 > > user's full config, and then by dropping optional elements of the
18 > > config (so maybe it does the search without CFLAGs, then without USE
19 > > flags). Eventually it aborts based on user prefs (maybe the user only
20 > > wants an exact match, or is willing to accept alternate CFLAGs but not
21 > > USE flags, or maybe anything for the arch is selected).
   > > 5. As always the final selected binary package still gets evaluated
22 > > like any other binary package to ensure it is usable.
23 > >
24 > > Such a system can identify whether a potentially usable file exists
25 > > using only filename, cutting down on fetching. In the interests of
26 > > avoiding useless fetches we would only carry step 3 reasonably far -
27 > > packages would have to match based on architecture and any dynamic
28 > > linking requirements. So we wouldn't generate hashes that didn't
29 > > include at least those minimums, and the package manager wouldn't
30 > > search for them.
31 > >
32 > > Obviously you could do more (if you have 5 combinations of use flags,
33 > > look for the set that matches most closely). That couldn't be done
34 > > using hashes alone in an efficient way. You could have a small
35 > > manifest file alongside the binary package that could be fetched
36 > > separately if the package manager wants to narrow things down and
37 > > fetch a few of those to narrow it down further.
38 >
39 > All of the above is oriented toward multi-profile binhosts, so we'll
40 > have to do a cost/benefit analysis to determine whether it's worth the
41 > effort to introduce the complexity that multi-profile binhosts add.
42
43 The hash label on the filenames was also designed with
44 multi-profiles in mind. I figured that if you're going to be building
45 variants of packages you'd want to parallelize and hashes work better
46 for that. Plus at least in concept you could potentially identify and
47 fetch files by hash using info already in the local repo without
48 having to sync additional metadata from the binhost. User-contributed
49 binaries would also work better in such a world though for obvious
50 security issues that might just take the form of local user-generated
51 repos (allowing users to supplement the upstream repo with local
52 builds for a cluster, without having to mirror/reproduce the entire
53 upstream).
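
To make the quoted steps 2-4 a bit more concrete, here is a rough sketch of how hash-labelled filenames with fallback lookup might work. Everything in it is an assumption for illustration, not anything Portage does today: the hash is taken over a dict of build attributes, the attribute names (ARCH, USE, CFLAGS), the drop order, the ".binpkg" suffix, and the function names are all hypothetical, and a plain dict stands in for the directory of files and symlinks.

```python
import hashlib
import json

# Hypothetical priority list: optional attributes, dropped in this order.
# ARCH is never dropped, so no hash is generated without it.
DROP_ORDER = ["CFLAGS", "USE"]


def attr_hash(attrs):
    """Stable short hash over a set of build attributes."""
    blob = json.dumps(attrs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:16]


def candidate_hashes(attrs):
    """Return hashes from most specific (all attributes) to least specific."""
    current = dict(attrs)
    hashes = [attr_hash(current)]
    for key in DROP_ORDER:
        if key in current:
            del current[key]
            hashes.append(attr_hash(current))
    return hashes


def publish(index, pkg, attrs, overwrite=False):
    """Register a package under its full hash plus 'symlinks' for each subset.

    `index` maps filename -> real filename, standing in for a directory
    of files and symlinks on the binhost.
    """
    hashes = candidate_hashes(attrs)
    real = f"{pkg}-{hashes[0]}.binpkg"
    index[real] = real
    for h in hashes[1:]:
        link = f"{pkg}-{h}.binpkg"
        # Policy choice: overwrite an existing symlink or leave it alone.
        if overwrite or link not in index:
            index[link] = real
    return real


def lookup(index, pkg, attrs, max_drops):
    """Try the exact match first, then drop up to `max_drops` attributes."""
    for h in candidate_hashes(attrs)[: max_drops + 1]:
        name = f"{pkg}-{h}.binpkg"
        if name in index:
            return index[name]
    return None
```

With max_drops=0 the user only accepts an exact match; max_drops=1 tolerates different CFLAGS; max_drops=2 accepts anything built for the same ARCH. The client only needs to compute filenames locally and probe for them, which is the "no additional metadata sync" property mentioned above.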
54
55 I do get that multi-profiles aren't entirely an essential feature, but
56 when you consider stuff like X11 support or stable/unstable it seems
57 like we're probably going to have to provide at least a few variants
58 on packages for this to be practical. You could just put each profile
59 in a separate repo, but then anything that doesn't actually change
60 across profiles gets built multiple times. The hash-based solution is
61 also a form of deduping.
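
As a rough illustration of the deduping point, the sketch below assumes each package is hashed only over the attributes it actually responds to (here ARCH plus the intersection of the profile's USE flags with the package's own flag list). The profile names, flag sets, ".binpkg" suffix, and function names are hypothetical. A package that is unaffected by the differences between profiles ends up with one filename and so only needs one build.

```python
import hashlib
import json


def build_name(pkg, relevant_attrs):
    """Filename derived only from attributes that affect this package."""
    blob = json.dumps(relevant_attrs, sort_keys=True).encode()
    return f"{pkg}-{hashlib.sha256(blob).hexdigest()[:16]}.binpkg"


def plan_builds(profiles, pkg_flags):
    """Map each distinct build filename to the profiles that can share it.

    profiles:  profile name -> {"ARCH": str, "USE": set of enabled flags}
    pkg_flags: package name -> set of USE flags the package responds to
    """
    builds = {}
    for profile_name, profile in profiles.items():
        for pkg, flags in pkg_flags.items():
            attrs = {
                "ARCH": profile["ARCH"],
                # Flags the package ignores do not influence the hash.
                "USE": sorted(profile["USE"] & flags),
            }
            builds.setdefault(build_name(pkg, attrs), []).append(profile_name)
    return builds


profiles = {
    "desktop": {"ARCH": "amd64", "USE": {"X", "gtk"}},
    "server": {"ARCH": "amd64", "USE": set()},
}
pkg_flags = {"vim": {"X"}, "sed": set()}
builds = plan_builds(profiles, pkg_flags)
```

Here vim gets two builds (with and without X) while sed gets a single build shared by both profiles, instead of being built once per profile repo.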
62
63 But, hey, it is great to see anything like this being done at all.
64 Walking before running isn't a bad thing!
65
66 --
67 Rich