On 2/10/21 11:11 AM, Rich Freeman wrote:
> On Wed, Feb 10, 2021 at 12:57 PM Andreas K. Hüttel <dilfridge@g.o> wrote:
>>
>> * what portage features are still needed or need improvements (e.g. binpkg
>> signing and verification)
>> * how should hosting look like
>
> Some ideas for portage enhancements:
>
> 1. Ability to fetch binary packages from some kind of repo.
|
The old PORTAGE_BINHOST functionality has been replaced with a
binrepos.conf file that's very similar to repos.conf:

https://bugs.gentoo.org/668334
|
It doesn't have explicit support for multiple local binary package
repositories yet, but somebody got it working with sync-uri set to a
file:/ URI, as described in the comments on this bug:

https://bugs.gentoo.org/768957
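
As a rough illustration of what such a setup might look like, here is a
small Python sketch that parses an ini-style binrepos.conf with two local
repositories. The section names, paths, and priority values are made up,
and this is not Portage's own parser; it only assumes the file uses
repos.conf-like sections with priority and sync-uri keys.

```python
import configparser

# Hypothetical binrepos.conf contents; section names, paths, and
# priorities are invented for illustration only.
SAMPLE = """
[local-a]
priority = 10
sync-uri = file:///var/cache/binpkgs-a

[local-b]
priority = 20
sync-uri = file:///var/cache/binpkgs-b
"""


def load_binrepos(text):
    """Return (name, priority, sync_uri) tuples, highest priority first."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repos = [
        (
            name,
            parser.getint(name, "priority", fallback=0),
            parser.get(name, "sync-uri"),
        )
        for name in parser.sections()
    ]
    # Sorting by priority is an assumption made for this sketch.
    return sorted(repos, key=lambda repo: repo[1], reverse=True)


repos = load_binrepos(SAMPLE)
```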
|
> 2. Ability to have multiple binary packages co-exist in a repo (local
> or remote) with different build attributes (arch, USE, CFLAGS,
> DEPENDS, whatever).
|
We can enable FEATURES=binpkg-multi-instance by default now that
this bug is fixed:

https://bugs.gentoo.org/571126
|
> 3. Ability to pick the most appropriate binary packages to use based
> on user preferences (with a mix of hard and soft preferences).
|
Current package selection logic for binary packages is basically the
same as for ebuilds. These are the notable differences:

1) Binary packages are sorted in descending order by (version, mtime),
so that the most recent builds are preferred when the versions are identical.

2) The --binpkg-respect-use option rejects binary packages that would
need to be rebuilt in order to match local USE settings.
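
To make the two differences concrete, here is a simplified Python sketch.
It is not Portage's code: the BinPkg class, the tuple-based version, and
the exact-match USE check are stand-ins I made up for illustration (real
version comparison and USE matching are more involved).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinPkg:
    cpv: str
    version: tuple  # simplified stand-in for a real version comparison
    mtime: int      # build time, seconds since the epoch
    use: frozenset  # enabled USE flags recorded at build time


def sort_candidates(pkgs):
    """Sort descending by (version, mtime): newest build of newest version first."""
    return sorted(pkgs, key=lambda p: (p.version, p.mtime), reverse=True)


def select(pkgs, local_use, respect_use=True):
    """Return the first candidate that passes the (simplified) USE check."""
    for pkg in sort_candidates(pkgs):
        # Assumption: "would need a rebuild" is modelled as any USE mismatch.
        if not respect_use or pkg.use == local_use:
            return pkg
    return None
```

With respect_use=False the newest build always wins; with it enabled, an
older build whose USE flags match is picked over a newer mismatching one.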
|
> One idea I've had around how #2-3 might be implemented is:
> 1. Binary packages already contain data on how they were built (USE
> flags, dependencies, etc). Place this in a file using a deterministic
> sorting/etc order so that two builds with the same settings will have
> the same results.
|
This would only be needed for multi-profile binhosts that provide a
variety of configurations for the same package.

Features like this are not necessary if the binhost only intends to
provide packages for a single profile.
|
> 2. Generate a hash of the file contents - this can go in the filename
> so that the file can co-exist with other files, and be located
> assuming you have a full matching set of metadata.
|
For FEATURES=binpkg-multi-instance we currently use an integer BUILD_ID
to ensure that file names are unique.
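
For comparison, the hash-based naming proposed in the quoted text could
look roughly like the following Python sketch. Everything here is
hypothetical: the metadata keys, the use of JSON and SHA-256, the
truncated digest length, and the file name layout are my assumptions,
not anything Portage does today.

```python
import hashlib
import json


def canonicalize(metadata):
    """Sort list-like values so that ordering does not affect the digest."""
    return {
        key: sorted(value) if isinstance(value, (list, tuple, set, frozenset)) else value
        for key, value in metadata.items()
    }


def metadata_digest(metadata, length=12):
    """Hash a deterministic serialization of the build metadata."""
    canonical = json.dumps(canonicalize(metadata), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:length]


def hashed_name(pf, metadata):
    """Hypothetical file name: package name-version plus metadata digest."""
    return f"{pf}-{metadata_digest(metadata)}.xpak"
```

Two builds with the same settings map to the same name regardless of the
order in which the settings were recorded, while any differing attribute
yields a different name.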
|
> 3. Start dropping attributes from the file based on a list of
> priorities and generate additional hashes. Create symlinked files to
> the original file using these hashes (overwriting or not existing
> symlinks based on policy). This allows the binary package to be found
> using either an exact set of attributes or a subset of higher-priority
> attributes. This is analogous to shared object symlinking.
> 4. The package manager will look for a binary package first using the
> user's full config, and then by dropping optional elements of the
> config (so maybe it does the search without CFLAGs, then without USE
> flags). Eventually it aborts based on user prefs (maybe the user only
> wants an exact match, or is willing to accept alternate CFLAGs but not
> USE flags, or maybe anything for the arch is selected).
> 5. As always the final selected binary package still gets evaluated
> like any other binary package to ensure it is usable.
>
> Such a system can identify whether a potentially usable file exists
> using only filename, cutting down on fetching. In the interests of
> avoiding useless fetches we would only carry step 3 reasonably far -
> packages would have to match based on architecture and any dynamic
> linking requirements. So we wouldn't generate hashes that didn't
> include at least those minimums, and the package manager wouldn't
> search for them.
>
> Obviously you could do more (if you have 5 combinations of use flags,
> look for the set that matches most closely). That couldn't be done
> using hashes alone in an efficient way. You could have a small
> manifest file alongside the binary package that could be fetched
> separately if the package manager wants to narrow things down and
> fetch a few of those to narrow it down further.
|
All of the above is oriented toward multi-profile binhosts, so we'll
have to do a cost/benefit analysis to determine whether it's worth the
effort to introduce the complexity that multi-profile binhosts add.
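
To give a feel for the moving parts involved, the lookup proposed in
steps 3 and 4 of the quoted text could be sketched in Python as below.
This is only an illustration under my own assumptions: the attribute
names, the drop order, and the dict that stands in for the symlinks are
all hypothetical.

```python
import hashlib
import json

# Hypothetical drop order: CFLAGS is dropped first, then USE.
# ARCH is never dropped, so it is part of every digest.
DROP_ORDER = ["CFLAGS", "USE"]


def digest(attrs):
    """Hash a deterministic serialization of the remaining attributes."""
    normalized = {
        key: sorted(value) if isinstance(value, (list, tuple, set, frozenset)) else value
        for key, value in attrs.items()
    }
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def candidate_keys(config, max_drops):
    """Yield digests from most specific to least specific."""
    attrs = dict(config)
    yield digest(attrs)
    for name in DROP_ORDER[:max_drops]:
        attrs.pop(name, None)
        yield digest(attrs)


def publish(index, config, filename):
    """Binhost side: register the file under every digest (the 'symlinks')."""
    for key in candidate_keys(config, len(DROP_ORDER)):
        index.setdefault(key, filename)  # policy: never overwrite existing links


def lookup(index, config, max_drops):
    """Client side: try the full config first, then drop optional attributes."""
    for key in candidate_keys(config, max_drops):
        if key in index:
            return index[key]
    return None
```

Here max_drops plays the role of the user preference: 0 means exact
match only, 1 tolerates different CFLAGS, and 2 accepts anything built
for the same arch.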
|
> Or you could skip the hash searching and just fetch all the manifests
> for a particular package/arch and just search all of those, but that
> is more data to transfer just to do a query. A metadata cache of some
> kind might be another solution. Content hashes would probably
> still be useful just to allow co-existence of alternate builds.
|
This also relates to the centralized Packages file that's currently used
to distribute the package metadata for all packages in a binhost. We can
make it scale better if we split out a separate index per package, not
unlike a PyPI simple index:

https://pypi.org/simple/
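
As a rough sketch of what splitting the index could mean, the Python
below groups the entries of a simplified Packages-style file by
category/package. The sample data is invented, the stanza format is
reduced to a few "KEY: value" lines without the header that the real
file has, and the regular expression is a simplified stand-in for proper
version parsing.

```python
import re
from collections import defaultdict

# Hypothetical, heavily simplified index contents.
SAMPLE_INDEX = """\
CPV: app-misc/foo-1.0
BUILD_ID: 1
USE: ssl

CPV: app-misc/foo-1.0
BUILD_ID: 2
USE: -ssl

CPV: dev-libs/bar-2.3-r1
BUILD_ID: 1
"""

# Simplified assumption: the version starts at the last "-<digit>" component,
# optionally followed by a "-rN" revision.
_CPV_RE = re.compile(r"^(?P<cp>.+)-(?P<ver>\d[^-]*)(?:-r\d+)?$")


def parse_stanzas(text):
    """Yield one dict per blank-line-separated stanza."""
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, _, value = line.partition(": ")
            entry[key] = value
        yield entry


def split_by_package(text):
    """Group entries by category/package, i.e. one small index per package."""
    per_package = defaultdict(list)
    for entry in parse_stanzas(text):
        match = _CPV_RE.match(entry.get("CPV", ""))
        if match is None:
            continue  # skip stanzas without a parsable CPV
        per_package[match.group("cp")].append(entry)
    return dict(per_package)


per_package = split_by_package(SAMPLE_INDEX)
```

A client interested in app-misc/foo would then only need to fetch the
entries grouped under that key instead of the whole index.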

--
Thanks,
Zac