On Tuesday, November 5, 2019 7:05 PM, Mickaël Bucas <mbucas@×××××.com> wrote:
> Hi Caveman
>
> The Portage tree contains a few binary packages prepared by Gentoo
> developers, like Firefox, Rust, LibreOffice...
> "ls -d /usr/portage/*/*-bin" shows about 90 packages prepared in this
> way, some of them because they are non-free like Oracle JDK
>
> This means that there is no necessary changes to Gentoo to accomplish
> what you describe : compile the packages, write the ebuilds for the
> binary packages, publish ebuilds in an overlay.

Some Qt-related packages are really slow to compile, yet are still not on that
list. IMO the problem with this approach is that it's too manual and doesn't
react dynamically to changes in what users need.

IMO what I'm describing can be seen as an automated, community-driven bin-host
that uses statistics to tell which packages are reliable. In case of a hardware
mismatch, I think we can find a binary that's built with the desired USE flags,
say, but compiled for an older CPU model that the newer, rarer CPU one might be
using is backward compatible with.

> But the really short list above shows that it's a really complex task
> because of all dependencies and configurable elements in Gentoo. If
> you just have a look at the output of "emerge --info" you can imagine
> all the moving parts, like compiler versions and compile options,
> Bash, Perl, Python, Init system, USE flags (combinatorial), even human
> languages. And that is just the easily visible parts !

True, however a few points:

* If we look at that info from the perspective of individual packages, there
  are far fewer degrees of variation in practice. E.g. if we look at the USE
  flags dimension, dev-qt/qtwebengine has 12 of them, so worst case for this
  aspect we get about:

      nchoosek(12,1) + nchoosek(12,2) + ... + nchoosek(12,12) = 4095

  possible combinations with those 12 flags (i.e. 2^12 - 1, not counting the
  case where no flag is set). But most people are only interested in 2 sets of
  potential USE flag configurations, one with ALSA and another with PA. So in
  practice, that 4095 is probably reduced to just 2 or 3 clusters of
  configurations.

* For hardware details, such as the exact CPU model and the kinds of features
  actually enabled by the compiler when using `-march=native`: I don't know
  the actual distribution of these in practice, but couldn't users be given
  the choice to simply pick a binary that's compiled for an older CPU that
  theirs is backward compatible with?

  E.g. the system could offer the user the nearest match to his query (e.g. in
  selection of USE flags), by presenting a binary compiled for an older x86-64
  CPU model than his newer x86-64 CPU.
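
To make the USE flag point above concrete, here is a rough sketch of the
counting and of how reported configurations might collapse into a few clusters.
The report data, the flag names and the `min_share` threshold are made up for
illustration; nothing like this exists in Portage today.

```python
from collections import Counter
from math import comb

N_FLAGS = 12  # flag count quoted above for dev-qt/qtwebengine


def nonempty_flag_sets(n_flags):
    """Count the non-empty subsets of n_flags USE flags."""
    return sum(comb(n_flags, k) for k in range(1, n_flags + 1))


def popular_configs(reports, min_share):
    """Return (config, share) pairs whose share of all reports is >= min_share."""
    counts = Counter(reports)
    total = len(reports)
    return [
        (cfg, n / total)
        for cfg, n in counts.most_common()
        if n / total >= min_share
    ]


# Hypothetical user reports; the flag names are illustrative only.
reports = (
    [frozenset({"alsa"})] * 6
    + [frozenset({"pulseaudio"})] * 3
    + [frozenset({"alsa", "pulseaudio", "debug"})] * 1
)

print(nonempty_flag_sets(N_FLAGS))
for cfg, share in popular_configs(reports, min_share=0.2):
    print(sorted(cfg), share)
```

With these made-up reports only two of the possible configurations pass the
threshold, which is the kind of reduction I'd expect, though real data might
look different.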

This way, this could simply become an automated bin-host that blurs
configurations as necessary and forks variations of specific configurations as
demand rises, all without needing dev time to package *-bin manually.
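
A minimal sketch of how such a bin-host might pick the "nearest" binary,
assuming each binary records the CPU features it was built to use and its USE
flags. The `Binary` class, the feature names and the tie-breaking rule are my
own assumptions, not an existing Portage interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    name: str
    cpu_features: frozenset  # CPU features the build relies on (assumed metadata)
    use_flags: frozenset     # USE flags the build was made with


def compatible(binary, host_features):
    """A binary can run if the host CPU offers every feature it relies on."""
    return binary.cpu_features <= host_features


def nearest(binaries, host_features, wanted_use):
    """Pick the compatible binary whose USE flags differ least from the request.

    Ties are broken in favour of the binary using more CPU features.
    Returns None when no binary is compatible.
    """
    candidates = [b for b in binaries if compatible(b, host_features)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda b: (len(b.use_flags ^ wanted_use), -len(b.cpu_features)),
    )


# Hypothetical host and binaries, for illustration only.
host = frozenset({"sse2", "sse4_2", "avx", "avx2"})
binaries = [
    Binary("old-alsa", frozenset({"sse2"}), frozenset({"alsa"})),
    Binary("mid-pa", frozenset({"sse2", "sse4_2", "avx"}), frozenset({"pulseaudio"})),
    Binary("new-pa", frozenset({"sse2", "avx512f"}), frozenset({"pulseaudio"})),
]

choice = nearest(binaries, host, frozenset({"pulseaudio"}))
print(choice.name if choice else "no compatible binary")
```

Here "new-pa" is skipped because the host lacks one of its features, and
"mid-pa" wins over "old-alsa" because its USE flags match the request exactly.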


> I remember reading an article about a man trying to reproduce binary
> packages of a binary distribution and failing to do so, because there
> are so many parts involved. I've read later that distributions have
> done some work to have reproducible builds, but I'm not sure how
> successful they are, even when all choices are predefined.
>
> Given that Gentoo has taken a whole different road by having more
> choices available to the user, I don't think the compilation results
> of one configuration would be easily used on another.

Is it possible to collect statistics of such configurations from Gentoo users?

I don't know what the outcome would be, but I think it's worth exploring. E.g.
what if it turned out that there is not much diversity in our settings, and we
can find a few really popular clusters of USE flags, languages, and licenses?
As for hardware, what would be the latest CPU that mine is backward compatible
with and that has compiled a binary for me with enough statistical confidence
in its reliability?
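
One possible way to put a number on "statistical confidence in its
reliability" is the lower bound of the Wilson score interval over user reports.
That choice of statistic is only my assumption, as is the idea that users
report each install as working or broken.

```python
from math import sqrt


def wilson_lower_bound(successes, total, z=1.96):
    """Lower bound of the Wilson score interval for a success rate.

    z=1.96 corresponds to roughly 95% confidence. With no reports at all
    the bound is 0.0, i.e. nothing is known about the binary yet.
    """
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


# Hypothetical report counts: both binaries have a 100% success rate so far,
# but the one with more reports earns a much higher lower bound.
print(wilson_lower_bound(5, 5))
print(wilson_lower_bound(500, 500))
```

The point of a bound like this is that a binary with 5 good reports out of 5
does not rank as highly as one with 500 out of 500.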


> To go even further, pushing your compiled packages to a public server
> may create a security risk by exposing many parts of your
> configuration that could be analyzed by malicious people.

Any example of such sensitive information that might be in the binaries? Just
curious, as I don't know much about this.

I could be wrong, but so far my thought is that we don't gain many bits of
entropy for our security by hiding our package lists, because an adversary can
probably already use statistics to predict the common clusters of package lists
that we might use.

So I personally doubt that attackers would face much difficulty from not
knowing our packages, because they are probably already predictable: our
distribution of packages is not that diverse.
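
To illustrate what I mean by bits of entropy, here is a small sketch using
Shannon entropy over the share of users in each package-list cluster. The
shares are invented; I have no real data on how Gentoo users are distributed.

```python
from math import log2


def entropy_bits(shares):
    """Shannon entropy, in bits, of a distribution given as shares summing to 1."""
    return -sum(p * log2(p) for p in shares if p > 0)


# Hypothetical distributions of users over package-list clusters.
uniform = [0.25, 0.25, 0.25, 0.25]  # four equally likely clusters
skewed = [0.7, 0.2, 0.1]            # a few popular clusters dominate

print(entropy_bits(uniform))
print(entropy_bits(skewed))
```

If most of us fall into a handful of popular clusters, an attacker guessing
our package list is uncertain by only a bit or two, so hiding the list buys
little.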


> So far I don't see a really big advantage in building this kind of
> infrastructure compared to either a binary distribution or Gentoo with
> home compilation.

IMO the real value is that it will be a kind of automated, community-driven
bin-host that uses statistics to quantify the reliability of its bins, and to
automatically create bins of special cases as demand rises (e.g. common USE
flag combinations that become trendy), without needing to wait for a package
maintainer to bundle a *-bin package.

I think, if this works, it may make Gentoo even better at binary packages than
the mainly binary distros.


rgrds,
cm