On Wed, Apr 1, 2020 at 5:14 AM Samuel Bernardo <samuelbernardo.mail@×××××.com> wrote:
|
> Hi Robin,
> On 4/1/20 6:36 AM, Robin H. Johnson wrote:
>
> Normally we don't bundle dependencies, avoiding that problem entirely.
> The Go eclasses however are badly designed, committed against protest by
> paid corporate interests, and serve only to facilitate large-scale
> copyright infringement and security vulnerabilities. If you're looking
> for a consistent explanation of how they're supposed to work with the
> rest of Gentoo, you won't find one.
>
> mjo: Can you please substantiate your claims?
>
> It would have been nice to have heard your concerns during February, any
> of one the three times that William and I posted the go-module.eclass
> EGO_SUM development work for review on this mailing list. I don't see a
> single email from you during that entire period.
> |
> The EGO_SUM support explicitly ensured that upstream distfiles (for each
> dependency) remained absolutely as upstream provided them, without
> merging the distfiles together or altering their content in way (I admit
> that the exact naming of the distfiles changed, because it was terrible,
> v0.0.0-20190311183353-d8887717615a.zip for example).
>
> Forgive my noobishness in this matter that let Alec to comment over my own
> statement.
> |
> Alec pointed out some very important issues in go development that break
> copyright infringement and security vulnerabilities, but I'm sure that is
> not related to the good work done in go-module.eclass to surpass all go
> mess. npm is worst and I take from go-module as a good pattern to apply
> also into there.
> |
I am antarus, not mjo (but more on that below!). I don't believe bundling
presents many challenges with regard to copyright infringement. As a
package maintainer you should know the licenses used in your packages, and
you are required to reflect all of them in the LICENSE ebuild variable.
Obviously this becomes more work with a bundle, because bundling pulls in
more code. In the golang ecosystem there is a tool to help maintainers do
this (https://packages.gentoo.org/packages/dev-go/golicense). I get that
with bundling we cannot reuse the license work done for previous packages,
because dependencies are not shared in a bundled environment, but I expect
the golicense tool to have good coverage in practice. If the tool does the
work, sharing the work becomes moot.
|
I think licensing can be more challenging in other bundling scenarios where
tooling is not provided; but note that this is not significantly different
from the unbundled scenario in terms of license discovery. If I am
packaging a new program (A) and it depends on (B,C,D) I have two options: I
can either package [A,B,C,D] (the normal Gentoo way) or I can package [A]
(with B,C,D bundled). Working out the combined set of licenses is the same
effort in both cases. The benefit of multiple packages is that future
users of B,C,D can reuse the license discovery work, and that isn't nothing.
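
To make the A/B/C/D comparison concrete, here is a minimal Python sketch. The
package names and license sets are invented for illustration and do not come
from any real tree; the point is only that the same licenses have to be
discovered either way, and what differs is where they get recorded.

```python
# Hypothetical license data: none of these names or licenses describe real packages.
LICENSES = {
    "A": {"Apache-2.0"},
    "B": {"MIT"},
    "C": {"BSD-3-Clause", "MIT"},
    "D": {"MPL-2.0"},
}

def bundled_license(app, deps):
    """Licenses for a single ebuild that ships app with deps bundled."""
    combined = set(LICENSES[app])
    for dep in deps:
        combined |= LICENSES[dep]
    return combined

def unbundled_licenses(app, deps):
    """One license set per ebuild when every dependency is packaged separately."""
    return {name: set(LICENSES[name]) for name in [app, *deps]}

bundled = bundled_license("A", ["B", "C", "D"])
separate = unbundled_licenses("A", ["B", "C", "D"])

# The union over the separate ebuilds equals the bundled set.
print(bundled == set().union(*separate.values()))
print(sorted(bundled))
```

In the unbundled case the entries for B, C and D can be reused by the next
package that depends on them; in the bundled case the combined set has to be
worked out again for each bundle.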
|
> Going back to my overlay use case, will go-modules download all modules to
> distfiles directory? The naming convention will assure that there will be
> no modules repetition?
>
> What about eclean-dist, will it work as expected for those modules
> dependencies?
>
> I think some of this answers would worth mention in documentation.
>
> Sorry for anything I wrongly stated and thank you very much for your help,
>
> Samuel
> |
I've chosen this part to write my treatise on packaging, but rest assured
it's mostly intended as a response to mgorny and mjo; not specifically in
response to you.
|
The very long answer is that Gentoo was designed around a paradigm of
programs written primarily in C. In C programs you have the ability to link
to libraries which offer APIs and in the ideal case, each API is offered
via a unique SONAME[0]. Upstream packages were written and built in this
way (with dynamic linking). So in the case of package A, that uses
libraries B, C, and D; the result in many distributions is 4 packages
(A,B,C,D) and users who want A will get B, C, and D installed. This in fact
was a major selling point of package managers at the time because finding
these dependencies by hand and building and merging them all was painful.
|
Many applications break this trend; I don't think golang or nodejs are
particularly new here (python and ruby have had (pip, venv) and rubygems[1]
for years, for example, which are similar bundling paradigms). The struggle
for packagers and distribution managers comes when upstream decides "my
software should be installed via a bundling solution (golang, node, pip,
rubygems, and so on)": we are left to decide whether to map this to the
ebuild paradigm (no bundling of dependencies) or omit ebuilds entirely. In
the former case we are often left working at odds with upstream (who are
confused by our decomposition of their application) and in the latter case
users often use the bundle anyway (e.g. they install the packages by hand
or use the ruby gems or whatever). I assert this is somewhat of a false
choice. Bundling isn't all bad and we can learn from past mistakes[2] to
try to avoid problems.
|
Another challenge with bundling is that bundling systems (bundler, pip,
venv, golang, etc.) often pin specific versions, commits, or tags. This
is fine when bundling (because each bundle has its own version of a
dependency in the bundle) but when you are trying to share a system-wide
package between N packages, you either need to SLOT the dependencies or
have a looser dependency specification. The fine-grained nature of the
upstream dependency specification can make this challenging[3].
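
A small Python sketch of why exact pins are awkward for a shared, system-wide
package. The application names, library names and versions are made up, and
"same major version is compatible" is an assumption chosen only to stand in
for a looser dependency specification.

```python
from collections import defaultdict

# Hypothetical pins: each application names the exact version it was tested with.
PINS = {
    "app-one": {"libfoo": "1.2.0", "libbar": "0.9.1"},
    "app-two": {"libfoo": "1.2.3", "libbar": "0.9.1"},
    "app-three": {"libfoo": "2.0.0"},
}

def versions_needed(pins):
    """Exact pins: every distinct version must be installable side by side."""
    needed = defaultdict(set)
    for deps in pins.values():
        for lib, version in deps.items():
            needed[lib].add(version)
    return dict(needed)

def slots_needed(pins):
    """Looser spec (assumption): one slot per major version is enough."""
    needed = defaultdict(set)
    for deps in pins.values():
        for lib, version in deps.items():
            needed[lib].add(version.split(".")[0])
    return dict(needed)

exact = versions_needed(PINS)
loose = slots_needed(PINS)
print({lib: len(v) for lib, v in exact.items()})  # copies needed with exact pins
print({lib: len(v) for lib, v in loose.items()})  # slots needed with the looser rule
```

With exact pins, libfoo needs three side-by-side copies for three consumers;
with the looser rule it needs two slots. The loosening is only safe if the
assumption about compatibility actually holds upstream.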
|
Unbundling then made it easier for system operators to operate a system;
and you see this often in the security space. A security notice will come
out saying "foo-X-Y-Z is vulnerable, move to foo-Y.1." So operators want to
know "do I have foo-X-Y-Z installed?" When every package is in the package
manager this is a trivial question. When software is bundled inside of a
package, this visibility is lost. I haven't seen any tooling for Gentoo
that addresses this problem.
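
As a rough illustration of what such visibility could look like for Go
bundles, here is a Python sketch that answers "which packages bundle module X
at version Y?" from go.sum-style text. It is not an existing Gentoo tool. The
package names, module paths, versions and hashes are invented, and the
BUNDLES inventory itself is an assumption: something would have to record
each package's go.sum at build time. A go.sum can also list modules that
never reach the final binary, so a check like this may over-report.

```python
# Hypothetical inventory: package name -> contents of the go.sum it was built from.
BUNDLES = {
    "app-misc/alpha": """\
example.org/foo v1.2.3 h1:AAAA=
example.org/foo v1.2.3/go.mod h1:BBBB=
example.org/bar v0.4.0 h1:CCCC=
""",
    "app-misc/beta": """\
example.org/foo v1.3.0 h1:DDDD=
example.org/foo v1.3.0/go.mod h1:EEEE=
""",
}

GO_MOD_SUFFIX = "/go.mod"

def bundled_modules(go_sum_text):
    """Return the set of (module, version) pairs listed in go.sum-style text."""
    modules = set()
    for line in go_sum_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        module, version, _digest = fields
        # A "/go.mod" suffix marks the hash of the go.mod file alone.
        if version.endswith(GO_MOD_SUFFIX):
            version = version[: -len(GO_MOD_SUFFIX)]
        modules.add((module, version))
    return modules

def packages_bundling(module, version, bundles):
    """Names of packages whose recorded go.sum lists module at version."""
    return sorted(
        pkg for pkg, text in bundles.items()
        if (module, version) in bundled_modules(text)
    )

print(packages_bundling("example.org/foo", "v1.2.3", BUNDLES))
```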
|
In addition to the above, bundling can present exciting resource challenges
for some deployments. Imagine a common dep (CommonFoo-x-y-z) has a security
problem, so we must upgrade to CommonFoo-y-z. In the scenario where
CommonFoo is a dynamically linked package we can recompile it once[4] and
consumers will just use the new dynamic shared object. In a bundling
scenario, we will be forced to rebuild[5] all consumers. This can take a
lot of time and resources depending on the deployment. Is the deployment
using a build farm? A binary packages host? How many disparate platforms
are in use?
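
The difference in rebuild cost can be sketched in a few lines of Python. The
dependency map is hypothetical, and the model is deliberately simplified: it
ignores the edge cases from footnote [4] where a shared-library upgrade still
forces consumers to be rebuilt.

```python
# Hypothetical dependency map: consumer -> libraries it uses.
USES = {
    "app-a": {"CommonFoo", "libbar"},
    "app-b": {"CommonFoo"},
    "app-c": {"libbar"},
    "app-d": {"CommonFoo"},
}

def rebuilds_after_fix(lib, uses, bundled):
    """Packages to rebuild after a fix to lib, under a simplified model."""
    if not bundled:
        # Shared object: rebuild the library once; consumers load the new copy.
        return [lib]
    # Bundled: every consumer carries its own copy and must be rebuilt.
    return sorted(pkg for pkg, deps in uses.items() if lib in deps)

print(rebuilds_after_fix("CommonFoo", USES, bundled=False))
print(rebuilds_after_fix("CommonFoo", USES, bundled=True))
```

The bundled list grows with the number of consumers, and on a deployment with
several platforms each of those rebuilds is repeated per platform.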
|
Which is to say many people in Gentoo dislike bundling for various reasons;
many of them legitimate. I wish to present a narrative where bundling is an
engineering trade-off, rather than a decision that is settled engineering
law. This doesn't mean Gentoo needs to support all the bundling (clearly
most people don't want to) but not supporting it means that many packages
will not be in Gentoo at all (because unbundling is too costly) and so you
end up at this exciting discussion which happens every couple of years.
|
-A
|
[0] I understand this is not always true in practice, but let's assume
spherical cows momentarily.
[1] Gentoo has a rubygems-fakegem eclass that makes it pretty streamlined
to make an ebuild for a particular gem, but of course if my application
depends on 20 gems I still need to make 20 ebuilds in this scheme and merge
them all. Rubygems-fakegem is still pretty good though!
[2] On Windows https://en.wikipedia.org/wiki/DLL_Hell was common. In the
.NET ecosystem, assemblies addressed some of these problems.
[3] Similar to DLL hell but more generically:
https://en.wikipedia.org/wiki/Dependency_hell
[4] Practice, of course, leads to all kinds of weird edge cases where
upgrading your shared lib causes dependencies to break for various reasons;
which is one reason why application authors like to bundle: their
application ends up being perceived as more reliable and less finicky.
[5] The number of package rebuilds in Gentoo is a fairly common complaint,
from my personal observation. Obviously binary packages make this problem
worse (not better). I dunno if it's something the community should put more
effort into or not though; my expectation is that rebuilds are common and
making them more common is not a strategic problem; but I'm also not
compiling on some single core atom, so what do I know, eh? :)