Hi,

I was working on adding a Go-based ebuild to Gentoo yesterday and I
got this warning from Portage saying that EGO_SUM is deprecated and
should be avoided. Since I remembered there was an intense discussion
about this on the ML, I went back and re-read the threads before
writing this piece. I'd like to provide my perspective as a user, a
proxied maintainer, and an overlay owner. I also run a private mirror
on my LAN to serve my hosts in order to reduce load on external mirrors.

Before diving in, I think it's worth reading mgorny's blog post "The
modern packager’s security nightmare"[1] as it's relevant to the
discussion, and something I deeply agree with.

With all that being said, I feel that the tarball idea is a bad one for
many reasons.

From a security point of view, I understand that we still have to trust
maintainers not to do funky stuff, but I think this issue goes beyond
that.

First of all, one of the advantages of Gentoo is that it gets its source
code from upstream (yes, I'm aware of mirrors acting as a cache layer),
which means that poisoning source code needs to be done at the upstream
level (which effectively means hacking GitHub, PyPI, or some standalone
project's Gitea/cgit/GitLab/etc. instance or similar). These are sources
which either have more scrutiny or have a limited blast radius.

Additionally, if an upstream dependency has a security issue, it's
easier to scan all EGO_SUM content, find the packages that potentially
depend on the broken dependency, and force a re-pinning and rebuild. The
tarball magic hides this completely and makes searching very expensive,
since every vendor tarball would have to be fetched and unpacked first.

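To illustrate how cheap such a scan can be, here is a rough Python
sketch. It assumes the usual go-module.eclass layout, where EGO_SUM is a
bash array of quoted "module version" and "module version/go.mod"
entries, and that no entry contains a closing parenthesis. The function
names are made up for this example; this is not an existing tool.

```python
import re
from pathlib import Path

# Assumption: EGO_SUM is written as a single bash array literal, EGO_SUM=( ... ).
EGO_SUM_RE = re.compile(r'EGO_SUM=\((?P<body>.*?)\)', re.DOTALL)
# One quoted entry: "module version" or "module version/go.mod".
ENTRY_RE = re.compile(r'"(?P<module>\S+)\s+(?P<version>v[^\s"/]+)(?:/go\.mod)?"')


def parse_ego_sum(ebuild_text):
    """Return the set of (module, version) pairs pinned in an ebuild's EGO_SUM."""
    match = EGO_SUM_RE.search(ebuild_text)
    if match is None:
        return set()
    return {(m["module"], m["version"]) for m in ENTRY_RE.finditer(match["body"])}


def find_affected(repo_root, module, bad_versions):
    """Yield ebuild paths under repo_root that pin module at one of bad_versions."""
    for ebuild in sorted(Path(repo_root).rglob("*.ebuild")):
        pins = parse_ego_sum(ebuild.read_text(errors="replace"))
        if any(mod == module and ver in bad_versions for mod, ver in pins):
            yield ebuild
```

Something like find_affected("/var/db/repos/gentoo", "github.com/foo/bar",
{"v1.2.3"}) would then list the candidate ebuilds (the module name and
version here are placeholders). Doing the same against vendor tarballs
means downloading and unpacking each one before anything can be checked.
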
In fact, using these vendor tarballs is the equivalent of "static
linking" in the packaging space. Why are we introducing the same issue
in the repository space? This kills the reusability of already
downloaded dependencies and bloats storage requirements. This is
especially bad on laptops, where free SSD space might be limited, if
the user does not nuke their distfiles after each upgrade.

Considering that BTRFS (and possibly other filesystems) supports
on-the-fly compression, the physical cost of a few inflated ebuilds and
Manifests is actually way smaller than the logical size would indicate.
Compare that to the huge incompressible tarballs that we now need to
store.

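A quick way to get a feel for this difference is the Python sketch
below. Everything in it is synthetic: the module names and versions are
made up, zlib merely stands in for whatever algorithm the filesystem is
configured to use, and random bytes stand in for an already-compressed
vendor tarball. It is an illustration, not a measurement of the real
tree.

```python
import os
import zlib


def compressed_fraction(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 6)) / len(data)


def synthetic_ego_sum(entries: int) -> bytes:
    """Build EGO_SUM-style text with made-up module names and versions."""
    lines = ["EGO_SUM=("]
    for i in range(entries):
        module = f"github.com/example/module{i % 50}"  # hypothetical module path
        version = f"v1.{i % 7}.{i % 13}"               # hypothetical version
        lines.append(f'\t"{module} {version}"')
        lines.append(f'\t"{module} {version}/go.mod"')
    lines.append(")")
    return "\n".join(lines).encode()


text = synthetic_ego_sum(500)
# Assumption: random bytes approximate data that is already compressed.
blob = os.urandom(len(text))

print(f"EGO_SUM-like text: {compressed_fraction(text):.2f}")
print(f"tarball stand-in:  {compressed_fraction(blob):.2f}")
```

Real Manifest entries also carry checksums, which compress less well
than the ebuild text, so actual numbers will differ from this toy case.
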
As a proxied maintainer or overlay owner, hosting these huge tarballs
also becomes a problem (i.e. we need some public space with potentially
gigabytes of free space and enough bandwidth to push that to users).
Pushing toward vendor tarballs creates an extra expense on every level
(Gentoo infra, mirrors, proxy maintainers, overlay owners, users).

If bloating Portage is a big issue and we frown upon Go stuff anyway (or
only a few users need these packages), why not consider moving all Go
packages into an officially supported, Go-packages-only overlay? I
understand that this would not solve the kernel buffer issue where we
run out of environment variable space, but it would debloat the main
Portage tree.

The tarball approach also breaks reproducibility. With EGO_SUM I can
check out an older version of the Portage tree (well, to some extent)
and rebuild packages, since upstream dependencies are very likely to
still host old versions of their source. With the tarballs this breaks,
since as soon as an ebuild is dropped from mainline Portage its vendor
tarball goes with it. There is no way for the user to roll a package
back a few weeks (e.g. if the new version has bugs), unlike with
EGO_SUM.

In fact, I feel this goes against the spirit of Portage too, since
instead of "just describing" how to obtain sources and build them, an
ebuild now depends on essentially ephemeral blobs, which happen to be
externalized from the Portage tree itself. I'm aware that we have
ebuilds that pull in patches and other stuff from dev space already, but
we shouldn't make this even worse.

Finally, with EGO_SUM we had a nice tool, get-ego-vendor, which produced
the EGO_SUM for maintainers and made maintenance easier. However, I
haven't found any new guidance yet on how to maintain Go packages with
the new tarball method (e.g. what needs to go into the vendor tarball,
what changes are needed in ebuilds). Overall, this further complicates
ebuild development and verification of PRs.

In summary, IMHO the EGO_SUM way of handling Go packages has more
benefits than drawbacks compared to the vendor tarballs.

Cheers,
Zoltan

[1] https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/