Gentoo Archives: gentoo-dev

From: Zoltan Puskas <zoltan@×××××××××.info>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Proposal to undeprecate EGO_SUM
Date: Sun, 26 Jun 2022 23:43:34
Message-Id: 1a712a66f55e241ce6b6084eb19e1f34@sinustrom.info
In Reply to: [gentoo-dev] Proposal to undeprecate EGO_SUM by Florian Schmaus
1 Hi,
2
3 Yesterday I was working on adding a Go-based ebuild to Gentoo and I
4 got a warning from Portage saying that EGO_SUM is deprecated and
5 should be avoided. Since I remember there was an intense discussion
6 about this on the ML, I went back and re-read the threads before
7 writing this piece. I'd like to provide my perspective as a user, a
8 proxied maintainer, and an overlay owner. I also run a private mirror on
9 my LAN to serve my hosts in order to reduce load on external mirrors.
10
11 Before diving in I think it's worth reading mgorny's blog post "The
12 modern packager’s security nightmare"[1] as it's relevant to the
13 discussion, and something I deeply agree with.
14
15 With all that being said, I feel that the tarball idea is a bad one
16 for many reasons.
17
18 From a security point of view, I understand that we still have to trust
19 maintainers not to do funky stuff, but I think this issue goes beyond
20 that.
21
22 First of all, one of the advantages of Gentoo is that it gets its source
23 code from upstream (yes, I'm aware of mirrors acting as a cache layer),
24 which means that poisoning source code needs to be done at the upstream
25 level (effectively this means hacking GitHub, PyPI, or some standalone
26 project's Gitea/cgit/GitLab/etc. instance or similar), sources which
27 either have more scrutiny or have a limited blast radius.
28
29 Additionally if an upstream dependency has a security issue it's easier
30 to scan all EGO_SUM content and find packages that potentially depend on
31 a broken dependency and force a re-pinning and rebuild. The tarball
32 magic hides this completely and makes searching very expensive.
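
To illustrate the scanning point above, here is a rough sketch of what
such a search could look like. It assumes each EGO_SUM entry is a quoted
"module version" or "module version/go.mod" string inside a bash array;
the function names and the example module are hypothetical, not part of
any existing Gentoo tool.

```python
import re
from pathlib import Path

# Assumption: EGO_SUM is a bash array of quoted "module version" strings.
EGO_SUM_RE = re.compile(r'EGO_SUM=\(\s*(.*?)\s*\)', re.DOTALL)


def parse_ego_sum(ebuild_text):
    """Return the set of (module, version) pairs pinned in EGO_SUM."""
    match = EGO_SUM_RE.search(ebuild_text)
    if match is None:
        return set()
    pins = set()
    for line in match.group(1).splitlines():
        line = line.strip().strip('"')
        if not line or line.startswith('#'):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        module, version = parts[0], parts[1]
        # "v1.2.3/go.mod" entries refer to the same module version.
        if version.endswith('/go.mod'):
            version = version[:-len('/go.mod')]
        pins.add((module, version))
    return pins


def find_affected(repo_root, module, bad_versions):
    """Return sorted ebuild paths that pin `module` at a known-bad version."""
    hits = []
    for ebuild in Path(repo_root).rglob('*.ebuild'):
        text = ebuild.read_text(encoding='utf-8', errors='replace')
        pins = parse_ego_sum(text)
        if any(m == module and v in bad_versions for m, v in pins):
            hits.append(str(ebuild))
    return sorted(hits)
```

Something like find_affected("/var/db/repos/gentoo",
"github.com/example/lib", {"v1.2.3"}) would then list the ebuilds that
need re-pinning (the repository path and module name are only examples).
With vendor tarballs the same check means downloading and unpacking
every tarball first.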
33
34 In fact using these vendor tarballs is the equivalent of "static
35 linking" in the packaging space. Why are we introducing the same issue
36 in the repository space? This kills the reusability of already
37 downloaded dependencies and bloats storage requirements. This is
38 especially bad on laptops, where SSD free space might be limited, in
39 case the user does not nuke their distfiles after each upgrade.
40
41 Considering that BTRFS (and possibly other filesystems) supports
42 on-the-fly compression, the physical cost of a few inflated ebuilds and
43 Manifests is actually way smaller than the logical size would indicate.
44 Compare that to the huge incompressible tarballs that we now need to
45 store.
46
47 As a proxied maintainer or overlay owner, hosting these huge tarballs
48 also becomes a problem (i.e. we need some public space with potentially
49 gigabytes of free space and enough bandwidth to push that to users).
50 Pushing toward vendor tarballs creates an extra expense on every level
51 (Gentoo infra, mirrors, proxy maintainers, overlay owners, users).
52
53 If bloating Portage is a big issue and we frown upon Go stuff anyway (or
54 only a few users need these packages), why not consider moving all Go
55 packages into an officially supported, Go-only overlay? I understand
56 that this would not solve the kernel buffer issue where we run out of
57 environment variable space, but it would debloat the main Portage
58 tree.
59
60 The tarball approach also breaks reproducibility. With EGO_SUM I can
61 check out an older version of the Portage tree (well, to some extent) and
62 rebuild packages, since upstreams are very likely to still host old
63 versions of their source. With the tarballs this breaks, since as soon as
64 an ebuild is dropped from the main tree its vendor tarball goes with it.
65 There is no way for the user to roll a package back a few weeks (e.g. if
66 the new version has bugs), unlike with EGO_SUM.
67
68 In fact, I feel this goes against the spirit of Portage too, since
69 instead of "just describing" how to obtain sources and build them, an
70 ebuild now depends on essentially ephemeral blobs, which happen to be
71 externalized from the Portage tree itself. I'm aware that we have
72 ebuilds that pull in patches and other stuff from dev space already, but
73 we shouldn't make this even worse.
74
75 Finally, with EGO_SUM we had a nice tool, get-ego-vendor, which produced
76 the EGO_SUM for maintainers and made maintenance easier. However, I
77 haven't found any new guidance yet on how to maintain Go packages with
78 the new tarball method (e.g. what needs to go into the vendor tarball,
79 what changes are needed in ebuilds). Overall this further complicates
80 ebuild development and verification of PRs.
81
82 In summary, IMHO the EGO_SUM way of handling Go packages has more
83 benefits than drawbacks compared to the vendor tarballs.
84
85 Cheers,
86 Zoltan
87
88 [1]
89 https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/