Hi All,

This doesn't directly affect me, nor am I familiar with the mechanisms.

Perhaps it's worthwhile to suggest that EGO_SUM itself may be
externalized. I don't know exactly what goes into it, and this will
likely require help from portage itself, so it may not be directly
viable.

What if portage had a feature whereby a SRC_URI list could be downloaded
as a SRC_URI itself? In other words:

SRC_URI_INDIRECT="https://wherever/lists_for_some_go_package.txt"

Where that file itself contains lines for entries that would normally go
into SRC_URI (directly or indirectly via EGO_SUM from what I can
deduce). Something like:

https://www.upstream.com/downloads/package-version.tar.gz => fneh.tar.gz|manifest portion goes here

Here the manifest portion would take DIST and fneh.tar.gz as given, so
it would start with the file size in bytes, followed by checksum/value
pairs as per current Manifest files.
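
To make the proposed line format concrete, here is a rough Python sketch
of how one such entry might be parsed. It assumes each entry sits on a
single line of the form "URL => FILENAME|SIZE HASH VALUE [HASH VALUE ...]".
The names IndirectEntry and parse_indirect_line are made up for
illustration; nothing like this exists in portage today.

```python
from dataclasses import dataclass


@dataclass
class IndirectEntry:
    url: str         # upstream download location
    filename: str    # name the file would be stored under (hypothetical)
    size: int        # file size in bytes
    checksums: dict  # hash name -> hex digest, e.g. {"SHA512": "..."}


def parse_indirect_line(line: str) -> IndirectEntry:
    """Parse one 'URL => FILENAME|SIZE HASH VALUE ...' entry (assumed format)."""
    url, sep, rest = line.strip().partition(" => ")
    if not sep:
        raise ValueError(f"missing ' => ' separator: {line!r}")
    filename, sep, manifest = rest.partition("|")
    if not sep:
        raise ValueError(f"missing '|' separator: {line!r}")
    fields = manifest.split()
    # Expect the size followed by at least one complete hash/value pair.
    if len(fields) < 3 or len(fields) % 2 == 0:
        raise ValueError(f"expected size plus hash/value pairs: {line!r}")
    size = int(fields[0])
    checksums = dict(zip(fields[1::2], fields[2::2]))
    return IndirectEntry(url.strip(), filename.strip(), size, checksums)
```

Splitting on " => " first and on "|" second keeps the URL intact even if
it contains other punctuation, though a real implementation would need
to decide how to handle URLs that contain those separators themselves.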

Since users may want to know how big the downloads for a specific ebuild
are, some process to generate these external manifests may be in order,
and to subsequently store the size of these indirect downloads
themselves in the local Manifest. So in the local Manifest, something
like:

IDIST lists_for_some_go_package.txt direct_size indirect_size CHECKSUM value CHECKSUM value
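
As a rough sketch of what generating such an entry could look like,
assuming direct_size is the size of the list file itself and
indirect_size is the sum of the sizes recorded in it: the function name
idist_line is hypothetical, and BLAKE2B plus SHA512 are used only
because those are the hashes in current Manifest files.

```python
import hashlib


def idist_line(list_name: str, list_bytes: bytes) -> str:
    """Build a hypothetical IDIST Manifest line for an indirect list file."""
    direct_size = len(list_bytes)  # size of the list file itself
    indirect_size = 0              # total size of the files it points to
    for raw in list_bytes.decode("utf-8").splitlines():
        line = raw.strip()
        if not line:
            continue
        # Everything after '|' is the manifest portion; its first field
        # is assumed to be the size of that download in bytes.
        fields = line.partition("|")[2].split()
        if not fields:
            raise ValueError(f"no manifest portion in: {line!r}")
        indirect_size += int(fields[0])
    blake2b = hashlib.blake2b(list_bytes).hexdigest()
    sha512 = hashlib.sha512(list_bytes).hexdigest()
    return (f"IDIST {list_name} {direct_size} {indirect_size} "
            f"BLAKE2B {blake2b} SHA512 {sha512}")
```

With both sizes in the local Manifest, the total download size could be
shown before the list file itself has been fetched.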

I realise this idea isn't immediately feasible, and perhaps not at all;
I present it here since it could perhaps spark an idea for someone else.
It sounds like this is the problem that the vendor tarball tries to
solve, but that introduces a trust issue. I'm not sure the trust issue
goes away entirely here, but at a minimum we're now verifying download
locations again (as per EGO_SUM or just SRC_URI in general) rather than
code tarballs containing many, many times more code than download
locations.

Given:

jkroon@plastiekpoot ~ $ du -sh /var/db/repos/gentoo/
644M /var/db/repos/gentoo/

I'm not against exploding this by another 200 or even 300 MB personally,
but I do agree that pointless bloat is bad, and ideally we want to
shrink the size requirements of the portage tree rather than enlarge
them.

Kind Regards,
Jaco

On 2022/09/30 15:57, Florian Schmaus wrote:

> On 28/09/2022 23.23, John Helmert III wrote:
>> On Wed, Sep 28, 2022 at 05:28:00PM +0200, Florian Schmaus wrote:
>>> I would like to continue discussing whether we should entirely
>>> deprecate EGO_SUM without the desire to offend anyone.
>>>
>>> We now have a pending GitHub PR that bumps restic to 0.14 [1].
>>> Restic is a very popular backup software written in Go. The PR
>>> drops EGO_SUM in favor of a vendor tarball created by the proxied
>>> maintainer. However, I am unaware of any tool that lets you
>>> practically audit the 35 MiB source contained in the tarball. And
>>> even if such a tool exists, this would mean another manual step is
>>> required, which is, potentially, skipped most of the time,
>>> weakening our users' security. This is because I believe neither
>>> our tooling, e.g., go-mod.eclass, nor any Golang tooling
>>> authenticates the contents of the vendor tarball against
>>> upstream's go.sum. But please correct me if I am wrong.
>>>
>>> I wonder if we can reach consensus around un-deprecating EGO_SUM,
>>> but discouraging its usage in certain situations. That is, provide
>>> EGO_SUM as an option but disallow its use if
>>> 1.) *upstream* provides a vendor tarball
>>> 2.) the number of EGO_SUM entries exceeds 1000 and a Gentoo developer
>>>     maintains the package
>>> 3.) the number of EGO_SUM entries exceeds 1500 and a proxied maintainer
>>>     maintains the package
>>
>> I'm not sure I agree on these limits, given the authenticity problem
>> exists regardless of how many dependencies there are.
>
> It's not really about authentication; you always have to trust
> upstream to some degree (unless you audit every line of code). But I
> believe that code distributed via official channels is viewed by more
> eyes and is therefore significantly more secure.
>
> EGO_SUM entries are directly fetched from the official distribution
> channels of Golang. Hence, there is a higher chance that malicious
> code in one of those is detected faster, simply because they are
> consumed by more entities. Compare that to the dependency tarball,
> which is used only by Gentoo. In contrast to the official sources,
> "nobody" is looking at the code inside the tarball.
>
> For proxied packages, where the dependency tarball is published by the
> proxied maintainer, the tarball also allows another entity to inject
> code into the final result of the package. And compared to a few small
> patches in FILESDIR, such a dependency tarball requires more effort to
> review. This further weakens security in comparison to EGO_SUM.
>
> - Flow