* [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
@ 2025-01-12 12:56 Michał Górny
2025-01-12 13:15 ` Agostino Sarubbo
2025-01-13 9:40 ` Florian Schmaus
0 siblings, 2 replies; 10+ messages in thread
From: Michał Górny @ 2025-01-12 12:56 UTC
To: gentoo-dev; +Cc: Michał Górny
Emit a QA warning suggesting the use of a crate tarball, when the package
in question uses 300 crates or more. Such long crate lists cause
ebuilds and Manifests to grow very fast, causing significant space
consumption on end user systems (including users who are not using
the package in question) and git history growth. On top of that,
fetching that many crates takes significant time.

The number of 300 is pretty arbitrary, chosen approximately to match
Manifests that are over 100 KiB in size. We should probably look into
lowering it in the future, as more packages are transitioned.
---
eclass/cargo.eclass | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/eclass/cargo.eclass b/eclass/cargo.eclass
index b1285e13a5b2..c8dd7c51bcfe 100644
--- a/eclass/cargo.eclass
+++ b/eclass/cargo.eclass
@@ -527,6 +527,12 @@ cargo_src_unpack() {
 		done < <(sha256sum -z "${crates[@]}" || die)
 		popd >/dev/null || die
+
+		if [[ ${#crates[@]} -ge 300 ]]; then
+			eqawarn "This package uses a very large number of CRATES. Please provide"
+			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
+			eqawarn "'pycargoebuild --crate-tarball' to create one."
+		fi
 	fi
 	cargo_gen_config
--
2.48.0
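
As a rough cross-check of the "300 crates is about 100 KiB of Manifest" figure, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate, not a measurement: it assumes each crate contributes one `DIST` line with BLAKE2B and SHA512 hashes (128 hex characters each), and the average file name length (24) and size field width (6 digits) are assumed values chosen for illustration. The function name `estimate_manifest_bytes` is hypothetical.

```bash
# Estimate the Manifest size contributed by a number of crates.
# Assumed line layout: DIST <file> <size> BLAKE2B <128 hex> SHA512 <128 hex>
estimate_manifest_bytes() {
	local crates=$1
	local name_len=${2:-24}   # assumed average length of "name-version.crate"
	local size_len=${3:-6}    # assumed number of digits in the size field
	# Keywords, file name, size and two 128-character hashes,
	# plus 6 separating spaces and 1 trailing newline.
	local per_line=$(( 4 + name_len + size_len + 7 + 128 + 6 + 128 + 6 + 1 ))
	echo $(( crates * per_line ))
}

estimate_manifest_bytes 300   # prints 93000
```

Under these assumptions, 300 crates come to roughly 91 KiB, which is in the same range as the 100 KiB mentioned in the commit message.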
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Agostino Sarubbo @ 2025-01-12 13:15 UTC
To: gentoo-dev; +Cc: Michał Górny
On Sunday, 12 January 2025 13:56:39 CET Michał Górny wrote:
> +		if [[ ${#crates[@]} -ge 300 ]]; then
> +			eqawarn "This package uses a very large number of CRATES. Please provide"
> +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
> +			eqawarn "'pycargoebuild --crate-tarball' to create one."
> +		fi
I would suggest using the "QA Notice: " prefix if you want to have
them reported.
Agostino
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Alexey Sokolov @ 2025-01-12 14:30 UTC
To: gentoo-dev
On 12.01.2025 13:15, Agostino Sarubbo wrote:
> On Sunday, 12 January 2025 13:56:39 CET Michał Górny wrote:
> > +		if [[ ${#crates[@]} -ge 300 ]]; then
> > +			eqawarn "This package uses a very large number of CRATES. Please provide"
> > +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
> > +			eqawarn "'pycargoebuild --crate-tarball' to create one."
> > +		fi
>
> I would suggest using the "QA Notice: " prefix if you want to have them reported.
>
> Agostino
Side question: maybe eqawarn should add such a prefix automatically?
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Ionen Wolkens @ 2025-01-12 21:20 UTC
To: gentoo-dev
On Sun, Jan 12, 2025 at 02:30:10PM +0000, Alexey Sokolov wrote:
> On 12.01.2025 13:15, Agostino Sarubbo wrote:
> > On Sunday, 12 January 2025 13:56:39 CET Michał Górny wrote:
> > > +		if [[ ${#crates[@]} -ge 300 ]]; then
> > > +			eqawarn "This package uses a very large number of CRATES. Please provide"
> > > +			eqawarn "a crate tarball instead and fetch it via SRC_URI. You can use"
> > > +			eqawarn "'pycargoebuild --crate-tarball' to create one."
> > > +		fi
> >
> > I would suggest using the "QA Notice: " prefix if you want to have them reported.
> >
> > Agostino
>
> Side question: maybe eqawarn should add such a prefix automatically?
In the context of automatically filing bugs, sometimes we also want to
warn about low-priority things (e.g. either just something to be aware of
or something to ideally fix on a bump when one happens to see the warning)
without filing a hundred bugs.
So the question is more whether we want this to happen here or not, and
put pressure on maintainers (incl. proxied) to fix it asap.
From a technical standpoint, eqawarn would need to know when it's the
"header" of a notice (like optfeature_header), given we often have
several eqawarn calls in a row and "QA Notice:" on each line would be weird.
This means needing to modify all usages of it anyway, which doesn't bring
much vs. just inlining it, unless we wanted to do something more special
with this.
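
To make the "header" idea concrete, here is a minimal sketch of what such a wrapper might look like. The helper name `eqawarn_notice` is hypothetical and does not exist in Portage or any eclass, and the `eqawarn` defined here is only a stand-in so the snippet runs outside an ebuild environment.

```bash
# Stand-in for the real eqawarn, so the sketch runs outside Portage.
eqawarn() { echo " * $*" >&2; }

# Hypothetical helper: only the first line gets the "QA Notice: " prefix,
# the remaining lines are passed through unchanged.
eqawarn_notice() {
	local first=1 line
	for line in "$@"; do
		if [ "$first" -eq 1 ]; then
			eqawarn "QA Notice: ${line}"
			first=0
		else
			eqawarn "${line}"
		fi
	done
}

eqawarn_notice \
	"This package uses a very large number of CRATES. Please provide" \
	"a crate tarball instead and fetch it via SRC_URI."
```

As noted above, every existing caller would still have to be converted to something like this, so it saves little over writing the prefix inline.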
--
ionen
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Florian Schmaus @ 2025-01-13 9:40 UTC
To: gentoo-dev, Michał Górny
On 12/01/2025 13.56, Michał Górny wrote:
> Emit a QA warning suggesting the use of a crate tarball, when the package
> in question uses 300 crates or more. Such long crate lists cause
> ebuilds and Manifests to grow very fast, causing significant space
> consumption on end user systems (including users who are not using
> the package in question) and git history growth. On top of that,
> fetching that many crates takes significant time.
>
> The number of 300 is pretty arbitrary, chosen approximately to match
> Manifests that are over 100 KiB in size. We should probably look into
> lowering it in the future, as more packages are transitioned.
Thanks for your proposal. I know you wrote it because Gentoo is
important to you.
I am sorry, however, but the arbitrary limit you propose is harmful, and
its necessity is questionable.
It is unnecessary, at least in its current form, because the size growth
of Gentoo's package repository is manageable. See the previous analysis
for EGO_SUM [1].
What is more worrisome, however, is that it is harmful.
First, switching from individual crates to a single crate tarball
disallows inter-package crate archive reuse. Often, users will already
have the required crates downloaded because another installed package
used them. With an artificial crate count limit, users must download
rather large crate tarballs, causing unnecessary traffic and increasing
the disk space on Gentoo's mirrors and end-user systems. The crate
tarballs quickly eat away the saved disk space in the ebuild repository.
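
How much reuse there is between two given packages could be checked with something like the sketch below. It assumes each package's crate list has been written to a file with one `name@version` entry per line; the function name `shared_crates` and the crate versions in the usage example are made up for illustration.

```bash
# Print the crates (name@version) that appear in both lists.
# $1 and $2: files with one "name@version" entry per line.
shared_crates() {
	local a b
	a=$(mktemp) && b=$(mktemp) || return 1
	# comm needs sorted input; -u also drops duplicate entries.
	sort -u "$1" > "$a"
	sort -u "$2" > "$b"
	# -12 suppresses lines unique to either file, leaving the overlap.
	comm -12 "$a" "$b"
	rm -f "$a" "$b"
}

# Usage with two made-up lists:
list_a=$(mktemp); list_b=$(mktemp)
printf 'libc@0.2.0\nserde@1.0.0\nsyn@2.0.0\n' > "$list_a"
printf 'serde@1.0.0\nlibc@0.2.0\nrand@0.8.0\n' > "$list_b"
shared_crates "$list_a" "$list_b"
rm -f "$list_a" "$list_b"
```

Only entries with an identical version count as shared here, since a different version is a different distfile.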
Even worse, crate tarballs negatively impact the security of Gentoo
users as they make it harder to audit ebuilds, and third-party crate
tarballs add a further distinct party that can inject malicious code.
Considering the recent supply chain attacks, this alone is a show-stopper.
Why is this warning suddenly necessary? Did a user run into an issue
caused by more than 300 entries?
- Flow
[1] https://public-inbox.gentoo.org/gentoo-dev/6ed0f286-f9eb-9e93-4fec-296646f79871@gentoo.org/
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: orbea @ 2025-01-13 13:23 UTC
To: gentoo-dev
On Mon, 13 Jan 2025 10:40:30 +0100
Florian Schmaus <flow@gentoo.org> wrote:
> On 12/01/2025 13.56, Michał Górny wrote:
> > Emit a QA warning suggesting the use of a crate tarball, when the
> > package in question uses 300 crates or more. Such long crate
> > lists cause ebuilds and Manifests to grow very fast, causing
> > significant space consumption on end user systems (including users
> > who are not using the package in question) and git history growth.
> > On top of that, fetching that many crates takes significant time.
> >
> > The number of 300 is pretty arbitrary, chosen approximately to match
> > Manifests that are over 100 KiB in size. We should probably look
> > into lowering it in the future, as more packages are transitioned.
> Thanks for your proposal. I know you wrote it because Gentoo is
> important to you.
>
> I am sorry, however, but the arbitrary limit you propose is harmful,
> and its necessity is questionable.
It's worth pointing out that this is already being done in Gentoo; see
dev-util/maturin for one example.
>
> It is unnecessary, at least in its current form, because the size
> growth of Gentoo's package repository is manageable. See the previous
> analysis for EGO_SUM [1].
>
> What is more worrisome, however, is that it is harmful.
>
> First, switching from individual crates to a single crate tarball
> disallows inter-package crate archive reuse. Often, users will
> already have the required crates downloaded because another installed
> package used them. With an artificial crate count limit, users must
> download rather large crate tarballs, causing unnecessary traffic and
> increasing the disk space on Gentoo's mirrors and end-user systems.
> The crate tarballs quickly eat away the saved disk space in the
> ebuild repository.
>
> Even worse, crate tarballs negatively impact the security of Gentoo
> users as they make it harder to audit ebuilds, and third-party crate
> tarballs add a further distinct party that can inject malicious code.
> Considering the recent supply chain attacks, this alone is a
> show-stopper.
>
> Why is this warning suddenly necessary? Did a user run into an issue
> caused by more than 300 entries?
>
> - Flow
>
> [1] https://public-inbox.gentoo.org/gentoo-dev/6ed0f286-f9eb-9e93-4fec-296646f79871@gentoo.org/
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Michał Górny @ 2025-01-13 13:36 UTC
To: gentoo-dev
On Mon, 2025-01-13 at 10:40 +0100, Florian Schmaus wrote:
> First, switching from individual crates to a single crate tarball
> disallows inter-package crate archive reuse. Often, users will already
> have the required crates downloaded because another installed package
> used them. With an artificial crate count limit, users must download
> rather large crate tarballs, causing unnecessary traffic and increasing
> the disk space on Gentoo's mirrors and end-user systems. The crate
> tarballs quickly eat away the saved disk space in the ebuild repository.
I'm sure you've also done a thorough analysis on how much crate reuse
actually happens, as well as of the impact of adding thousands of tiny
files to Gentoo mirrors, the inefficiency of fetching them one by one,
and especially how badly crates.io actually handles that.
I'm also sure you've done a thorough analysis of actual disk space use,
that also takes into consideration the space wasted by thousands of
tiny, inefficiently compressed files, compared to crate tarballs that
benefit both from a much stronger compression algorithm and from
the opportunity to process much larger data blocks.
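
The "larger data blocks" effect can be illustrated with a toy experiment like the one below. It is only a sketch on synthetic data, not a measurement of real crates: the files are generated to be nearly identical, and gzip is used on both sides so that only the effect of compressing one combined stream is visible (the stronger algorithm mentioned above is left out). The function name `compare_compression` is hypothetical.

```bash
# Toy comparison: N similar files gzipped one by one vs. as a single tar stream.
# Prints "<total bytes compressed separately> <bytes as one compressed tarball>".
compare_compression() {
	local n=${1:-50} dir i separate=0 combined
	dir=$(mktemp -d) || return 1
	i=1
	while [ "$i" -le "$n" ]; do
		# Synthetic content: mostly identical across files, one unique line.
		{ seq 1 500; echo "unique line $i"; } > "$dir/file$i.txt"
		separate=$(( separate + $(gzip -c < "$dir/file$i.txt" | wc -c) ))
		i=$(( i + 1 ))
	done
	# One archive: the compressor can reuse matches across file boundaries.
	combined=$(( $(tar -C "$dir" -cf - . | gzip -c | wc -c) ))
	rm -rf "$dir"
	echo "$separate $combined"
}

compare_compression 50
```

On data this repetitive the second number is much smaller than the first; how large the gap is for real crate sets depends on how much content the crates actually share.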
> Even worse, crate tarballs negatively impact the security of Gentoo
> users as they make it harder to audit ebuilds, and third-party crate
> tarballs add a further distinct party that can inject malicious code.
> Considering the recent supply chain attacks, this alone is a show-stopper.
`cargo audit` does not care about how crates are delivered to Gentoo
systems.
> Why is this warning suddenly necessary? Did a user run into an issue
> caused by more than 300 entries?
It is not "sudden". It is an ongoing effort.
--
Best regards,
Michał Górny
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Ionen Wolkens @ 2025-01-13 16:10 UTC
To: gentoo-dev
On Mon, Jan 13, 2025 at 05:23:54AM -0800, orbea wrote:
> On Mon, 13 Jan 2025 10:40:30 +0100
> Florian Schmaus <flow@gentoo.org> wrote:
>
> > On 12/01/2025 13.56, Michał Górny wrote:
> > > Emit a QA warning suggesting the use of a crate tarball, when the
> > > package in question uses 300 crates or more. Such long crate
> > > lists cause ebuilds and Manifests to grow very fast, causing
> > > significant space consumption on end user systems (including users
> > > who are not using the package in question) and git history growth.
> > > On top of that, fetching that many crates takes significant time.
> > >
> > > The number of 300 is pretty arbitrary, chosen approximately to match
> > > Manifests that are over 100 KiB in size. We should probably look
> > > into lowering it in the future, as more packages are transitioned.
> > Thanks for your proposal. I know you wrote it because Gentoo is
> > important to you.
> >
> > I am sorry, however, but the arbitrary limit you propose is harmful,
> > and its necessity is questionable.
>
> It's worth pointing out that this is already being done in Gentoo; see
> dev-util/maturin for one example.
ftr this is something I was planning to do either way, but kept
procrastinating given that the package needs special handling for
the crates used by its tests (it builds separate Rust packages
for its tests with their own crates). This just prompted me to
finally have a look before a potential warning hits.
--
ionen
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Florian Schmaus @ 2025-01-14 16:56 UTC
To: gentoo-dev, Michał Górny
On 13/01/2025 14.36, Michał Górny wrote:
> On Mon, 2025-01-13 at 10:40 +0100, Florian Schmaus wrote:
>> First, switching from individual crates to a single crate tarball
>> disallows inter-package crate archive reuse. Often, users will already
>> have the required crates downloaded because another installed package
>> used them. With an artificial crate count limit, users must download
>> rather large crate tarballs, causing unnecessary traffic and increasing
>> the disk space on Gentoo's mirrors and end-user systems. The crate
>> tarballs quickly eat away the saved disk space in the ebuild repository.
>
> I'm sure you've also done a thorough analysis on how much crate reuse
> actually happens, as well as of the impact of adding thousands of tiny
> files to Gentoo mirrors, the inefficiency of fetching them one by one,
> and especially how badly crates.io actually handles that.
>
> I'm also sure you've done a thorough analysis of actual disk space use,
> that also takes into consideration the space wasted by thousands of
> tiny, inefficiently compressed files, compared to crate tarballs that
> benefit both from a much stronger compression algorithm and from
> the opportunity to process much larger data blocks.
If you have numbers backing up the claimed adverse effects, please share
them. I have demonstrated my calculations regarding ::gentoo size growth
and its negligible effect.
I think I should *not* be the one to prove that your change is required.
It is the responsibility of the person suggesting the change.
>> Even worse, crate tarballs negatively impact the security of Gentoo
>> users as they make it harder to audit ebuilds, and third-party crate
>> tarballs add a further distinct party that can inject malicious code.
>> Considering the recent supply chain attacks, this alone is a show-stopper.
>
> `cargo audit` does not care about how crates are delivered to Gentoo
> systems.
I was referring to "detecting malicious modifications" as auditing. What
'cargo audit' does is unrelated to this.
>> Why is this warning suddenly necessary? Did a user run into an issue
>> caused by more than 300 entries?
>
> It is not "sudden". It is an ongoing effort.
It certainly feels like all of a sudden to me. At least, as far as I
understand, there is no trigger event or similar. I am sorry, but
instead, it appears that you have decided that today is the day when we
need this.
- Flow
* Re: [gentoo-dev] [PATCH] cargo.eclass: Emit a warning if the package uses 300+ crates
From: Michał Górny @ 2025-01-14 17:43 UTC
To: gentoo-dev
On Tue, 2025-01-14 at 17:56 +0100, Florian Schmaus wrote:
>
> It certainly feels like all of a sudden to me. At least, as far as I
> understand, there is no trigger event or similar. I am sorry, but
> instead, it appears that you have decided that today is the day when we
> need this.
I know it's hard to imagine but some of us aren't paid to work
on Gentoo, and have to earn our living + deal with other
responsibilities, so we do things when we find time to do them.
--
Best regards,
Michał Górny