Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] New distfile mirror layout
Date: Sun, 20 Oct 2019 06:51:42
Message-Id: 752be6c75f337df8ee8124a804247d2fb27e73b4.camel@gentoo.org
In Reply to: Re: [gentoo-dev] New distfile mirror layout by Joshua Kinard
1 On Sat, 2019-10-19 at 19:24 -0400, Joshua Kinard wrote:
2 > On 10/18/2019 09:41, Michał Górny wrote:
3 > > Hi, everybody.
4 > >
5 > > It is my pleasure to announce that yesterday (EU) evening we switched
6 > > to a new distfile mirror layout. Users will be switching to the new
7 > > layout either as they upgrade Portage to 2.3.77 or -- if they upgraded
8 > > already -- as their caches expire (24hrs).
9 > >
10 > > The new layout is mostly a bow towards mirror admins, for some of whom
11 > > having 60000+ files in a single directory has been a problem.
12 > > However, I suppose some of you also found e.g. the directory index
13 > > hardly usable due to its size.
14 > >
15 > > Throughout a transitional period (whose exact length hasn't been decided
16 > > yet), both layouts will be available. Afterwards, the old layout will
17 > > be removed from mirrors. This has a few implications:
18 > >
19 > > 1. Users who don't upgrade their package managers in time will lose
20 > > the ability to fetch from Gentoo mirrors. This shouldn't be that
21 > > much of a problem given that the core software needed to upgrade Portage
22 > > should all have reliable upstream SRC_URIs.
23 > >
24 > > 2. mirror://gentoo/file URIs will stop working. While technically you
25 > > could use mirror://gentoo/XX/file, I'd rather recommend finally
26 > > discarding its usage and moving distfiles to devspace.
27 > >
28 > > 3. Directly fetching files from distfiles.gentoo.org will become
29 > > a little harder. To fetch a distfile named 'foo-1.tar.gz', you'd have
30 > > to use something like:
31 > >
32 > > $ printf '%s' foo-1.tar.gz | b2sum | cut -c1-2
33 > > 1b
34 > > $ wget http://distfiles.gentoo.org/distfiles/1b/foo-1.tar.gz
35 > > ...
36 > >
37 > >
38 > > Alternatively, you can:
39 > >
40 > > $ wget http://distfiles.gentoo.org/distfiles/INDEX
41 > >
42 > > and grep for the right path there. This INDEX is also a more
43 > > lightweight alternative to HTML indexes generated by the servers.
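
A note on the INDEX lookup described above: a plain substring grep for 'foo-1.tar.gz' would also hit names such as 'xfoo-1.tar.gz'. The sketch below matches on the final path component only. It assumes that each INDEX line holds one path relative to the distfiles root (for example "1b/foo-1.tar.gz"); the message does not specify the format, so treat that as an assumption. The function name find_in_index is hypothetical.

```python
from typing import Iterable, List


def find_in_index(index_lines: Iterable[str], filename: str) -> List[str]:
    """Return INDEX entries whose last path component equals filename.

    ASSUMPTION: each line of INDEX holds one path relative to the
    distfiles root, e.g. "1b/foo-1.tar.gz". The real format may differ.
    """
    matches = []
    for line in index_lines:
        entry = line.strip()
        # Compare only the final component so "foo-1.tar.gz" does not
        # also match "xfoo-1.tar.gz" the way a bare substring grep would.
        if entry.rsplit("/", 1)[-1] == filename:
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Hypothetical INDEX excerpt; the two-character prefixes are made up.
    sample = ["1b/foo-1.tar.gz", "a3/xfoo-1.tar.gz", "7c/bar-2.tar.xz"]
    print(find_in_index(sample, "foo-1.tar.gz"))
```

In practice the lines would come from the downloaded INDEX file, e.g. by passing an open file object as index_lines.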
44 > >
45 > >
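
For scripted fetches, the b2sum pipeline in point 3 can be mirrored in Python. This is a minimal sketch, assuming the directory name is the first two hex digits of the default (512-bit) BLAKE2b digest of the file name, which is what "b2sum | cut -c1-2" yields. The helper names distfile_prefix and distfile_url are hypothetical and not part of Portage.

```python
import hashlib

# Base URL taken from the wget example in the message.
MIRROR_BASE = "http://distfiles.gentoo.org/distfiles"


def distfile_prefix(filename: str, hex_chars: int = 2) -> str:
    """Return the leading hex digits of the BLAKE2b digest of the file name.

    hashlib.blake2b defaults to a 64-byte digest, the same variant b2sum
    uses by default, so this mirrors:
        printf '%s' NAME | b2sum | cut -c1-2
    """
    digest = hashlib.blake2b(filename.encode("utf-8")).hexdigest()
    return digest[:hex_chars]


def distfile_url(filename: str, base: str = MIRROR_BASE) -> str:
    """Build the mirror URL for a distfile under the hashed layout."""
    return f"{base}/{distfile_prefix(filename)}/{filename}"


if __name__ == "__main__":
    print(distfile_url("foo-1.tar.gz"))
```

Note that only the file name is hashed, not the file contents, so the path can be computed before anything is downloaded.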
46 > > If you're interested in more background details and some plots, see [1].
47 > >
48 > > [1] https://dev.gentoo.org/~mgorny/articles/improving-distfile-mirror-structure.html
49 > >
50 >
51 > So the answer I didn't really see directly stated here is, where do new
52 > distfiles need to go //now//? E.g., if on woodpecker, I currently cp a
53 > distfile to /space/distfiles-local. What is the new directory I need to
54 > use? And if mirror://gentoo/${FOO} is going away, for the new distfiles
55 > target, what would be the applicable prefix to use?
56 >
57 > Directly using devspace seems like a bad idea, IMHO. Once long ago, we all
58 > got chastised for doing exactly that. Too much possibility of fragmentation
59 > as devs retire or package maintainership changes hands.
60
61 Today you get chastised for using /space/distfiles-local and not
62 following policy changes. The devmanual has listed it as deprecated
63 since at least 2011, and points to d.g.o instead [1].
64
65 > I looked at the whitepaper'ish-like writeup, and I kinda don't like using a
66 > hash-based naming scheme on the new distfiles layout. I really kind of prefer
67 > breaking the directories up based on the first letter of the distfiles in
68 > question, factoring case-sensitivity in (so you'd have 52 top-level
69 > directories for A-Z and a-z, plus 10 more for 0-9). Under each of those
70 > directories, additional subdirectories for the next few letters (say,
71 > letters 2-3). Yes, this leads to some orphan cases where a distfile might
72 > live on its own, but from a direct navigation standpoint, it's easy to find
73 > for someone browsing the distfiles server and easy to predict where a
74 > distfile is at.
75 >
76 > No math, statistical analysis, or deep-rooted knowledge of filesystems
77 > behind that paragraph. Just a plain old unfiltered opinion. Sometimes, I
78 > need to go get a distfile off the Gentoo mirrors, and being able to quickly
79 > find it in the mirror root is great. Having to do hash calculations to work
80 > out the file path will be *really* annoying.
81
82 Your solution still doesn't solve the problem of having 8k-24k files
83 in a single directory, even if you use 7 letters of prefix. So it just
84 creates a lot of tiny directory noise for no practical gain.
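
Anyone who wants to check this kind of claim against their own list of distfile names can count how many files each scheme puts in a directory. The following is only a sketch: the function names are hypothetical, the sample names are made up, and it does not reproduce the 8k-24k figures quoted above, which come from the real mirror contents.

```python
import hashlib
from collections import Counter
from typing import Callable, Iterable


def name_prefix(filename: str, letters: int = 1) -> str:
    """Directory key for a name-based layout: the first N characters."""
    return filename[:letters]


def hash_prefix(filename: str, hex_chars: int = 2) -> str:
    """Directory key for the hashed layout: leading BLAKE2b hex digits."""
    return hashlib.blake2b(filename.encode("utf-8")).hexdigest()[:hex_chars]


def bucket_sizes(filenames: Iterable[str], key: Callable[[str], str]) -> Counter:
    """Count how many files fall into each directory under a given key."""
    return Counter(key(name) for name in filenames)


if __name__ == "__main__":
    # Made-up sample: many files sharing a name prefix, as is common when
    # one language ecosystem contributes most of the distfiles.
    names = [f"python-{i}.tar.gz" for i in range(50)]
    names += [f"zlib-{i}.tar.gz" for i in range(5)]

    by_name = bucket_sizes(names, name_prefix)
    by_hash = bucket_sizes(names, hash_prefix)
    print("largest name-based directory:", max(by_name.values()))
    print("largest hash-based directory:", max(by_hash.values()))
```

With a real file list, raising the letters argument shows how far a name-based prefix has to grow before the largest directory shrinks.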
85
86 [1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts
87
88 --
89 Best regards,
90 Michał Górny
