On Sat, 2019-10-19 at 19:24 -0400, Joshua Kinard wrote:
> On 10/18/2019 09:41, Michał Górny wrote:
> > Hi, everybody.
> >
> > It is my pleasure to announce that yesterday (EU) evening we've switched
> > to a new distfile mirror layout.  Users will be switching to the new
> > layout either as they upgrade Portage to 2.3.77 or -- if they upgraded
> > already -- as their caches expire (24hrs).
> >
> > The new layout is mostly a bow towards mirror admins, for some of whom
> > having 60000+ files in a single directory has been a problem.
> > However, I suppose some of you also found e.g. the directory index
> > hardly usable due to its size.
> >
> > Throughout a transitional period (whose exact length hasn't been decided
> > yet), both layouts will be available.  Afterwards, the old layout will
> > be removed from mirrors.  This has a few implications:
> >
> > 1. Users who don't upgrade their package managers in time will lose
> > the ability to fetch from Gentoo mirrors.  This shouldn't be that
> > much of a problem given that the core software needed to upgrade Portage
> > should all have reliable upstream SRC_URIs.
> >
> > 2. mirror://gentoo/file URIs will stop working.  While technically you
> > could use mirror://gentoo/XX/file, I'd rather recommend finally
> > discarding its usage and moving distfiles to devspace.
> >
> > 3. Directly fetching files from distfiles.gentoo.org will become
> > a little harder.  To fetch a distfile named 'foo-1.tar.gz', you'd have
> > to use something like:
> >
> > $ printf '%s' foo-1.tar.gz | b2sum | cut -c1-2
> > 1b
> > $ wget http://distfiles.gentoo.org/distfiles/1b/foo-1.tar.gz
> > ...
> >
> >
> > Alternatively, you can:
> >
> > $ wget http://distfiles.gentoo.org/distfiles/INDEX
> >
> > and grep for the right path there.  This INDEX is also a more
> > lightweight alternative to HTML indexes generated by the servers.
> >
> >
> > If you're interested in more background details and some plots, see [1].
> >
> > [1] https://dev.gentoo.org/~mgorny/articles/improving-distfile-mirror-structure.html
> >
>
> So the answer I didn't really see directly stated here is, where do new
> distfiles need to go //now//?  E.g., if on woodpecker, I currently cp a
> distfile to /space/distfiles-local.  What is the new directory I need to
> use?  And if mirror://gentoo/${FOO} is going away, for the new distfiles
> target, what would be the applicable prefix to use?
>
> Directly using devspace seems like a bad idea, IMHO.  Once long ago, we all
> got chastised for doing exactly that.  Too much possibility of fragmentation
> as devs retire or package maintainership changes hands.

Today you get chastised for using /space/distfiles-local and not
following policy changes.  The devmanual states that it has been
deprecated since at least 2011, and talks of using d.g.o [1].

> I looked at the whitepaper'ish-like writeup, and I kinda don't like using a
> hash-based naming scheme on the new distfiles layout.  I really kind of prefer
> breaking the directories up based on the first letter of the distfiles in
> question, factoring case-sensitivity in (so you'd have 52 top-level
> directories for A-Z and a-z, plus 10 more for 0-9).  Under each of those
> directories, additional subdirectories for the next few letters (say,
> letters 2-3).  Yes, this leads to some orphan cases where a distfile might
> live on its own, but from a direct navigation standpoint, it's easy to find
> for someone browsing the distfiles server and easy to predict where a
> distfile is at.
>
> No math, statistical analysis, or deep-rooted knowledge of filesystems
> behind that paragraph.  Just a plain old unfiltered opinion.  Sometimes, I
> need to go get a distfile off the Gentoo mirrors, and being able to quickly
> find it in the mirror root is great.  Having to do hash calculations to work
> out the file path will be *really* annoying.
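
If typing the hash pipeline by hand is the annoying part, it can be
wrapped in a small shell function.  This is only a sketch:
distfile_url is a made-up name (nothing Portage ships), and it assumes
b2sum from coreutils plus the distfiles.gentoo.org/distfiles/XX/file
layout from the announcement quoted above.

```shell
# Hypothetical helper (not part of Portage): print the mirror URL
# for a distfile, given its file name as the first argument.
distfile_url() {
    name=$1
    # The directory is the first two hex digits of the BLAKE2b hash
    # of the file *name* (not of the file contents).
    prefix=$(printf '%s' "$name" | b2sum | cut -c1-2)
    printf 'http://distfiles.gentoo.org/distfiles/%s/%s\n' "$prefix" "$name"
}
```

Fetching a file is then a one-liner:

$ wget "$(distfile_url foo-1.tar.gz)"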

Your solution still doesn't solve the problem of having 8k-24k files
in a single directory, even if you use 7 letters of prefix.  So it just
creates a lot of tiny directory noise for no practical gain.
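
If you want to see how a name-prefix layout would spread files on your
own system, something like the following counts how many names share
each prefix.  Again just a sketch: prefix_bucket_sizes is a made-up
helper, it reads one file name per line on stdin, and the DISTDIR path
in the usage line is an assumption about your setup.

```shell
# Hypothetical helper: read file names on stdin and print the five
# most crowded directories a name-prefix layout would produce.
# $1 is the prefix length in characters (defaults to 1).
prefix_bucket_sizes() {
    cut -c1-"${1:-1}" |   # keep only the first N characters of each name
        sort |            # group identical prefixes together
        uniq -c |         # count the names in each group
        sort -rn |        # largest groups first
        head -n 5
}
```

For example, with a three-letter prefix:

$ ls /var/cache/distfiles | prefix_bucket_sizes 3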

[1] https://devmanual.gentoo.org/general-concepts/mirrors/index.html#suitable-download-hosts

-- 
Best regards,
Michał Górny