On 10/20/2019 02:51, Michał Górny wrote:
> On Sat, 2019-10-19 at 19:24 -0400, Joshua Kinard wrote:
>> On 10/18/2019 09:41, Michał Górny wrote:
>>> Hi, everybody.
>>>
>>> It is my pleasure to announce that yesterday (EU) evening we've switched
>>> to a new distfile mirror layout. Users will be switching to the new
>>> layout either as they upgrade Portage to 2.3.77 or -- if they upgraded
>>> already -- as their caches expire (24hrs).
>>>
>>> The new layout is mostly a bow towards mirror admins, for some of whom
>>> having 60000+ files in a single directory has been a problem.
>>> However, I suppose some of you also found e.g. the directory index
>>> hardly usable due to its size.
>>>
>>> Throughout a transitional period (whose exact length hasn't been decided
>>> yet), both layouts will be available. Afterwards, the old layout will
>>> be removed from mirrors. This has a few implications:
>>>
>>> 1. Users who don't upgrade their package managers in time will lose
>>> the ability to fetch from Gentoo mirrors. This shouldn't be that
>>> much of a problem given that the core software needed to upgrade Portage
>>> should all have reliable upstream SRC_URIs.
>>>
>>> 2. mirror://gentoo/file URIs will stop working. While technically you
>>> could use mirror://gentoo/XX/file, I'd rather recommend finally
>>> discarding its usage and moving distfiles to devspace.
>>>
>>> 3. Directly fetching files from distfiles.gentoo.org will become
>>> a little harder. To fetch a distfile named 'foo-1.tar.gz', you'd have
>>> to use something like:
>>>
>>> $ printf '%s' foo-1.tar.gz | b2sum | cut -c1-2
>>> 1b
>>> $ wget http://distfiles.gentoo.org/distfiles/1b/foo-1.tar.gz
>>> ...
>>> |
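
For anyone who would rather not pipe through b2sum by hand, here is a
minimal Python sketch of the same lookup. It assumes the directory name is
the first two hex digits of the BLAKE2b-512 digest of the filename, which is
what the pipeline above computes. hash_prefix and distfile_url are made-up
helper names, not anything Portage provides.

```python
import hashlib

# Base URL taken from the wget example above.
BASE_URL = "http://distfiles.gentoo.org/distfiles"


def hash_prefix(filename: str, length: int = 2) -> str:
    """Return the first `length` hex digits of the BLAKE2b digest of the name.

    hashlib.blake2b defaults to a 64-byte digest, which matches the default
    output size of b2sum. The name is hashed without a trailing newline,
    mirroring printf '%s'.
    """
    return hashlib.blake2b(filename.encode("utf-8")).hexdigest()[:length]


def distfile_url(filename: str) -> str:
    """Hypothetical helper: build the new-layout URL for a distfile name."""
    return f"{BASE_URL}/{hash_prefix(filename)}/{filename}"


if __name__ == "__main__":
    print(distfile_url("foo-1.tar.gz"))
```

The prefix length is a parameter only to keep the sketch flexible; the
announcement itself uses two characters.
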
>>> Alternatively, you can:
>>>
>>> $ wget http://distfiles.gentoo.org/distfiles/INDEX
>>>
>>> and grep for the right path there. This INDEX is also a more
>>> lightweight alternative to HTML indexes generated by the servers.
>>>
>>> If you're interested in more background details and some plots, see [1].
>>>
>>> [1] https://dev.gentoo.org/~mgorny/articles/improving-distfile-mirror-structure.html
>>>
>>
>> So the answer I didn't really see directly stated here is, where do new
>> distfiles need to go //now//? E.g., if on woodpecker, I currently cp a
>> distfile to /space/distfiles-local. What is the new directory I need to
>> use? And if mirror://gentoo/${FOO} is going away, for the new distfiles
>> target, what would be the applicable prefix to use?
>>
>> Directly using devspace seems like a bad idea, IMHO. Once long ago, we all
>> got chastised for doing exactly that. Too much possibility of fragmentation
>> as devs retire or package maintainership changes hands.
>
> Today you get chastised for using /space/distfiles-local and not
> following policy changes. The devmanual states that it's deprecated
> since at least 2011, and talks of using d.g.o [1].
|
I don't recall this change being added as far back as 2011. Maybe my memory
is bad, but if it was done that long ago, it was done quietly, and it was
not enforced. I checked my local mailing list archives for gentoo-dev and
don't see any mention of distfiles-local being deprecated back then. Why
has it taken 8 years for this to get addressed?

In any event, I still think using devspace is a bad idea. A centralized
distfiles repo is what most other distros use, and it's what we should use.
|
>> I looked at the whitepaper'ish-like writeup, and I kinda don't like using a
>> hash-based naming scheme on the new distfiles layout. I really kind of prefer
>> breaking the directories up based on the first letter of the distfiles in
>> question, factoring case-sensitivity in (so you'd have 52 top-level
>> directories for A-Z and a-z, plus 10 more for 0-9). Under each of those
>> directories, additional subdirectories for the next few letters (say,
>> letters 2-3). Yes, this leads to some orphan cases where a distfile might
>> live on its own, but from a direct navigation standpoint, it's easy to find
>> for someone browsing the distfiles server and easy to predict where a
>> distfile is at.
>>
>> No math, statistical analysis, or deep-rooted knowledge of filesystems
>> behind that paragraph. Just a plain old unfiltered opinion. Sometimes, I
>> need to go get a distfile off the Gentoo mirrors, and being able to quickly
>> find it in the mirror root is great. Having to do hash calculations to work
>> out the file path will be *really* annoying.
>
> Your solution still doesn't solve the problem of having 8k-24k files
> in a single directory, even if you use 7 letters of prefix. So it just
> creates a lot of tiny directory noise for no practical gain.
|
Why is having a maximum of ~24k files in a directory a bad idea? Modern
filesystems are more than capable of handling that.

- ext4: no fixed cap on files per directory (with dir_index, roughly
  10 million entries before large_dir is needed)
- xfs: virtually unlimited (hard limit of 2^64-1 total files per volume)
- ntfs: 4,294,967,295 (per volume)

And 24k is a bit more than a third of all distfiles that we currently have.
Under which scenario do you wind up with 24k files in a single directory? I
consider the tex package an outlier in this case (one package should not be
the sole dictator of policy). |
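
If anyone wants to put numbers on that, here is a rough Python sketch that
counts the fullest directory under a given layout. The two layout functions
are my reading of the schemes discussed above (first character plus
characters 2-3 of the name, versus a two-hex-digit BLAKE2b prefix), all
function names are made up for illustration, and the filenames are
placeholders rather than real mirror contents. Feed it the real list of
distfile names to get a meaningful answer.

```python
import hashlib
from collections import Counter


def name_prefix_dir(filename: str) -> str:
    """Name-based layout: first character, then characters 2-3 as a subdir.

    Assumes a non-empty filename; very short names get a short subdir.
    """
    return f"{filename[0]}/{filename[1:3]}"


def hash_prefix_dir(filename: str) -> str:
    """Hash-based layout: first two hex digits of BLAKE2b of the filename."""
    return hashlib.blake2b(filename.encode("utf-8")).hexdigest()[:2]


def fullest_directory(filenames, layout):
    """Return (directory, file_count) for the most crowded directory."""
    counts = Counter(layout(name) for name in filenames)
    return counts.most_common(1)[0]


if __name__ == "__main__":
    # Placeholder data: 1000 invented names sharing one prefix, plus two others.
    names = [f"texlive-module-{i}.tar.xz" for i in range(1000)]
    names += ["foo-1.tar.gz", "bar-2.0.tar.bz2"]

    print(fullest_directory(names, name_prefix_dir))  # ('t/ex', 1000)
    print(fullest_directory(names, hash_prefix_dir))
```

With the name-based layout, every file sharing its first three characters
lands in the same directory, so the result depends entirely on how skewed
the real names are. That is exactly the figure I would like to see for the
tree without the tex distfiles.
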
|
-- 
Joshua Kinard
Gentoo/MIPS
kumba@g.o
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic