On 10/20/2019 04:32, Michał Górny wrote:
> On Sun, 2019-10-20 at 04:25 -0400, Joshua Kinard wrote:
>> On 10/20/2019 02:51, Michał Górny wrote:
>>> On Sat, 2019-10-19 at 19:24 -0400, Joshua Kinard wrote:
>>>> On 10/18/2019 09:41, Michał Górny wrote:
>>>>> Hi, everybody.
>>>>>
>>>>> It is my pleasure to announce that yesterday (EU) evening we switched
>>>>> to a new distfile mirror layout. Users will be switching to the new
>>>>> layout either as they upgrade Portage to 2.3.77 or -- if they upgraded
>>>>> already -- as their caches expire (24hrs).
>>>>>
>>>>> The new layout is mostly a bow towards mirror admins, for some of whom
>>>>> having 60000+ files in a single directory has been a problem.
>>>>> However, I suppose some of you also found e.g. the directory index
>>>>> hardly usable due to its size.
>>>>>
>>>>> Throughout a transitional period (whose exact length hasn't been decided
>>>>> yet), both layouts will be available. Afterwards, the old layout will
>>>>> be removed from mirrors. This has a few implications:
>>>>>
>>>>> 1. Users who don't upgrade their package managers in time will lose
>>>>> the ability to fetch from Gentoo mirrors. This shouldn't be that
>>>>> much of a problem given that the core software needed to upgrade Portage
>>>>> should all have reliable upstream SRC_URIs.
>>>>>
>>>>> 2. mirror://gentoo/file URIs will stop working. While technically you
>>>>> could use mirror://gentoo/XX/file, I'd rather recommend finally
>>>>> discarding its usage and moving distfiles to devspace.
>>>>>
>>>>> 3. Directly fetching files from distfiles.gentoo.org will become
>>>>> a little harder. To fetch a distfile named 'foo-1.tar.gz', you'd have
>>>>> to use something like:
>>>>>
>>>>>     $ printf '%s' foo-1.tar.gz | b2sum | cut -c1-2
>>>>>     1b
>>>>>     $ wget http://distfiles.gentoo.org/distfiles/1b/foo-1.tar.gz
>>>>>     ...
>>>>>
>>>>>
>>>>> Alternatively, you can:
>>>>>
>>>>>     $ wget http://distfiles.gentoo.org/distfiles/INDEX
>>>>>
>>>>> and grep for the right path there. This INDEX is also a more
>>>>> lightweight alternative to HTML indexes generated by the servers.
>>>>>
>>>>>
>>>>> If you're interested in more background details and some plots, see [1].
>>>>>
>>>>> [1] https://dev.gentoo.org/~mgorny/articles/improving-distfile-mirror-structure.html
>>>>>
>>>>
>>>> So the answer I didn't really see directly stated here is, where do new
>>>> distfiles need to go //now//? E.g., if on woodpecker, I currently cp a
>>>> distfile to /space/distfiles-local. What is the new directory I need to
>>>> use? And if mirror://gentoo/${FOO} is going away, for the new distfiles
>>>> target, what would be the applicable prefix to use?
>>>>
>>>> Directly using devspace seems like a bad idea, IMHO. Once long ago, we all
>>>> got chastised for doing exactly that. Too much possibility of fragmentation
>>>> as devs retire or package maintainership changes hands.
>>>
>>> Today you get chastised for using /space/distfiles-local and not
>>> following policy changes. The devmanual states that it's been deprecated
>>> since at least 2011, and talks of using d.g.o [1].
>>
>> I don't recall this change being added as far back as 2011. Maybe my memory
>> is bad, but if it was done that long ago, it was done quietly, and it was
>> not enforced. I checked my local mailing list archives for gentoo-dev and
>> don't see any mention of distfiles-local being deprecated back then. Why
>> has it taken 8 years for this to get addressed?
>
> Don't ask me. I think I was already taught to use d.g.o back when I was
> recruited.
>
>> In any event, I still think using devspace is a bad idea. A centralized
>> distfiles repo is what most other distros use, and it's what we should use.
>
> Talking doesn't make things happen. Coming up with good proposals that
> address all the problems (e.g. those listed in devmanual) does.
|
Proposing changes when a direction has already been decided, the rudder
position changed, and engines put to full power is equally pointless.
You're the de facto captain of this ship lately. I expect you to not rock
the boat too hard. This change is a pretty hard jolt, IMHO.
|
>>>> I looked at the whitepaper'ish-like writeup, and I kinda don't like using a
>>>> hash-based naming scheme on the new distfiles layout. I really kinda prefer
>>>> breaking the directories up based on the first letter of the distfiles in
>>>> question, factoring case-sensitivity in (so you'd have 52 top-level
>>>> directories for A-Z and a-z, plus 10 more for 0-9). Under each of those
>>>> directories, additional subdirectories for the next few letters (say,
>>>> letters 2-3). Yes, this leads to some orphan cases where a distfile might
>>>> live on its own, but from a direct navigation standpoint, it's easy to find
>>>> for someone browsing the distfiles server and easy to predict where a
>>>> distfile is at.
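To make the letter-based idea concrete, here is a rough shell sketch of how
a path would be derived under it. The function name letter_path is made up
for illustration, the "letters 2-3" split is just the example figure from
the paragraph above, and names shorter than two characters are not handled.

```shell
# Hypothetical helper: map a distfile name to <1st char>/<chars 2-3>/<name>.
# Case is preserved, so "Foo" and "foo" land under different top-level dirs.
letter_path() {
    name=$1
    first=$(printf '%s' "$name" | cut -c1)
    next=$(printf '%s' "$name" | cut -c2-3)
    printf '%s/%s/%s\n' "$first" "$next" "$name"
}
```

With this, letter_path foo-1.tar.gz prints f/oo/foo-1.tar.gz, which can be
worked out by eye without running anything.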
>>>>
>>>> No math, statistical analysis, or deep-rooted knowledge of filesystems
>>>> behind that paragraph. Just a plain old unfiltered opinion. Sometimes, I
>>>> need to go get a distfile off the Gentoo mirrors, and being able to quickly
>>>> find it in the mirror root is great. Having to do hash calculations to work
>>>> out the file path will be *really* annoying.
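The hash lookup from item 3 of the announcement can at least be wrapped in
a small shell function. This is only a sketch: the name distfile_url is made
up, and it assumes b2sum (coreutils) is installed and that the two-character
prefix quoted in the announcement is the whole scheme.

```shell
# Hypothetical helper: build the new-layout URL for a distfile name.
# Assumes the prefix is the first two hex digits of the BLAKE2b hash
# of the file *name* (not its contents), as in the quoted example.
distfile_url() {
    name=$1
    prefix=$(printf '%s' "$name" | b2sum | cut -c1-2)
    printf 'http://distfiles.gentoo.org/distfiles/%s/%s\n' "$prefix" "$name"
}
```

Something like wget "$(distfile_url foo-1.tar.gz)" would then fetch the file
without computing the prefix by hand.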
>>>
>>> Your solution still doesn't solve the problem of having 8k-24k files
>>> in a single directory, even if you use 7 letters of prefix. So it just
>>> creates a lot of tiny directory noise for no practical gain.
>>
>> Why is having a max of ~24k files in a directory a bad idea? Modern
>> filesystems are more than capable of handling that.
>>
>> - ext4: unlimited files in a directory
>> - xfs: virtually unlimited (hard limit of 2^64-1 total files per volume)
>> - ntfs: 4,294,967,295
>>
>> And 24k is a bit more than 1/3rd of all distfiles that we currently have.
>
> For the same reason having ~60k files in a directory was a problem.
> There is really no point in changing anything if you change BIG_NUMBER
> to SMALLER_BIG_NUMBER.
|
That doesn't answer my question. Why is it a problem? What criteria are
you using to decide that 24k is a "smaller big number"? Is there some issue
highlighted by the mirror admins where having 24k files in a single
directory offers no significant relief versus the current 60k files?
|
>> Under which scenario do you wind up with 24k files in a single directory? I
>> consider the tex package an outlier in this case (one package should not be
>> the sole dictator of policy).
>
> Three versions of TeXLive living simultaneously. If one package falls
> completely out of bounds, no problem is solved by the change, so what's
> the point of making it?
|
The problem in this case is with texlive, not our current, or future,
distfiles methodology. Has anyone looked at how other distros deal with
texlive? Has anyone complained or filed a bug to texlive developers
upstream about their excessive number of distfiles and the burden it places
on distro maintainers?
|
--
Joshua Kinard
Gentoo/MIPS
kumba@g.o
rsa6144/5C63F4E3F5C6C943 2015-04-27
177C 1972 1FB8 F254 BAD0 3E72 5C63 F4E3 F5C6 C943

"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."

--Emperor Turhan, Centauri Republic