1 |
Hi, everyone. |
2 |
|
3 |
Here's a little new something we've been silently debating back |
4 |
in the day, then forgotten about it, then I've written a GLEP about it. |
5 |
The number's not official yet. |
6 |
|
7 |
HTML (with plots): https://dev.gentoo.org/~mgorny/tmp/glep-0075.html |
8 |
|
9 |
--- |
10 |
GLEP: 75 |
11 |
Title: Split distfile mirror directory structure |
12 |
Author: |
13 |
Michał Górny <mgorny@g.o>, |
14 |
Robin H. Johnson <robbat2@gento |
15 |
o.org> |
16 |
Type: Standards Track |
17 |
Status: Draft |
18 |
Version: 1 |
19 |
Created: 2018-01-26 |
20 |
Las |
21 |
t-Modified: 2018-01-27 |
22 |
Post-History: 2018-01-27 |
23 |
Content-Type: text/x-rst |
24 |
-- |
25 |
- |
26 |
|
27 |
Abstract |
28 |
======== |
29 |
This GLEP describes the procedure for splitting the distfiles on mirrors |
30 |
into multiple directories with the goal of reducing the number of files |
31 |
in a single directory. |
32 |
|
33 |
|
34 |
Motivation |
35 |
========== |
36 |
At the moment, both the package manager and Gentoo mirrors use flat |
37 |
directory structure to store files. While this solution usually works, |
38 |
it does not scale well. Directories with large number of files usually |
39 |
have significant performance penalty, unless using filesystems |
40 |
specifically designed for that purpose. |
41 |
|
42 |
According to the Gentoo repository state at 2018-01-26 16:23, there |
43 |
was a total of 62652 unique distfiles in the repository. While |
44 |
the users realistically hit around 10% of that, distfile mirrors often |
45 |
hold even more files --- more so if old distfiles are not wiped |
46 |
immediately. |
47 |
|
48 |
While all filesystems used on Linux boxes should be able to cope with |
49 |
a number that large, they may suffer a performance penalty with even |
50 |
a few thousand files. Additionally, if mirrors enable directory indexes |
51 |
then generating the index imposes both a significant server overhead |
52 |
and a significant data transfer. At this moment, the index |
53 |
of distfiles.gentoo.org has around 17 MiB. |
54 |
|
55 |
Splitting the distfiles into multiple directories makes it possible |
56 |
to avoid those problems by reducing the number of files in a single |
57 |
directory. For example, splitting the forementioned set of distfiles |
58 |
into 16 directories that are roughly balanced allows to reduce |
59 |
the number of files in a single directory to around 4000. Splitting |
60 |
them further into 256 directories (16x16) results in 200-300 files |
61 |
per directory which should avoid any performance problems long-term, |
62 |
even assuming 300% growth of number of distfiles. |
63 |
|
64 |
|
65 |
Specification |
66 |
============= |
67 |
Mirror layout file |
68 |
------------------ |
69 |
A mirror adhering to this specification should include a ``layout.conf`` |
70 |
file in the top distfile directory. This file uses the format |
71 |
derived from the freedesktop Desktop Entry Specification file format |
72 |
[#DESKTOP_FORMAT]_. |
73 |
|
74 |
Before using each Gentoo mirror, the package manager should attempt |
75 |
to fetch (update) its ``layout.conf`` file and process it to determine |
76 |
how to use the mirror. If the file is not present, the package manager |
77 |
should behave as if it were empty. |
78 |
|
79 |
The package manager should recognize the sections and keys listed below. |
80 |
It should ignore any unrecognized sections or keys --- the format |
81 |
is intended to account for future extensions. |
82 |
|
83 |
This specification currently defines one section: ``[structure]``. |
84 |
This section defines one or more repository structure definitions |
85 |
using sequential integer keys. The definition keyed as ``0`` |
86 |
is the most preferred structure. The package manager should use |
87 |
the first structure format it recognizes as supported, and ignore any |
88 |
it does not recognize. If this section is not present, the package |
89 |
manager should behave as if only ``flat`` structure were supported. |
90 |
|
91 |
The following structure definitions are supported: |
92 |
|
93 |
* ``flat`` to indicate the traditional flat structure where all |
94 |
distfiles are located in the top directory, |
95 |
|
96 |
* ``filename-hash <algorithm> <cutoffs>`` to indicate the `filename |
97 |
hash structure`_ explained below. |
98 |
|
99 |
|
100 |
Filename hash structure |
101 |
----------------------- |
102 |
When using the filename hash structure, the distfiles are split |
103 |
into directories whose names are derived from the hash of distfile |
104 |
filename. This structure has two parameters: *algorithm name* |
105 |
and *cutoffs* list. |
106 |
|
107 |
The algorithm name must correspond to a valid Manifest hash name. |
108 |
An informational list of hashes is included in GLEP 74 [#GLEP74]_, |
109 |
and the policies for introducing new hashes are covered by GLEP 59 |
110 |
[#GLEP59]_. |
111 |
|
112 |
The cutoffs list specifies one or more integers separated by colons |
113 |
(``:``), indicating the number of bits (starting with the most |
114 |
significant bit) of the hash used to form subsequent subdirectory names. |
115 |
For example, the list of ``2:4`` would indicate that top-level directory |
116 |
names are formed using 2 most significant bits of the hash (resulting |
117 |
in 2² = 4 directories), and each of this directories would have |
118 |
subdirectories formed using the next 4 bits of the hash (resulting |
119 |
in 2⁴ = 8 subdirectories each). |
120 |
|
121 |
The exact algorithm for determining the distfile location follows: |
122 |
|
123 |
1. Let the distfile filename be **F**. |
124 |
|
125 |
2. Compute the hash of **F** and store its binary value as **H**. |
126 |
|
127 |
3. For each integer **C** in cutoff list: |
128 |
|
129 |
a. Take **C** most significant bits of **H** and store them as **V**. |
130 |
|
131 |
b. Convert **V** into hexadecimal integer, left padded with zeros |
132 |
to **C/4** digits (rounded up) and append it to the path, followed |
133 |
by the path separator. |
134 |
|
135 |
c. Shift **H** left **C** bits. |
136 |
|
137 |
4. Finally, append **F** to the obtained path. |
138 |
|
139 |
In particular, note that when using nested directories |
140 |
the subdirectories do not repeat the hash bits used in parent directory. |
141 |
|
142 |
|
143 |
Migrating mirrors to the hashed structure |
144 |
----------------------------------------- |
145 |
Since all distfile mirrors sync to the master Gentoo mirror, it should |
146 |
be enough to perform all the needed changes on the master mirror |
147 |
and wait for other mirrors to sync. The following procedure |
148 |
is recommended: |
149 |
|
150 |
1. Include the initial ``layout.conf`` listing only ``flat`` layout. |
151 |
|
152 |
2. Create the new structure alongside the flat structure. Wait for |
153 |
mirrors to sync. |
154 |
|
155 |
3. Once all mirrors receive the new structure, update ``layout.conf`` |
156 |
to list the ``filename-hash`` structure. |
157 |
|
158 |
4. Once a version of Portage supporting the new structure is stable long |
159 |
enough, remove the fallback ``flat`` structure from ``layout.conf`` |
160 |
and duplicate distfiles. |
161 |
|
162 |
This implies that during the migration period the distfiles will |
163 |
be stored duplicated on the mirrors and therefore will occupy twice |
164 |
as much space. Technically, this could be avoided either by using |
165 |
hard links or symbolic links. |
166 |
|
167 |
The hard link solution allows us to save space on the master mirror. |
168 |
Additionally, if ``-H`` option is used by the mirrors it avoids |
169 |
transferring existing files again. However, this option is known |
170 |
to be expensive and could cause significant server load. Without it, |
171 |
all mirrors need to transfer a second copy of all the existing files. |
172 |
|
173 |
The symbolic link solution could be more reliable if we could rely |
174 |
on mirrors using the ``--links`` rsync option. Without that, symbolic |
175 |
links are not transferred at all. |
176 |
|
177 |
|
178 |
Using hashed structure for local distfiles |
179 |
------------------------------------------ |
180 |
The hashed structure defined above could also be used for local distfile |
181 |
storage as used by the package manager. For this to work, the package |
182 |
manager authors need to ensure that: |
183 |
|
184 |
a. The ``${DISTDIR}`` variable in the ebuild scope points to a temporary |
185 |
directory where distfiles specific to the package are linked |
186 |
in a flat structure. |
187 |
|
188 |
b. All tools are updated to support the nested structure. |
189 |
|
190 |
c. The package manager provides a tool for users to easily manipulate |
191 |
distfiles, in particular to add distfiles for fetch-restricted |
192 |
packages into an appropriate subdirectory. |
193 |
|
194 |
For extended compatibility, the package manager may support finding |
195 |
distfiles in flat and nested structure simultaneously. |
196 |
|
197 |
|
198 |
Rationale |
199 |
========= |
200 |
Algorithm for splitting distfiles |
201 |
--------------------------------- |
202 |
In the original debate that occurred in bug #534528 [#BUG534528]_, |
203 |
three possible solutions for splitting distfiles were listed: |
204 |
|
205 |
a. using initial portion of filename, |
206 |
|
207 |
b. using initial portion of file hash, |
208 |
|
209 |
c. using initial portion of filename hash. |
210 |
|
211 |
The significant advantage of the filename option was simplicity. With |
212 |
that solution, the users could easily determine the correct subdirectory |
213 |
themselves. However, it's significant disadvantage was very uneven |
214 |
shuffling of data. In particular, the TeΧ Live packages alone count |
215 |
almost 23500 distfiles and all use a common prefix, making it impossible |
216 |
to split them further. |
217 |
|
218 |
The alternate option of using file hash has the advantage of having |
219 |
a more balanced split. Furthermore, since hashes are stored |
220 |
in Manifests using them is zero-cost. However, this solution has two |
221 |
significant disadvantages: |
222 |
|
223 |
1. The hash values are unknown for newly-downloaded distfiles, so |
224 |
``repoman`` (or an equivalent tool) would have to use a temporary |
225 |
directory before locating the file in appropriate subdirectory. |
226 |
|
227 |
2. User-provided distfiles (e.g. for fetch-restricted packages) with |
228 |
hash mismatches would be placed in the wrong subdirectory, |
229 |
potentially causing confusing errors. |
230 |
|
231 |
Using filename hashes has proven to provide a similar balance |
232 |
to using file hashes. Furthermore, since filenames are known up front |
233 |
this solution does not suffer from the both listed problems. While |
234 |
hashes need to be computed manually, hashing short string should not |
235 |
cause any performance problems. |
236 |
|
237 |
.. figure:: glep-0075-extras/by-filename.png |
238 |
|
239 |
Distribution of distfiles by first character of filenames |
240 |
|
241 |
.. figure:: glep-0075-extras/by-csum.png |
242 |
|
243 |
Distribution of distfiles by first hex-digit of checksum |
244 |
(x --- content checksum, + --- filename checksum) |
245 |
|
246 |
.. figure:: glep-0075-extras/by-csum2.png |
247 |
|
248 |
Distribution of distfiles by two first hex-digits of checksum |
249 |
(x --- content checksum, + --- filename checksum) |
250 |
|
251 |
|
252 |
Layout file |
253 |
----------- |
254 |
The presence of control file has been suggested in the original |
255 |
discussion. Its main purpose is to let package managers cleanly handle |
256 |
the migration and detect how to correctly query the mirrors throughout |
257 |
it. Furthermore, it makes future changes easier. |
258 |
|
259 |
The format lines specifically mean to hardcode as little about |
260 |
the actual algorithm as possible. Therefore, we can easily change |
261 |
the hash used or the exact split structure without having to update |
262 |
the package managers or even provide a compatibility layout. |
263 |
|
264 |
The file is also open for future extensions to provide additional mirror |
265 |
metadata. However, no clear use for that has been determined so far. |
266 |
|
267 |
|
268 |
Hash algorithm |
269 |
-------------- |
270 |
The hash algorithm support is fully deferred to the existing code |
271 |
in the package managers that is required to handle Manifests. |
272 |
In particular, it is recommended to reuse one of the hashes that are |
273 |
used in Manifest entries at the time. This avoids code duplication |
274 |
and reuses an existing mechanism to handle hash upgrades. |
275 |
|
276 |
During the discussion, it has been pointed that this particular use case |
277 |
does not require a cryptographically strong hash and a faster algorithm |
278 |
could be used instead. However, given the short length of hashed |
279 |
strings performance is not a problem, and speed does not justify |
280 |
the resulting code duplication. |
281 |
|
282 |
It has also been pointed out that e.g. the BLAKE2 hash family provides |
283 |
the ability of creating arbitrary length hashes instead of truncating |
284 |
the standard-length hash. However, not all implementations of BLAKE2 |
285 |
support that and relying on it could reduce portability for no apparent |
286 |
gain. |
287 |
|
288 |
|
289 |
Backwards Compatibility |
290 |
======================= |
291 |
Mirror compatibility |
292 |
-------------------- |
293 |
The mirrored files are propagated to other mirrors as opaque directory |
294 |
structure. Therefore, there are no backwards compatibility concerns |
295 |
on the mirroring side. |
296 |
|
297 |
Backwards compatibility with existing clients is detailed |
298 |
in `migrating mirrors to the hashed structure`_ section. Backwards |
299 |
compatibility with the old clients will be provided by preserving |
300 |
the flat structure during the transitional period. |
301 |
|
302 |
The new clients will fetch the ``layout.conf`` file to avoid backwards |
303 |
compatibility concerns in the future. In case of hitting an old mirror, |
304 |
the package manager will default to the ``flat`` structure. |
305 |
|
306 |
|
307 |
Package manager storage compatibility |
308 |
------------------------------------- |
309 |
The exact means of preserving backwards compatibility in package manager |
310 |
storage are left to the package manager authors. However, it is |
311 |
recommended that package managers continue to support the flat layout |
312 |
even if it is no longer the default. The package manager may either |
313 |
continue to read files from this location or automatically move them |
314 |
to an appropriate subdirectory. |
315 |
|
316 |
|
317 |
Reference Implementation |
318 |
======================== |
319 |
TODO. |
320 |
|
321 |
|
322 |
References |
323 |
========== |
324 |
.. [#DESKTOP_FORMAT] Desktop Entry Specification: Basic format of the file |
325 |
(https://standards.freedesktop.org/desktop-entry-spec/latest/ar01s03.html) |
326 |
|
327 |
.. [#GLEP74] GLEP 74: Full-tree verification using Manifest files: |
328 |
Checksum algorithms (informational) |
329 |
(https://www.gentoo.org/glep/glep-0074.html#checksum-algorithms-informational) |
330 |
|
331 |
.. [#GLEP59] GLEP 59: Manifest2 hash policies and security implications |
332 |
(https://www.gentoo.org/glep/glep-0059.html) |
333 |
|
334 |
.. [#BUG534528] Bug 534528 - distfiles should be sorted into subdirectories |
335 |
of DISTDIR |
336 |
(https://bugs.gentoo.org/534528) |
337 |
|
338 |
|
339 |
Copyright |
340 |
========= |
341 |
This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 |
342 |
Unported License. To view a copy of this license, visit |
343 |
http://creativecommons.org/licenses/by-sa/3.0/. |
344 |
|
345 |
-- |
346 |
Best regards, |
347 |
Michał Górny |