Hello, everyone.

TL;DR: shortly, distfiles will need to be present under two paths for
the transitional period. Would you prefer that we use hardlinks or
symlinks for that?


We're planning to start deploying a new GLEP 75-based [1] mirror layout
to our mirrors soonish. This implies a transitional period during which
we'll be using both old and new layouts, so all file entries will be
duplicated. The plan is roughly to:

1. Enable the new split layout in emirrordist, and start using both
layouts simultaneously for newly-mirrored files.

2. Duplicate the existing distfiles to the new layout.

3. Live with both layouts for some longish time, to support people using
old Portage versions.

4. Eventually disable the old (flat) layout and start removing files.


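As a rough illustration of steps 1 and 2, here is a minimal Python
sketch of exposing one flat-layout distfile under a split-layout path
as well. It assumes the 'filename-hash BLAKE2B 8' style of layout from
GLEP 75 (a subdirectory named after the first two hex digits of the
BLAKE2b hash of the filename). split_path() and duplicate() are
hypothetical helpers, not emirrordist functions, and the file name is
made up.

```python
import hashlib
import os
import tempfile


def split_path(filename: str) -> str:
    """Relative path of a distfile in the assumed split layout."""
    # Assumption: directory = first 8 bits (2 hex digits) of BLAKE2b(filename).
    digest = hashlib.blake2b(filename.encode("utf-8")).hexdigest()
    return os.path.join(digest[:2], filename)


def duplicate(distdir: str, filename: str, use_symlink: bool) -> str:
    """Make the flat-layout file reachable under its split-layout path too."""
    flat = os.path.join(distdir, filename)
    new = os.path.join(distdir, split_path(filename))
    os.makedirs(os.path.dirname(new), exist_ok=True)
    if use_symlink:
        # Relative target, so the link stays valid if the tree is relocated.
        os.symlink(os.path.join("..", filename), new)
    else:
        os.link(flat, new)
    return new


with tempfile.TemporaryDirectory() as distdir:
    name = "foo-1.0.tar.gz"  # made-up distfile
    with open(os.path.join(distdir, name), "wb") as f:
        f.write(b"example payload")
    new_path = duplicate(distdir, name, use_symlink=False)
    print(os.path.relpath(new_path, distdir))
    # Both names now refer to the same inode.
    same_inode = os.path.samefile(new_path, os.path.join(distdir, name))
```

Passing use_symlink=True gives the symlink variant instead; the rest of
this mail is about which of the two is kinder to mirrors.
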
The basic problem is whether to use hardlinks or symlinks for the
duplicate files. I've elaborated more on both solutions in [2], but
I'll summarize briefly here.

Hardlinks have the advantage that for mirrors enabling rsync's -H
(--hard-links), they avoid extra space usage and extra traffic. However,
we don't really know how many mirrors enable that; I suspect it's around
half of them. At initial deployment time, rsync will just hardlink files
in the new layout to existing entries, and at cleanup time it will just
unlink the old entries.

For mirrors not enabling -H, hardlinks will mean all distfiles being
transferred again at deployment time. Furthermore, throughout the
transitional period all files will be duplicated, and so will the space
usage. Cleanup should be lightweight, though.

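To illustrate the space accounting in the hardlink case, here is
a minimal Python sketch on a toy tree with one distfile hardlinked under
a hypothetical ab/ subdirectory. tree_size() is a made-up helper:
counting each inode once roughly models a mirror that preserves
hardlinks (-H), while counting every name models one that treats the
two names as separate files.

```python
import os
import tempfile


def tree_size(root: str, track_hardlinks: bool) -> int:
    """Sum file sizes under root, optionally counting each inode only once."""
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if track_hardlinks and key in seen:
                continue  # already counted under another name
            seen.add(key)
            total += st.st_size
    return total


with tempfile.TemporaryDirectory() as root:
    flat = os.path.join(root, "foo-1.0.tar.gz")  # made-up distfile
    with open(flat, "wb") as f:
        f.write(b"x" * 1000)
    os.mkdir(os.path.join(root, "ab"))  # hypothetical split-layout directory
    os.link(flat, os.path.join(root, "ab", "foo-1.0.tar.gz"))

    naive = tree_size(root, track_hardlinks=False)
    aware = tree_size(root, track_hardlinks=True)

    # Cleanup: dropping the old name leaves the data reachable via the new one.
    os.unlink(flat)
    after_cleanup = tree_size(root, track_hardlinks=True)

print(naive, aware, after_cleanup)
```
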
Symlinks have the advantage that we know that all or almost all mirrors
enable them. They are lightweight at deployment time since it's just
a matter of rsync copying symlinks, and they definitely won't cause
double space usage. However, they will cause all files to be
retransferred at cleanup time, due to symlinks being replaced by real
files.

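The cleanup problem with symlinks can be seen in a small Python sketch
(toy tree again, with a hypothetical ab/ subdirectory and a made-up
file name): the link itself is tiny, but once the flat-layout file is
removed it dangles, so the new-layout entry has to become a real file
at that point.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    flat = os.path.join(root, "foo-1.0.tar.gz")  # made-up distfile
    with open(flat, "wb") as f:
        f.write(b"x" * 1000)
    os.mkdir(os.path.join(root, "ab"))  # hypothetical split-layout directory
    link = os.path.join(root, "ab", "foo-1.0.tar.gz")
    os.symlink(os.path.join("..", "foo-1.0.tar.gz"), link)

    link_size = os.lstat(link).st_size      # size of the link, not the target
    resolves_before = os.path.exists(link)  # exists() follows the symlink

    os.unlink(flat)                         # naive removal of the old layout
    resolves_after = os.path.exists(link)   # target is gone
    still_a_link = os.path.islink(link)     # the link itself remains

print(link_size, resolves_before, resolves_after, still_a_link)
```
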
Technically, I suppose we could avoid that by splitting the cleanup into
two stages, repeated for smaller groups of files. First, replace
symlinks with hardlinks, which will make it light for at least some of
the mirrors. Then, remove the old files and move on to the next group.
For mirrors not using -H, this will still mean double transfer, but we'd
limit double space usage to one group at a time, and only for a short
period.

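A minimal Python sketch of that staged cleanup, under the same
assumptions as before: subdir_for() guesses at a 'filename-hash
BLAKE2B 8' style layout, migrate_group() is a hypothetical helper, and
the grouping and file names are made up. In practice mirrors would need
time to sync between the two stages.

```python
import hashlib
import os
import tempfile


def subdir_for(name):
    # Assumption: split-layout directory = first two hex digits of BLAKE2b(name).
    return hashlib.blake2b(name.encode("utf-8")).hexdigest()[:2]


def migrate_group(distdir, names):
    """Two-stage cleanup for one group of distfiles."""
    # Stage 1: swap each new-layout symlink for a hardlink to the flat file.
    for name in names:
        flat = os.path.join(distdir, name)
        new = os.path.join(distdir, subdir_for(name), name)
        if os.path.islink(new):
            tmp = new + ".tmp"
            os.link(flat, tmp)
            os.replace(tmp, new)  # renames over the symlink itself
    # Stage 2: drop the flat-layout names; the data stays reachable.
    for name in names:
        os.unlink(os.path.join(distdir, name))


with tempfile.TemporaryDirectory() as distdir:
    names = ["a-1.tar.gz", "b-2.tar.gz", "c-3.tar.gz"]  # made-up distfiles
    for name in names:
        with open(os.path.join(distdir, name), "wb") as f:
            f.write(name.encode())
        os.makedirs(os.path.join(distdir, subdir_for(name)), exist_ok=True)
        os.symlink(os.path.join("..", name),
                   os.path.join(distdir, subdir_for(name), name))

    # Process the files group by group, so that only one group at a time
    # is ever duplicated on mirrors that do not preserve hardlinks.
    for group in (names[:2], names[2:]):
        migrate_group(distdir, group)

    leftovers = [n for n in names if os.path.lexists(os.path.join(distdir, n))]
    new_paths = [os.path.join(distdir, subdir_for(n), n) for n in names]
    all_regular = all(os.path.isfile(p) and not os.path.islink(p)
                      for p in new_paths)
```
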
If any mirrors sync over rsync without using -l (talking about private
mirrors here), they will not get the new layout at all, which is going
to suck for their users.


Which way do you prefer?


[1] https://www.gentoo.org/glep/glep-0075.html
[2] https://bugs.gentoo.org/534528#c38

--
Best regards,
Michał Górny