On Thu, 2017-11-02 at 23:43 +0000, Robin H. Johnson wrote:
> On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
> > Next version. Now without MISC/OPTIONAL, and with many clarifications.
>
> Huge improvements in this version, I found it much easier to understand.
>
> Nits:
> - please stick to ASCII ellipsis. The unicode ellipsis is unreadable in
> some monospace fonts.
|
Done. I also replaced the em dashes, for consistency.
|
>
> Further items inline:
> > Directory tree coverage
> > -----------------------
>
> ...
> > The file entries (except for ``IGNORE``) can be specified for regular
> > files only. Symbolic links are followed when opening files
> > and traversing directories. It is an error to specify an entry for
> > a different file type. If the tree contains files of other types
> > that are not otherwise ignored, they need to be covered by an explicit
> > ``IGNORE``.
> >
> > All the local (non-``DIST``) files covered by a Manifest tree must
> > reside on the same filesystem. It is an error to specify entries
> > applying to files on another filesystem. If subdirectories
> > that are not otherwise ignored reside on a different filesystem, they
> > must be explicitly excluded via ``IGNORE``.
>
> I would prefer this to say:
> 'If files that are not otherwise ignored reside on a different
> filesystem', as expanded from sub-directories.
> This implicitly forbids following a symlink that crosses a filesystem
> boundary, and then matches the similar part of 'Tree layout
> restrictions'.
|
I've gone for something even more explicit:
|
| If files or directories that are not otherwise ignored reside
| on a different filesystem, or symbolic links point to targets
| on a different filesystem, they must be explicitly excluded
| via ``IGNORE``.
|
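As an illustration of the rule above (not part of the GLEP text), a tool could flag the entries that need an explicit ``IGNORE`` roughly as follows. This is a minimal Python sketch: the function name `foreign_entries` and its injectable `stat` parameter are hypothetical, and comparing `st_dev` values is just one common way to detect a filesystem boundary. Since `os.stat` follows symbolic links, a link whose target lives on another device is reported as well. For simplicity the sketch does not descend into symlinked directories; a real implementation would have to follow them with loop protection.

```python
import os


def foreign_entries(root, stat=os.stat):
    """Return sorted paths (relative to root) that live on another device.

    `stat` is injectable only so the logic can be exercised without
    actually mounting a second filesystem; by default os.stat is used,
    which follows symbolic links.
    """
    root_dev = stat(root).st_dev
    foreign = []
    for dirpath, dirnames, filenames in os.walk(root):
        keep = []
        for name in sorted(dirnames):
            path = os.path.join(dirpath, name)
            if stat(path).st_dev == root_dev:
                keep.append(name)
            else:
                foreign.append(os.path.relpath(path, root))
        # Prune in place so os.walk never descends into another filesystem.
        dirnames[:] = keep
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                dev = stat(path).st_dev
            except OSError:
                continue  # e.g. a dangling symlink; out of scope here
            if dev != root_dev:
                foreign.append(os.path.relpath(path, root))
    return sorted(foreign)
```

Every path returned would then need a matching ``IGNORE`` entry, or verification of the tree should fail.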
|
>
> > Rationale
> > =========
>
> ...
> > Tree layout restrictions
> > ------------------------
> >
> > The algorithm is meant to work primarily with ebuild repositories which
> > normally contain only files and directories. Directories provide
> > no useful metadata for verification, and specifying special entries
> > for additional file types is purposeless. Therefore, the specification
> > is restricted to dealing with regular files.
> >
> > The Gentoo repository does not use symbolic links. Some Gentoo
> > repositories do, however. To provide a simple solution for dealing with
> > symlinks without having to take care to implement special handling for
> > them, the common behavior of implicitly resolving them is used.
> > Therefore, symbolic links to files are stored as if they were regular
> > files, and symbolic links to directories are followed as if they were
> > regular directories.
> >
> > Dotfiles are implicitly ignored as that is a common notion used
> > in software written for POSIX systems. All other common filenames
> > require explicit ``IGNORE`` lines.
>
> 'common' in the second sentence seems odd. What about uncommon
> filenames? Maybe just s/other common filenames/other filenames/.
|
Done. The idea was to say 'do not put IGNORE for corner cases which are
better handled via package manager configuration', but I guess it's not
necessary here.
|
>
> > An ability to inject additional ignore entries is provided to account
> > for site configuration affecting the repository tree — placing
> > additional files in it, skipping some of the categories from syncing.
>
> Mention that the package manager may provide wildcards or regex in the
> additional entries. Eg: 'IGNORE **/metadata.xml'
|
Done.
|
| This configuration can extend beyond the limits of this GLEP,
| e.g. by allowing wildcards or regular expressions.
|
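To make the wildcard idea concrete, here is a hedged Python sketch of how a package manager might match user-supplied ignore patterns against repository paths. The GLEP does not define any wildcard syntax, so the use of shell-style `fnmatch` patterns, the function name `is_ignored`, and the rule that a pattern matching a directory also covers everything below it are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_ignored(path, patterns):
    """Return True if path, or any directory above it, matches a pattern.

    Paths are relative to the Manifest directory and use '/' separators.
    """
    p = PurePosixPath(path)
    # Check the path itself and each ancestor, so that ignoring a
    # directory also ignores its contents.
    candidates = [str(p)]
    candidates += [str(parent) for parent in p.parents if str(parent) != "."]
    return any(fnmatchcase(c, pat) for c in candidates for pat in patterns)


patterns = ["**/metadata.xml", "distfiles"]
print(is_ignored("app-misc/foo/metadata.xml", patterns))  # True
print(is_ignored("app-misc/foo/foo-1.ebuild", patterns))  # False
print(is_ignored("distfiles/foo.tar.gz", patterns))       # True
```

Note that with `fnmatch` a `*` also matches `/`, so `**` has no special meaning here; an implementation wanting stricter glob semantics would need its own matcher.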
>
> > Non-strict Manifest verification
> > --------------------------------
>
> ...
> > The cases for stripping unnecessary files mostly focused around space
> > savings. For this purpose, stripping ``metadata.xml`` and similar files
> > has little value. It is much more common for users to strip whole
> > categories which can not be handled via the ``MISC`` type, and needs
> > a dedicated package manager mechanism. The same mechanism can also
> > handle files that used the ``MISC`` type.
>
> Exclusion by package does happen as well. A list of categories or
> packages can be used for both the rsync exclusion and the IGNORE.
|
Rewritten to:
|
| It is much more common for users to strip whole packages
| or categories. The ``MISC`` type is not suitable for that,
| and so a dedicated package manager mechanism needs to be developed
| instead, possibly combined with the rsync exclusion list. The same
| mechanism can also handle files that historically used the ``MISC``
| type.
|
But it's merely a rationale, so I'd rather not spend another hour trying
to cover every corner case in it.
|
>
> > Splitting distfile checksums from file checksums
> > ------------------------------------------------
> >
> > Another problem with the current Manifest format is that the checksums
> > for fetched files are combined with checksums for local files
> > in a single file inside the package directory. It has been specifically
> > pointed out that:
> >
> > - since distfiles are sometimes reused across different packages,
> > the repeating checksums are redundant,
>
> Comment: 8.4% of all DIST entries are duplicate, representing a 2MiB
> saving in tree size (25MiB of DIST entries altogether).
|
Included as a footnote:
|
.. [#DIST] According to Robin H. Johnson, 8.4% of all DIST entries
   at the time of writing are duplicates, representing 2 MiB
   out of 25 MiB of DIST entries altogether.
|
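For readers who want to reproduce a figure like this on their own checkout, the following Python sketch shows one plausible way to count duplicate ``DIST`` entries. How the quoted 8.4% was actually computed is not stated in this thread; the function name `dist_duplicates` and the definition of "duplicate" (a ``DIST`` line whose filename already appeared in an earlier entry) are assumptions, and the sample data is made up purely for illustration.

```python
def dist_duplicates(manifest_texts):
    """Return (total DIST entries, entries repeating an earlier filename)."""
    seen = set()
    total = duplicates = 0
    for text in manifest_texts:
        for line in text.splitlines():
            fields = line.split()
            # A DIST line looks like: DIST <filename> <size> <hash> <value>...
            if len(fields) < 3 or fields[0] != "DIST":
                continue
            total += 1
            if fields[1] in seen:
                duplicates += 1
            else:
                seen.add(fields[1])
    return total, duplicates


# Two made-up package Manifests sharing one distfile.
sample = [
    "DIST foo-1.0.tar.gz 100 SHA512 aa\nEBUILD foo-1.0.ebuild 10 SHA512 bb\n",
    "DIST foo-1.0.tar.gz 100 SHA512 aa\nDIST bar-2.tar.xz 200 SHA512 cc\n",
]
print(dist_duplicates(sample))  # (3, 1)
```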
>
> > - mirror admins were interested in the possibility of verifying all
> > the distfiles with a single tool.
> >
> > This specification does not provide a clean solution to this problem.
> > It technically permits moving ``DIST`` entries to higher-level Manifests
> > but the usefulness of such a solution is doubtful.
>
> This solution would require the package manager to consider
> higher-level Manifests or all Manifests in the tree when searching for
> the DIST entry. The most useful implementation of this would be for the
> git->rsync process to move all DIST entries elsewhere (metadata/ maybe).
|
Technically speaking, the package manager needs to consider parent
Manifests anyway in order to verify the deeper Manifests, and I think we
can reasonably assume it will keep them cached.
|
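A rough Python sketch of that lookup, under stated assumptions: each directory may hold a plain-text file named `Manifest`, parsed Manifests are cached in memory, and the search walks from the package directory up to the repository root. The names `load_dist_entries` and `find_dist_entry` are hypothetical, and compressed Manifests and signature verification are deliberately left out.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def load_dist_entries(manifest_path):
    """Parse one Manifest and return {filename: [size, hash name, hash, ...]}."""
    entries = {}
    try:
        with open(manifest_path, encoding="utf-8") as f:
            for line in f:
                fields = line.split()
                if len(fields) >= 3 and fields[0] == "DIST":
                    entries[fields[1]] = fields[2:]
    except FileNotFoundError:
        pass  # no Manifest at this level
    return entries


def find_dist_entry(repo_root, package_dir, filename):
    """Search package_dir and its parents, up to repo_root, for a DIST entry.

    Returns (directory holding the Manifest, entry fields) or None.
    """
    root = os.path.abspath(repo_root)
    current = os.path.abspath(package_dir)
    while True:
        entries = load_dist_entries(os.path.join(current, "Manifest"))
        if filename in entries:
            return current, entries[filename]
        if current == root:
            return None
        parent = os.path.dirname(current)
        if parent == current:  # hit the filesystem root without meeting repo_root
            return None
        current = parent
```

Because of the cache, each parent Manifest is read from disk only once, no matter how many packages are checked.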
>
> Either way, this would have many downsides, and make manual work on the
> Manifest DIST entries painful.
|
That's what 'doubtful usefulness' means ;-P.
|
--
Best regards,
Michał Górny |