On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
> Next version. Now without MISC/OPTIONAL, and with many clarifications.
Huge improvements in this version; I found it much easier to understand.

Nits:
- please stick to ASCII ellipsis. The Unicode ellipsis is unreadable in
  some monospace fonts.

Further items inline:
> Directory tree coverage
> -----------------------
...
> The file entries (except for ``IGNORE``) can be specified for regular
> files only. Symbolic links are followed when opening files
> and traversing directories. It is an error to specify an entry for
> a different file type. If the tree contain files of other types
> that are not otherwise ignored, they need to be covered by an explicit
> ``IGNORE``.
>
> All the local (non-``DIST``) files covered by a Manifest tree must
> reside on the same filesystem. It is an error to specify entries
> applying to files on another filesystem. If subdirectories
> that are not otherwise ignored reside on a different filesystem, they
> must be explicitly excluded via ``IGNORE``.
I would prefer this to say:
'If files that are not otherwise ignored reside on a different
filesystem', i.e. expanded from subdirectories to all files.
This implicitly forbids following a symlink that crosses a filesystem
boundary, and matches the similar part of 'Tree layout restrictions'.
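
To illustrate the proposed wording, here is a minimal Python sketch of a
per-file check. The function names are hypothetical, and comparing
st_dev values is only one plausible way to detect a filesystem boundary;
the draft does not prescribe any particular method.

```python
import os


def on_same_filesystem(root, path):
    """Return True if `path`, after following symlinks, is on the same
    filesystem as `root`."""
    # os.stat() follows symlinks, so a link pointing across a filesystem
    # boundary reports the device number of its target.
    return os.stat(path).st_dev == os.stat(root).st_dev


def files_on_other_filesystems(root):
    """Yield paths (relative to `root`) of files on another filesystem."""
    # followlinks=True mirrors "symbolic links are followed"; a real tool
    # would also need to guard against symlink loops.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                same = on_same_filesystem(root, full)
            except OSError:
                continue  # e.g. a dangling symlink; out of scope here
            if not same:
                yield os.path.relpath(full, root)
```

Any path reported by files_on_other_filesystems() would then need an
explicit IGNORE under the proposed wording.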
|
> Rationale
> =========
...
> Tree layout restrictions
> ------------------------
>
> The algorithm is meant to work primarily with ebuild repositories which
> normally contain only files and directories. Directories provide
> no useful metadata for verification, and specifying special entries
> for additional file types is purposeless. Therefore, the specification
> is restricted to dealing with regular files.
>
> The Gentoo repository does not use symbolic links. Some Gentoo
> repositories do, however. To provide a simple solution for dealing with
> symlinks without having to take care to implement special handling for
> them, the common behavior of implicitly resolving them is used.
> Therefore, symbolic links to files are stored as if they were regular
> files, and symbolic links to directories are followed as if they were
> regular directories.
>
> Dotfiles are implicitly ignored as that is a common notion used
> in software written for POSIX systems. All other common filenames
> require explicit ``IGNORE`` lines.
'common' in the second sentence seems odd. What about uncommon
filenames? Maybe just s/other common filenames/other filenames/.
|
> An ability to inject additional ignore entries is provided to account
> for site configuration affecting the repository tree — placing
> additional files in it, skipping some of the categories from syncing.
Mention that the package manager may provide wildcards or regex in the
additional entries, e.g. 'IGNORE **/metadata.xml'.
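
As a rough illustration of what such wildcard entries could look like on
the package manager side, here is a minimal Python sketch. The helper
names and the choice of fnmatch-style globbing are assumptions made for
the example; the draft does not define any pattern syntax.

```python
from fnmatch import fnmatchcase


def parse_extra_ignores(lines):
    """Collect patterns from hypothetical 'IGNORE <pattern>' config lines."""
    patterns = []
    for line in lines:
        fields = line.split()
        if len(fields) == 2 and fields[0] == "IGNORE":
            patterns.append(fields[1])
    return patterns


def is_ignored(path, patterns):
    """Return True if the repository-relative path matches any pattern."""
    # Note: in fnmatch, '*' also matches '/', so '**/metadata.xml' matches
    # metadata.xml at any depth below the top level. This differs from
    # shell globstar semantics and is only one possible interpretation.
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```

With the patterns parsed from 'IGNORE **/metadata.xml',
is_ignored('app-misc/foo/metadata.xml', patterns) is true, while an
ebuild path in the same directory is not matched.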
|
> Non-strict Manifest verification
> --------------------------------
...
> The cases for stripping unnecessary files mostly focused around space
> savings. For this purpose, stripping ``metadata.xml`` and similar files
> has little value. It is much more common for users to strip whole
> categories which can not be handled via the ``MISC`` type, and needs
> a dedicated package manager mechanism. The same mechanism can also
> handle files that used the ``MISC`` type.
Exclusion by package does happen as well. A list of categories or
packages can be used for both the rsync exclusion and the IGNORE.
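
A minimal Python sketch of that idea, deriving both outputs from one
list. The function names and the sample entries are hypothetical, and
the 'IGNORE <path>' form is assumed to take paths relative to the
repository root.

```python
def rsync_excludes(entries):
    """Build rsync --exclude options for categories or packages."""
    # A leading '/' anchors the pattern at the root of the transfer and a
    # trailing '/' restricts the match to directories.
    return ["--exclude=/{}/".format(entry.strip("/")) for entry in entries]


def ignore_lines(entries):
    """Build matching IGNORE entries for the same list."""
    return ["IGNORE {}".format(entry.strip("/")) for entry in entries]


# Hypothetical site configuration: one category and one package.
excluded = ["games-fps", "app-office/libreoffice"]
print(" ".join(rsync_excludes(excluded)))
print("\n".join(ignore_lines(excluded)))
```

Keeping a single list as the source of truth means the sync exclusions
and the verification ignores cannot drift apart.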
|
> Splitting distfile checksums from file checksums
> ------------------------------------------------
>
> Another problem with the current Manifest format is that the checksums
> for fetched files are combined with checksums for local files
> in a single file inside the package directory. It has been specifically
> pointed out that:
>
> - since distfiles are sometimes reused across different packages,
>   the repeating checksums are redundant,
Comment: 8.4% of all DIST entries are duplicates, representing a 2 MiB
saving in tree size (25 MiB of DIST entries altogether).
|
> - mirror admins were interested in the possibility of verifying all
>   the distfiles with a single tool.
>
> This specification does not provide a clean solution to this problem.
> It technically permits moving ``DIST`` entries to higher-level Manifests
> but the usefulness of such a solution is doubtful.
This solution would require the package manager to consider
higher-level Manifests or all Manifests in the tree when searching for
the DIST entry. The most useful implementation of this would be for the
git->rsync process to move all DIST entries elsewhere (metadata/ maybe).

Either way, this would have many downsides, and make manual work on the
Manifest DIST entries painful.
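
For reference, a minimal Python sketch of how duplicate DIST entries
across Manifests could be counted. It assumes the usual
'DIST <filename> <size> <hash> <value> ...' line layout; the function
name is hypothetical, and this is not the script used to produce the
figures above.

```python
from collections import Counter


def duplicate_dist_stats(manifests):
    """Return (total, duplicates) for DIST entries across Manifests.

    `manifests` is an iterable of iterables of lines, one per Manifest.
    """
    counts = Counter()
    for lines in manifests:
        for line in lines:
            fields = line.split()
            # Only DIST lines matter here; the filename is the key.
            if len(fields) >= 2 and fields[0] == "DIST":
                counts[fields[1]] += 1
    total = sum(counts.values())
    # Every occurrence beyond the first of a filename is a duplicate.
    return total, total - len(counts)
```

The same pass could collect the unique entries if they were ever moved
to a single location during the git->rsync step.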
|
--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
E-Mail   : robbat2@g.o
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136