Gentoo Archives: gentoo-dev

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] [v1.0.3] GLEP 74: Full-tree verification using Manifest files
Date: Thu, 02 Nov 2017 23:43:18
Message-Id: robbat2-20171102T223753-931684886Z@orbis-terrarum.net
In Reply to: Re: [gentoo-dev] [v1.0.3] GLEP 74: Full-tree verification using Manifest files by "Michał Górny"
1 On Thu, Nov 02, 2017 at 08:11:59PM +0100, Michał Górny wrote:
2 > Next version. Now without MISC/OPTIONAL, and with many clarifications.
3 Huge improvements in this version; I found it much easier to understand.
4
5 Nits:
6 - please stick to the ASCII ellipsis. The Unicode ellipsis is unreadable in
7 some monospace fonts.
8
9 Further items inline:
10 > Directory tree coverage
11 > -----------------------
12 ...
13 > The file entries (except for ``IGNORE``) can be specified for regular
14 > files only. Symbolic links are followed when opening files
15 > and traversing directories. It is an error to specify an entry for
16 > a different file type. If the tree contains files of other types
17 > that are not otherwise ignored, they need to be covered by an explicit
18 > ``IGNORE``.
19 >
20 > All the local (non-``DIST``) files covered by a Manifest tree must
21 > reside on the same filesystem. It is an error to specify entries
22 > applying to files on another filesystem. If subdirectories
23 > that are not otherwise ignored reside on a different filesystem, they
24 > must be explicitly excluded via ``IGNORE``.
25 I would prefer this to say:
26 'If files that are not otherwise ignored reside on a different
27 filesystem', as expanded from sub-directories.
28 This implicitly forbids following a symlink that crosses a filesystem
29 boundary, and then matches the similar part of 'Tree layout
30 restrictions'.
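A rough sketch of how a verifier might detect the case above, assuming a POSIX-style `st_dev` comparison is an acceptable test for "same filesystem". The function name `files_on_other_filesystems` is hypothetical and not part of the GLEP. Symlinks are followed, so a link that crosses a filesystem boundary is reported just like a mount point would be.

```python
import os


def files_on_other_filesystems(root):
    """Yield paths under root that live on a different device than root.

    Assumption: comparing st_dev is a good enough filesystem test.
    Note: followlinks=True can loop forever on symlink cycles; a real
    implementation would need cycle detection.
    """
    root_dev = os.stat(root).st_dev
    for dirpath, dirnames, filenames in os.walk(root, followlinks=True):
        same_fs_dirs = []
        for name in dirnames:
            path = os.path.join(dirpath, name)
            try:
                dev = os.stat(path).st_dev  # os.stat follows symlinks
            except OSError:
                continue  # broken symlink or vanished directory
            if dev != root_dev:
                yield path  # report it, and do not descend into it
            else:
                same_fs_dirs.append(name)
        dirnames[:] = same_fs_dirs  # prune the walk in place
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                dev = os.stat(path).st_dev
            except OSError:
                continue
            if dev != root_dev:
                yield path
```

Anything this yields would either need an explicit ``IGNORE`` or be treated as an error, depending on how the final wording lands.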
31
32 > Rationale
33 > =========
34 ...
35 > Tree layout restrictions
36 > ------------------------
37 >
38 > The algorithm is meant to work primarily with ebuild repositories which
39 > normally contain only files and directories. Directories provide
40 > no useful metadata for verification, and specifying special entries
41 > for additional file types is purposeless. Therefore, the specification
42 > is restricted to dealing with regular files.
43 >
44 > The Gentoo repository does not use symbolic links. Some Gentoo
45 > repositories do, however. To provide a simple solution for dealing with
46 > symlinks without having to take care to implement special handling for
47 > them, the common behavior of implicitly resolving them is used.
48 > Therefore, symbolic links to files are stored as if they were regular
49 > files, and symbolic links to directories are followed as if they were
50 > regular directories.
51 >
52 > Dotfiles are implicitly ignored as that is a common notion used
53 > in software written for POSIX systems. All other common filenames
54 > require explicit ``IGNORE`` lines.
55 'common' in the second sentence seems odd. What about uncommon
56 filenames? Maybe just s/other common filenames/other filenames/.
57
58 > An ability to inject additional ignore entries is provided to account
59 > for site configuration affecting the repository tree — placing
60 > additional files in it, skipping some of the categories from syncing.
61 Mention that the package manager may provide wildcards or regex in the
62 additional entries. E.g.: 'IGNORE **/metadata.xml'
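To illustrate, here is a minimal sketch of how a package manager could match such injected entries. It assumes shell-style globbing via Python's `fnmatch` is what "wildcards" would mean here; the GLEP itself does not define wildcard semantics. The helper name `is_ignored` is hypothetical. Entries without wildcard characters are treated as a path that covers itself and everything below it, which is my reading of plain ``IGNORE``.

```python
from fnmatch import fnmatchcase


def is_ignored(path, patterns):
    """Return True if a repository-relative path matches any ignore entry.

    Assumption: entries containing *, ? or [ are shell-style globs in which
    '*' also matches '/', so '**/metadata.xml' matches at any depth below
    the top level. Other entries match the path itself or anything under it.
    """
    for pattern in patterns:
        if any(ch in pattern for ch in "*?["):
            if fnmatchcase(path, pattern):
                return True
        else:
            prefix = pattern.rstrip("/")
            if path == prefix or path.startswith(prefix + "/"):
                return True
    return False
```

With this interpretation, `is_ignored("dev-libs/foo/metadata.xml", ["**/metadata.xml"])` is true, while a top-level `metadata.xml` would not match because the pattern requires a slash.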
63
64 > Non-strict Manifest verification
65 > --------------------------------
66 ...
67 > The cases for stripping unnecessary files mostly focused around space
68 > savings. For this purpose, stripping ``metadata.xml`` and similar files
69 > has little value. It is much more common for users to strip whole
70 > categories which can not be handled via the ``MISC`` type, and needs
71 > a dedicated package manager mechanism. The same mechanism can also
72 > handle files that used the ``MISC`` type.
73 Exclusion by package does happen as well. A list of categories or
74 packages can be used for both the rsync exclusion and the IGNORE.
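As a sketch of that idea, a single list of categories or packages could be rendered into both forms. The function name `render_exclusions` and the exact output shapes are assumptions for illustration; the rsync lines are meant for an exclude file, where a leading slash anchors the pattern at the transfer root and a trailing slash restricts it to directories.

```python
def render_exclusions(entries):
    """Turn category or category/package names into two parallel lists.

    Returns (rsync_lines, ignore_lines). Hypothetical helper: how a
    package manager would actually inject the IGNORE entries is left
    open by the draft.
    """
    # Normalise, drop empties, deduplicate, and sort for stable output.
    cleaned = sorted({e.strip().strip("/") for e in entries} - {""})
    rsync_lines = ["/" + entry + "/" for entry in cleaned]
    ignore_lines = ["IGNORE " + entry for entry in cleaned]
    return rsync_lines, ignore_lines
```

For example, `render_exclusions(["games-fps", "dev-lang/php/"])` gives the rsync patterns `/dev-lang/php/` and `/games-fps/` alongside `IGNORE dev-lang/php` and `IGNORE games-fps`.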
75
76 > Splitting distfile checksums from file checksums
77 > ------------------------------------------------
78 >
79 > Another problem with the current Manifest format is that the checksums
80 > for fetched files are combined with checksums for local files
81 > in a single file inside the package directory. It has been specifically
82 > pointed out that:
83 >
84 > - since distfiles are sometimes reused across different packages,
85 > the repeating checksums are redundant,
86 Comment: 8.4% of all DIST entries are duplicate, representing a 2MiB
87 saving in tree size (25MiB of DIST entries altogether).
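For anyone wanting to reproduce a figure like this, a minimal sketch follows. It is not the script used for the numbers above; it assumes a "duplicate" means a ``DIST`` line whose filename was already seen in another Manifest, and the name `dist_duplicate_stats` is hypothetical.

```python
def dist_duplicate_stats(manifest_texts):
    """Count DIST entries and how many of them repeat an earlier filename.

    manifest_texts is an iterable of Manifest file contents as strings.
    Returns (total_entries, duplicate_entries, duplicate_bytes), where
    duplicate_bytes counts each repeated line plus one newline.
    """
    seen = set()
    total = duplicates = duplicate_bytes = 0
    for text in manifest_texts:
        for line in text.splitlines():
            fields = line.split()
            if len(fields) < 3 or fields[0] != "DIST":
                continue  # not a DIST entry
            total += 1
            filename = fields[1]
            if filename in seen:
                duplicates += 1
                duplicate_bytes += len(line.encode("utf-8")) + 1
            else:
                seen.add(filename)
    return total, duplicates, duplicate_bytes
```

The duplicate share is then `duplicates / total`, assuming at least one entry was found.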
88
89 > - mirror admins were interested in the possibility of verifying all
90 > the distfiles with a single tool.
91 >
92 > This specification does not provide a clean solution to this problem.
93 > It technically permits moving ``DIST`` entries to higher-level Manifests
94 > but the usefulness of such a solution is doubtful.
95 This solution would require the package manager to consider
96 higher-level Manifests or all Manifests in the tree when searching for
97 the DIST entry. The most useful implementation of this would be for the
98 git->rsync process to move all DIST entries elsewhere (metadata/ maybe).
99
100 Either way, this would have many downsides, and make manual work on the
101 Manifest DIST entries painful.
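To make the lookup cost concrete, here is a sketch of the "consider higher-level Manifests" variant: start in the package directory and walk up to the repository root until a matching ``DIST`` entry is found. The function name `find_dist_entry` is hypothetical, and the sketch only reads uncompressed files named `Manifest`; compressed Manifests are ignored.

```python
import os


def find_dist_entry(repo_root, package_dir, distfile):
    """Search package_dir and its parents (up to repo_root) for a DIST entry.

    Returns (manifest_path, fields) for the first match, or None.
    Assumption: the nearest Manifest wins; the draft does not say so.
    """
    repo_root = os.path.abspath(repo_root)
    current = os.path.abspath(package_dir)
    while True:
        manifest = os.path.join(current, "Manifest")
        if os.path.isfile(manifest):
            with open(manifest, encoding="utf-8") as fh:
                for line in fh:
                    fields = line.split()
                    if (len(fields) >= 3 and fields[0] == "DIST"
                            and fields[1] == distfile):
                        return manifest, fields
        if current == repo_root:
            return None  # searched every level without a match
        parent = os.path.dirname(current)
        if parent == current:
            return None  # package_dir was not inside repo_root
        current = parent
```

Every fetch would then touch up to three Manifests (package, category, root) instead of one, which is part of why I consider this painful.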
102
103 --
104 Robin Hugh Johnson
105 Gentoo Linux: Dev, Infra Lead, Foundation Asst. Treasurer
106 E-Mail : robbat2@g.o
107 GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
108 GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136
