Gentoo Archives: gentoo-portage-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] Speeding up Tree Verification
Date: Tue, 30 Jun 2020 19:29:43
Message-Id: 4f3cda241c77400a777382ae51693fc3a68f0b17.camel@gentoo.org
In Reply to: Re: [gentoo-portage-dev] Speeding up Tree Verification by Sid Spry
On Tue, 2020-06-30 at 12:50 -0500, Sid Spry wrote:
> On Tue, Jun 30, 2020, at 2:28 AM, Michał Górny wrote:
> > On June 30, 2020 2:13:43 AM UTC, Sid Spry <sid@××××.us> wrote:
> > > Hello,
> > >
> > > I have some runnable pseudocode outlining a faster tree verification
> > > algorithm. Before I create patches I'd like to see if there is any
> > > guidance on making the changes as unobtrusive as possible. If the
> > > radical change in algorithm is acceptable I can work on adding the
> > > changes.
> > >
> > > Instead of composing any kind of structured data out of the portage
> > > tree my algorithm just lists all files and then optionally batches
> > > them out to threads. There is a noticeable speedup by eliding the
> > > tree traversal operations which can be seen when running the
> > > algorithm with a single thread and comparing it to the current
> > > algorithm in gemato (which should still be discussed here?).
> >
> > Without reading the code: does your algorithm correctly detect extraneous files?
> >
>
> Yes and no.
>
> I am not sure why this is necessary. If the file does not appear in a manifest it is
> ignored. It makes the most sense to me to put the burden of not including
> untracked files on the publisher. If the user puts an untracked file into the tree it
> will be ignored to no consequence; the authored files don't refer to it, after all.

This is necessary because a malicious third party can MITM you an rsync
tree with extraneous files (say, -r1 baselayout ebuild) that do horrible
things on your system. If you don't reject files not in Manifest, you
open a huge security hole.
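
The rejection of untracked files can be sketched roughly as follows. This is
only an illustration, not gemato's implementation: the function name
find_extraneous and the `tracked` set (tree-relative paths assumed to have
been collected from the Manifest files elsewhere) are hypothetical, and a
real verifier would also have to honour whatever ignore rules the Manifest
format defines.

```python
import os
import tempfile


def find_extraneous(tree_root, tracked):
    """Return sorted tree-relative paths that exist on disk but are not tracked.

    `tracked` is a hypothetical input: the set of relative paths gathered
    from the Manifest files (Manifest parsing is out of scope here).
    """
    on_disk = set()
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            on_disk.add(os.path.relpath(full, tree_root))
    # Anything on disk that no Manifest vouches for must be treated as an error.
    return sorted(on_disk - set(tracked))


def demo():
    """Build a tiny fake tree with one tracked and one injected ebuild."""
    with tempfile.TemporaryDirectory() as root:
        pkg = os.path.join(root, "sys-apps", "baselayout")
        os.makedirs(pkg)
        for name in ("baselayout-2.7.ebuild", "baselayout-2.7-r1.ebuild"):
            with open(os.path.join(pkg, name), "w") as f:
                f.write("# placeholder\n")
        tracked = {os.path.join("sys-apps", "baselayout", "baselayout-2.7.ebuild")}
        return find_extraneous(root, tracked)
```

In the demo, the injected -r1 ebuild is the only path reported; verification
should fail on a non-empty result rather than silently ignore it.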

> But it would be easy enough to build a second list of all files and compare it to
> the list of files built from the manifests. If there are extras an error can be
> generated. This is actually the first test I did on my manifest parsing code. I tried
> to see if my tracked files roughly matched the total files in tree. That can be
> repurposed for this check.
>
> > > Some simple tests like counting all objects traversed and verified
> > > returns the same(ish). Once it is put into portage it could be
> > > tested in detail.
> > >
> > > There is also my partial attempt at removing the brittle interface to
> > > GnuPG (it's not as if the current code is badly designed, just that
> > > parsing the output of GnuPG directly is likely not the best idea).
> >
> > The 'brittle interface' is well-defined machine-readable output.
> >
>
> Ok. I was aware there was a machine interface, but the classes that manipulate
> a temporary GPG home seemed like not the best solution. I guess that is all
> due to GPG assuming everything is in ~/.gnupg and keeping its state as a
> directory structure.

A temporary home directory guarantees that user configuration does not
affect the verification result.
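
As a minimal sketch of that idea (not gemato's actual classes), the snippet
below runs gpg against a throwaway GNUPGHOME and reads the machine-readable
status lines instead of the human-oriented output. The helper names
isolated_gnupg_home, parse_status and verify are hypothetical, and treating
a GOODSIG line as sufficient is a simplification; real code would inspect
more status keywords (key validity, expiry and so on).

```python
import os
import subprocess
import tempfile
from contextlib import contextmanager


@contextmanager
def isolated_gnupg_home():
    """Yield (home, env) for a fresh, private GnuPG home that is removed on exit."""
    with tempfile.TemporaryDirectory(prefix="gpg-verify-") as home:
        os.chmod(home, 0o700)  # gpg warns about homes with loose permissions
        env = dict(os.environ, GNUPGHOME=home)
        yield home, env


def parse_status(output):
    """Collect the keywords from '[GNUPG:] KEYWORD args...' status lines."""
    keywords = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "[GNUPG:]":
            keywords.append(parts[1])
    return keywords


def verify(signed_file, key_file):
    """Import key_file into a temporary home and verify signed_file there."""
    with isolated_gnupg_home() as (_home, env):
        subprocess.run(
            ["gpg", "--batch", "--import", key_file],
            env=env, check=True, capture_output=True,
        )
        result = subprocess.run(
            ["gpg", "--batch", "--status-fd", "1", "--verify", signed_file],
            env=env, capture_output=True, text=True,
        )
        # Simplification: only look for GOODSIG in the status output.
        return "GOODSIG" in parse_status(result.stdout)
```

Because the environment passed to gpg points at the temporary directory,
nothing in the caller's ~/.gnupg (keyring, trust settings, gpg.conf) can
influence the outcome.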

>
> > > Needs gemato, dnspython, and requests. Slightly better than random
> > > code because I took inspiration from the existing gemato classes.
> >
> > The code makes a lot of brittle assumptions about the structure. The
> > GLEP was specifically designed to avoid that and let us adjust the
> > structure in the future to meet our needs.
> >
>
> These same assumptions are built into the code that operates on the
> tree structure. If the GLEP were changed the existing code would also
> potentially need changing. This code just uses the structure in a different
> way.
>

The code that predates the GLEP, yes. It will eventually be changed to
be more flexible, especially when we can assume that we start removing
backwards compatibility.

--
Best regards,
Michał Górny

Attachments

File name        MIME type
signature.asc    application/pgp-signature
