On Tue, Jun 30, 2020, at 1:20 AM, Fabian Groffen wrote:
> Hi,
>
> On 29-06-2020 21:13:43 -0500, Sid Spry wrote:
> > Hello,
> >
> > I have some runnable pseudocode outlining a faster tree verification algorithm.
> > Before I create patches I'd like to see if there is any guidance on making the
> > changes as unobtrusive as possible. If the radical change in algorithm is
> > acceptable I can work on adding the changes.
> >
> > Instead of composing any kind of structured data out of the portage tree my
> > algorithm just lists all files and then optionally batches them out to threads.
> > There is a noticeable speedup by eliding the tree traversal operations which
> > can be seen when running the algorithm with a single thread and comparing it to
> > the current algorithm in gemato (which should still be discussed here?).
>
> I remember that gemato used to use multiple threads, but
> because it totally saturated disk-IO, it was brought back to a single
> thread. People were complaining about unusable systems.
>
|
I think this is an argument for cgroups limits support on the portage process or
account as opposed to an argument against picking a better algorithm. That is
something I have been working towards, but I am only one man.
|
> In any case, can you share your performance results? What speedup did
> you see, on cold and hot FS caches? Which type of disk do you use?
>
|
I ran all tests multiple times so the FS cache was warm, reading off of a
Samsung SSD, but I have nothing very precise yet.
|
% gemato verify --openpgp-key signkey.asc /var/db/repos/gentoo
[...]
INFO:root:Verifying /var/db/repos/gentoo...
INFO:root:/var/db/repos/gentoo verified in 16.45 seconds
|
sometimes going higher, closer to 18s, vs.
|
% ./veriftree.py
4.763171965983929
|
So roughly a 3.5x speedup (16.45 s down to about 4.76 s) without batching to
threads.
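
For anyone following along, here is a minimal sketch of the "list all files,
then optionally batch them out to threads" idea. It is not veriftree.py and
not how gemato works. It assumes the expected SHA512 digests have already been
collected from the Manifest files into a plain dict keyed by relative path,
which is a simplification. The names list_files, check_batch and verify_flat
are made up for illustration.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Return every file under root as a flat, sorted list of relative paths."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never read into memory whole."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_batch(root, batch, expected):
    """Return the paths in batch whose digest is unlisted or does not match."""
    bad = []
    for rel in batch:
        if expected.get(rel) != sha512_of(os.path.join(root, rel)):
            bad.append(rel)
    return bad


def verify_flat(root, expected, threads=1):
    """Verify a tree from a flat file list; expected maps rel path -> sha512 hex.

    Assumption: expected was built beforehand from the Manifests. No tree
    structure is built here, only a list that is optionally split into batches.
    """
    files = list_files(root)
    # Entries that are listed in expected but are absent on disk.
    missing = sorted(set(expected) - set(files))
    if threads <= 1:
        bad = check_batch(root, files, expected)
    else:
        # Round-robin split so each worker gets a similar share of the list.
        batches = [files[i::threads] for i in range(threads)]
        with ThreadPoolExecutor(max_workers=threads) as pool:
            results = pool.map(lambda b: check_batch(root, b, expected), batches)
            bad = [rel for part in results for rel in part]
    return sorted(bad + missing)
```

Calling verify_flat(root, expected) runs the single-threaded variant, and
verify_flat(root, expected, threads=4) hands batches to four workers. An empty
result means every file matched. Files on disk that are not in expected are
reported as failures in this sketch, which may or may not be what a real
implementation wants.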
|
> You could compare against qmanifest, which uses OpenMP-based
> parallelism while verifying the tree. On SSDs this does help.
>
|
I lost my notes -- how do I point either gemato or qmanifest at the GnuPG
directory? My code is partially structured as it is because I had problems doing
this. I rediscovered -K/--openpgp-key in gemato but am unsure about qmanifest.