Hi Matt,

On Tue, Apr 5, 2022 at 10:38 PM Matt Turner <mattst88@g.o> wrote:
>
> On Tue, Apr 5, 2022 at 12:30 PM Jason A. Donenfeld <zx2c4@g.o> wrote:
> > By the way, we're not currently _checking_ two hash functions during
> > src_prepare(), are we?
>
> I don't know, but the hash-checking is definitely checked before src_prepare().

Er, during the builtin fetch phase. Anyway, you know what I meant. :)

Anyway, looking at the portage source code, to answer my own question,
it looks like the file is actually being read twice and both hashes
computed. I would have at least expected an optimization like:

    hash1_init(&hash1);
    hash2_init(&hash2);
    for chunks in file:
        hash1_update(&hash1, chunk);
        hash2_update(&hash2, chunk);
    hash1_final(&hash1, out1);
    hash2_final(&hash2, out2);
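Since portage is Python, that single-pass idea can be sketched with the
stdlib's hashlib; the function name, algorithm list, and chunk size here
are my own illustration, not portage's actual API:

```python
import hashlib

def multihash(path, algorithms=("sha512", "blake2b"), chunk_size=65536):
    """Compute several digests of a file in a single read pass."""
    hashers = [hashlib.new(name) for name in algorithms]
    with open(path, "rb") as f:
        # Read each chunk once and feed it to every hasher.
        while chunk := f.read(chunk_size):
            for h in hashers:
                h.update(chunk)
    return {h.name: h.hexdigest() for h in hashers}
```

The file is opened and read exactly once no matter how many hash
functions are requested, so the extra cost per additional hash is CPU
only, not I/O.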

But actually what's happening is the even less efficient:

    hash1_init(&hash1);
    for chunks in file:
        hash1_update(&hash1, chunk);
    hash1_final(&hash1, out1);
    hash2_init(&hash2);
    for chunks in file:
        hash2_update(&hash2, chunk);
    hash2_final(&hash2, out2);

So the file winds up being opened and read twice. For huge tarballs
like chromium or libreoffice...

But either way you do it - the missed optimization or the unoptimized
reality - there's still twice as much work being done. This is all
unless I've misread the source code, which is possible, so if somebody
knows this code well and I'm wrong here, please do speak up.

Jason