Gentoo Archives: gentoo-dev

From: "Michał Górny" <mgorny@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Manifest2 hashes, take n+1-th: 3 hashes for the tie-breaker case
Date: Mon, 13 Nov 2017 07:37:17
Message-Id: 1510558627.1239.5.camel@gentoo.org
In Reply to: Re: [gentoo-dev] Manifest2 hashes, take n+1-th: 3 hashes for the tie-breaker case by Joshua Kinard
1 W dniu nie, 12.11.2017 o godzinie 21∶22 -0500, użytkownik Joshua Kinard
2 napisał:
3 > On 10/24/2017 00:11, Michał Górny wrote:
4 > > W dniu wto, 24.10.2017 o godzinie 06∶04 +0200, użytkownik Michał Górny
5 > > napisał:
6 >
7 > [snip]
8 >
9 > > > > [BOBO06] is relevant research here, I cited it in the work that went into
10 > > > > GLEP59, the last time we updated the hashes. The less-technical explanation of it is:
11 > > > > "If you can express the output of H1(x)H2(x) in LESS bits than the combined
12 > > > > output size of H1,H2, then the attacks get a little bit easier"
13 > > > >
14 > > > > Some important pieces from it:
15 > > > > [J04] "showed that the concatenation of two Merkle-Damgard functions is not
16 > > > > much more secure than the individual functions.", but this holds ONLY if
17 > > > > the hash functions chosen are of the same construction (MD).
18 > > > > Choosing hashes with different constructions (Merkle-Damgard, HAIFA,
19 > > > > Sponge) is important, and given a black box environment,
20 > > > >
21 > > > > The original mail reached the same approximate decision, just to look
22 > > > > for diverse hashes, but decided that 2 was enough.
23 > > > >
24 > > > > Q: What are the odds of a simultaneous successful attack against two hashes?
25 > > > > A: IDK, but if the hash functions are truly independent, it must be provably
26 > > > > lower than the odds of an attack against a single hash.
27 > > >
28 > > > We're talking about really huge (→∞) numbers here. It's not a 'random'
29 > > > attack against one hash. It's an attack that allows to sneak a malicious
30 > > > code with unchanged file size (since we store that too), and no apparent
31 > > > side effects (what's the point of spending up that much resources
32 > > > if the user is going to notice?).
33 > > >
34 > > > > Q: What's the big difference between a bug and a successful attack in a hash?
35 > > > > A: Bugs are more likely initially, and attacks come later.
36 > > >
37 > > > Sounds like an entirely abstract point in time ;-).
38 > > >
39 > > > > All of that said, is there really a significant long-term gain in
40 > > > > multiple hashes? (setting aside the short-term advantage in a transition
41 > > > > period for changing hashes)
42 > > >
43 > > > No, and that's my point. One hash is perfectly fine.
44 > > >
45 > > > Two hashes are useful for transition purposes. If we take two fast
46 > > > hashes (e.g. proposed SHA512 + BLAKE2B which have similar speed),
47 > > > we can use 2 threads to prevent the speed loss (except for old single-
48 > > > core machines).
49 >
50 > Minor clarification, old single core //and// uni-processor. Some older
51 > machines have multiple physical CPUs that are single-core. Threading should be
52 > okay on these, as long as the thread count stays under NR_CPUS.
53 >
54 > I also have a really old single-CPU system, MIPS (obviously). Not the fastest
55 > on the block compared to the other equipment I've got, but does anyone know of
56 > any simple timing scripts/programs available that can benchmark some of these
57 > proposed digest hashes? If they turn out to be reasonably quick on my old
58 > machine, I doubt then that speed will be too much of an issue.
59
60 You could play with utils/benchmark.py inside gemato [1]. Note that it's
61 not very precise though but should give a rough measurement. Also note
62 that it is suited for one big file while we mostly deal with a lot of
63 small files and that changes things a bit.
64
65 [1]:https://github.com/mgorny/gemato
66
67 > Also, for whatever hashes we ultimately go with, what are considerations for
68 > the userland support for them on non-glibc systems? E.g., are they provided by
69 > third-party libraries or do they need implementations in
70 > uclibc/uclibc-ng/musl/*? And what about the Alt/BSD side of things? I assume
71 > FreeBSD implements this already, but worth verifying with all of the
72 > combinations of arches/libc's/kernels and whatnot. I mean, there still might
73 > be a lonely m68k install out there...
74
75 We've selected the hashes that are guaranteed to be included in CPython
76 3.6+. For older versions of Python, we are using the Python extension
77 based on the reference implementation (just like the code in CPython)
78 pyblake2.
79
80 --
81 Best regards,
82 Michał Górny