Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Thu, 07 Aug 2014 21:19:14
Message-Id: pan$44bc4$51e91e56$38ec6dce$bcbb50bd@cox.net
In Reply to: Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) by Mark Knecht
1 Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:
2
3 > So that's all looking pretty good, as a first step. If it's a matter of
4 > 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
5 > However that's just (I think) the portage tree and not signed source
6 > code, correct?
7
8 [I just posted a reply to the gpg specific stuff.]
9
10 Technically correct, but not really so in implementation. See below...
11
12 > Now, is the idea that I have a validated portage snapshot at this point
13 > and still have to actually get the code using the regular emerge which
14 > will do the checking because I have:
15 >
16 > FEATURES="buildpkg strict webrsync-gpg"
17
18 No... It doesn't work that way.
19
20 > I don't see any evidence that emerge checked what it downloaded, but
21 > maybe those checks are only done when I really build the code?
22
23 Here's what happens.
24
25 FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the
26 snapshot-tarball that webrsync downloads. Without that, it'd still
27 download it the same, but it wouldn't verify the signature. This allows
28 people who use the webrsync only because they're behind a firewall that
29 wouldn't allow normal rsync, but who don't care about the gpg signing
30 security stuff, to use the same tool as the people who actually use
31 webrsync for the security aspect, regardless of whether they could use
32 normal rsync or not.
33
34 So that gets you a signed and verified tree. Correct so far.
35
36 But as part of that tree, there are digest files for each package that
37 verify the integrity of the ebuild as well as of the sources tarballs
38 (distfiles).
39
40 Now it's important to grasp the difference between gpg signing and simple
41 hash digests, here.
42
43 Anybody with the appropriate tools (md5sum, for example, does md5 hashes,
44 but there's sha and other hashes as well, and the portage tree uses
45 several hash algorithms in case one is broken) can take a hash of a file,
46 and provided it's exactly the same bit-for-bit file they should get
47 exactly the same hash.
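
As a rough illustration of that point, here is a minimal Python sketch using only the standard hashlib module. The byte strings are made-up stand-ins for a real file, and the digest() helper is a name I chose for the example, not anything portage provides.

```python
import hashlib

def digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

# Made-up stand-ins for a file and two copies of it.
original = b"pretend this is a source tarball"
same_copy = bytes(original)                      # bit-for-bit identical
flipped = b"pretend this is a source tarbalL"    # last character differs

# Identical bytes give an identical hash, no matter who computes it.
print(digest(original) == digest(same_copy))     # True
# Changing even one character gives a different hash.
print(digest(original) == digest(flipped))       # False
```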
48
49 In fact, that's how portage checks the hashes of both the ebuild files
50 and the distfiles it uses, regardless of this webrsync-gpg stuff. The
51 tree ships the hash values that the gentoo package maintainer took of the
52 files in its digest files, and portage takes its own hash of the files
53 and compares it to the hash value stored in the digest files. If they
54 match, portage is happy. If they don't, depending on how strict you have
55 portage set to be (FEATURES=strict), it will either warn about (without
56 strict) or entirely refuse to merge that package (with strict), until
57 either the digest is updated, or a new file matching the old digest is
58 downloaded.
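
The comparison described above can be sketched roughly as follows. This is only my own illustration of the idea, assuming the behavior as I described it (warn without strict, refuse with strict). It is not portage's actual code, and check_against_digests() and the sample data are hypothetical.

```python
import hashlib

def check_against_digests(data: bytes, recorded: dict, strict: bool) -> str:
    """Compare data against every recorded hash.

    recorded maps a hashlib algorithm name to the expected hex digest.
    Returns "ok" when all hashes match, otherwise "refuse" (strict)
    or "warn" (not strict).
    """
    mismatched = [
        name for name, expected in recorded.items()
        if hashlib.new(name, data).hexdigest() != expected
    ]
    if not mismatched:
        return "ok"
    return "refuse" if strict else "warn"

# The "maintainer" records several hashes of the file as shipped.
shipped = b"contents of some distfile"
recorded = {
    name: hashlib.new(name, shipped).hexdigest()
    for name in ("sha256", "sha512")
}

print(check_against_digests(shipped, recorded, strict=True))    # ok

# A download that lost its last byte no longer matches.
damaged = shipped[:-1]
print(check_against_digests(damaged, recorded, strict=False))   # warn
print(check_against_digests(damaged, recorded, strict=True))    # refuse
```

Recording more than one algorithm means a weakness found in one hash doesn't by itself let a changed file slip through.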
59
60 So far so good, but while the hashes protect against accidental damage as
61 the file was being downloaded, because anyone can take a hash of the
62 file, without something stronger, if say one of the mirror operators was
63 a bad guy, they could replace the files with hacked files and as long as
64 they replaced the digest files with the new ones they created for the
65 hacked files at the same time, portage wouldn't know.
66
67 So while hashes/digests alone protect quite well from accidental damage,
68 they can't protect, by themselves, from deliberate replacement of those
69 files with malware infested copies.
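
A small Python sketch of that weakness, with made-up file contents and a hypothetical passes_check() helper: accidental damage is caught, but a mirror that swaps both the file and its digest sails straight through.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 of data."""
    return hashlib.sha256(data).hexdigest()

def passes_check(distfile: bytes, digest_entry: str) -> bool:
    """The plain hash check: does the file match the digest we were given?"""
    return sha256_hex(distfile) == digest_entry

genuine = b"genuine source code"
honest_digest = sha256_hex(genuine)

# Accidental damage: the file changed, the digest did not. Detected.
corrupted = b"genuine source c0de"
print(passes_check(corrupted, honest_digest))     # False

# Deliberate replacement: the bad mirror swaps the file AND
# recomputes the digest to match. Not detected.
hacked = b"source code with a backdoor"
forged_digest = sha256_hex(hacked)
print(passes_check(hacked, forged_digest))        # True
```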
70
71 Which is where the gpg signed tree snapshots come in. But before we can
72 understand how they help, we need to understand how gpg signing differs
73 from simple hashes.
74
75 PGP, gpg, and various other public/private-pair key signing (and
76 encryption) take advantage of a particular mathematical relationship
77 property between the public and private keys. I'm not a cryptographer
78 nor a mathematician, so I'm content to leave it at that rather handwavy
79 assertion and not get into the details, but enough people I trust say the
80 same thing about the details, and enough of our modern Internet banking
81 and the like, depends upon the same idea, that I'm relatively confident
82 in the general principle, at least.
83
84 It works like this. People keep the private key from the pair private --
85 if it gets out, they've lost the secret. But people publish the public
86 half of the key. The relationship of the keys is such that people can't
87 figure out the private key from the public key, but if you have the
88 private key, you can sign stuff with it, and people with the public key
89 can verify the signature and thus trust that it really was the person
90 with that key that signed the content. Similarly, people can use the
91 public key to encrypt something, and only the person with the private key
92 will be able to decrypt it -- having the public key doesn't help.
93
94 Actually, as I understand it signing is simply a combination of hashing
95 and encryption, such that a hash of the content to be signed is taken,
96 and then that hash is encrypted with the private key. Now anyone with
97 the public key can "decrypt" the hash and verify the content with it,
98 thereby verifying that the private key used to sign the content by
99 encrypting the hash was the one used. If some other key had been used,
100 attempting to decrypt the hash with an unmatched public key would simply
101 produce gibberish, and the supposedly "decrypted" hash wouldn't be the
102 hash produced when checking the content, thereby failing to verify that
103 the signed content actually came from the person that it was claimed to
104 have come from.
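
To make the hash-then-encrypt idea a bit less handwavy, here is a toy Python sketch using textbook RSA. Everything about it is an assumption chosen for illustration: the primes are well-known Mersenne primes and far too small for real use, there is no padding, and the hash is simply reduced modulo n. Real gpg signatures are considerably more involved, so treat this as a picture of the principle only. It needs Python 3.8 or later for the modular-inverse form of pow().

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair from two primes: (public, private)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # private exponent (Python 3.8+)
    return (n, e), (n, d)

def content_hash(content: bytes, n: int) -> int:
    """SHA-256 of the content as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n

def sign(content: bytes, private_key) -> int:
    """'Encrypt' the hash of the content with the private key."""
    n, d = private_key
    return pow(content_hash(content, n), d, n)

def verify(content: bytes, signature: int, public_key) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    n, e = public_key
    return pow(signature, e, n) == content_hash(content, n)

# Two unrelated key pairs built from Mersenne primes (toy sizes!).
alice_pub, alice_priv = make_keypair(2**61 - 1, 2**31 - 1)
other_pub, other_priv = make_keypair(2**89 - 1, 2**107 - 1)

message = b"signed content"
signature = sign(message, alice_priv)

print(verify(message, signature, alice_pub))              # True
print(verify(b"tampered content", signature, alice_pub))  # False
print(verify(message, signature, other_pub))              # False
```

The last line is the "gibberish" case: decrypting with an unmatched public key gives a number that isn't the hash of the content, so verification fails.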
105
106
107 OK, we've now established that hashes simply verify that the content
108 didn't get modified in transit, but they do NOT by themselves verify who
109 SENT that content, so indeed, a man-in-the-middle could have replaced
110 BOTH the content and the hash, and someone relying on just hashes
111 couldn't tell the difference.
112
113 And we've also established that a signature verifies that the content
114 actually came from the person who had the private key matching the public
115 key used to verify it, by mechanism of encrypting the hash of that
116 content with the private key, so only by "decrypting" it with the
117 matching public key, does the hash of the content match the one taken at
118 the other end and encrypted with the private key.
119
120 *NOW* we're equipped to see how the portage tree snapshot signing method
121 actually allows us to verify distfiles as well. Because the tree
122 includes digests that we can now verify came from our trusted source,
123 gentoo, NOW those digests can be used to verify the distfiles, because
124 the digests were part of the signed tree and nobody could tamper with
125 that signed tree including those digests without detection.
126
127 If our nefarious gentoo mirror operator tried to switch out the source
128 tarballs AND the digests, he could do so for normal rsync users, and for
129 webrsync users not doing gpg verification, without detection. But should
130 he try that with someone that's using webrsync-gpg, he has no way to sign
131 the tampered with tarball with the correct private key since he doesn't
132 have it, and those using webrsync with FEATURES=webrsync-gpg would detect
133 the tampered tarball as portage (via webrsync, via eix in your case)
134 would reject that tarball as unverified.
135
136 So the hash-digest method used to protect ordinary rsync users (and
137 webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage,
138 now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks
139 as well, not because the digests themselves are different, but because we
140 can now trust and verify that they came from a legitimate source.
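
Putting the two pieces together, the chain looks roughly like the Python sketch below. It is a toy model under stated assumptions, not how webrsync or portage are implemented: the "snapshot" is just a small JSON blob holding one digest, the signature is textbook RSA with toy-sized Mersenne primes and no padding, and all the function names are hypothetical. It needs Python 3.8 or later.

```python
import hashlib
import json

# Toy RSA signing key (far too small for real use).
P, Q, E = 2**61 - 1, 2**31 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept by "gentoo"

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def hash_int(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes) -> int:
    """Only the holder of D can do this."""
    return pow(hash_int(data), D, N)

def signature_ok(data: bytes, signature: int) -> bool:
    """Anyone with the public values N and E can do this."""
    return pow(signature, E, N) == hash_int(data)

def make_snapshot(distfile: bytes) -> bytes:
    """Stand-in for the tree snapshot: it carries the distfile's digest."""
    return json.dumps({"digest": sha256_hex(distfile)}).encode()

def accept(snapshot: bytes, signature: int, distfile: bytes) -> bool:
    # Step 1: verify the snapshot signature (the webrsync-gpg part).
    if not signature_ok(snapshot, signature):
        return False
    # Step 2: the ordinary digest check, now against trusted digests.
    return json.loads(snapshot)["digest"] == sha256_hex(distfile)

genuine = b"genuine source code"
snapshot = make_snapshot(genuine)
signature = sign(snapshot)
print(accept(snapshot, signature, genuine))          # True

# The bad mirror swaps the distfile and forges a matching digest,
# but cannot produce a valid signature for the forged snapshot.
hacked = b"source code with a backdoor"
forged_snapshot = make_snapshot(hacked)
print(accept(forged_snapshot, signature, hacked))    # False
# Swapping only the distfile fails the ordinary digest check.
print(accept(snapshot, signature, hacked))           # False
```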
141
142 Tho it should be noted that "legitimate source" is defined as anyone
143 having access to that private signing key. So should someone break in
144 to the snapshotting server and steal the private key doing the signing,
145 they now become a "legitimate source" as far as webrsync-gpg is concerned.
146
147
148 So where does that leave us in practice?
149
150 Basically here:
151
152 You're now verifying that the snapshot tarballs are coming from a source
153 with the private signing key, and we're assuming that gentoo security
154 hasn't been broken and thus that only gentoo's snapshot signing servers
155 (and their admins, of course) have access to the private signing key,
156 which in turn means we're assuming the machine with that signing key must
157 be gentoo, and thus that the snapshotted tarballs are legit.
158
159 But it's actually webrsync in combination with FEATURES=webrsync-gpg
160 that's doing that verification.
161
162 Once the verified tarball is actually unpacked on our system, portage
163 operates just as it normally does, simply verifying the usual hash digests
164 against the ebuilds and the distfiles /exactly/ as it normally would.
165
166 Repeating in different words to hopefully ensure it's understood:
167
168 It's *ONLY* the fact that we have actually gpg-verified that snapshot
169 tarball and thus the digests within it, that gives us any more security
170 than an ordinary rsync user. After that's downloaded, verified and
171 unpacked, portage operates exactly as it normally does.
172
173
174 Meanwhile, part of that normal operation includes FEATURES=strict, if
175 you've set it, which causes portage to refuse to merge the package if
176 those digests don't match. But that part of things is just normal
177 portage operation. Rsync users get it too -- they just don't have the
178 additional assurance that those digest files actually came from gentoo
179 (or at least from someone with gentoo's private signing key), that
180 webrsync with FEATURES=webrsync-gpg provides.
181
182
183 (Meanwhile, one further personal note FWIW. You may think that all these
184 long explanations take quite some time to type up, and you'd be correct.
185 But don't make the mistake of thinking that I don't get a benefit from it
186 myself. My dad was a teacher, and one of the things he used to say that
187 I've found to be truer than true, is that the best way to /learn/
188 something is to try to teach it to someone. That's exactly what I'm
189 doing, and all the unexpected questions and corner cases that I'd have
190 never thought about on my own, that people bring up and force me to think
191 about in order to answer them, help me improve my own previously more
192 handwavy and fuzzy "general concept" understanding as well. I'm much
193 more confident in my own understanding of the general public/private key
194 concepts, how gpg actually uses them and how its web-of-trust works, and
195 more specifically, how portage can use that via webrsync-gpg to actually
196 improve the gentooer's own security, than I ever was before.
197
198 And it has been quite some time since I worked with gpg and saw it in
199 interactive mode like that, too, and it turns out that in the intervening
200 years, I've actually understood quite a bit more about how it all works
201 than I did back then, thus my ability to dig that all up and present it
202 here, while back a few years ago, I was just as clueless about how all
203 that web-of-trust stuff worked, and made exactly the same mistake of
204 "ultimately trusting" the distro's package-signing key, for exactly the
205 same reasons. Turns out I absorbed rather more from all those security
206 and encryption articles I've read over the years than I realized, but it
207 actually took my replies right here in this thread to lay it all out
208 logically so I too realized how much more I understand what's going on
209 now, than I did back then.)
210
211 So... Thanks for the thread! =:^)
212
213 --
214 Duncan - List replies preferred. No HTML msgs.
215 "Every nonfree program has a lord, a master --
216 and if you use the program, he is your master." Richard Stallman
