Gentoo Archives: gentoo-amd64

From: Mark Knecht <markknecht@×××××.com>
To: Gentoo AMD64 <gentoo-amd64@l.g.o>
Subject: Re: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Fri, 08 Aug 2014 18:34:58
Message-Id: CAK2H+efbnkKs=s=1753xFw52zaqrBZk9V1TbZVcq_uMs-Ze-Tw@mail.gmail.com
In Reply to: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) by Duncan <1i5t5.duncan@cox.net>
1 Hi Duncan,
2 Responding to one thing here, the rest in-line:
3
4 [QUOTE]
5 (Meanwhile, one further personal note FWIW. You may think that all these
6 long explanations take quite some time to type up, and you'd be correct.
7 But don't make the mistake of thinking that I don't get a benefit from it
8 myself. My dad was a teacher, and one of the things he used to say that
9 I've found to be truer than true, is that the best way to /learn/
10 something is to try to teach it to someone.
11 [/QUOTE]
12
13 I couldn't agree more and appreciate your efforts. And even if I might
14 already understand some of what you document I'm sure there are
15 others that come later looking for answers who get lots from these
16 conversations, solve problems and we never hear about it. Anyway,
17 a big thanks.
18
19 On Thu, Aug 7, 2014 at 2:18 PM, Duncan <1i5t5.duncan@×××.net> wrote:
20 > Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:
21 >
22 >> So that's all looking pretty good, as a first step. If it's a matter of
23 >> 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
24 >> However that's just (I think) the portage tree and not signed source
25 >> code, correct?
26 >
27 > [I just posted a reply to the gpg specific stuff.]
28 >
29 > Technically correct, but not really so in implementation. See below...
30 >
31 >> Now, is the idea that I have a validated portage snapshot at this point
32 >> and still have to actually get the code using the regular emerge which
33 >> will do the checking because I have:
34 >>
35 >> FEATURES="buildpkg strict webrsync-gpg"
36 >
37 > No... It doesn't work that way.
38 >
39 >> I don't see any evidence that emerge checked what it downloaded, but
40 >> maybe those checks are only done when I really build the code?
41 >
42 > Here's what happens.
43 >
44 > FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the
45 > snapshot-tarball that webrsync downloads. Without that, it'd still
46 > download it the same, but it wouldn't verify the signature. This allows
47 > people who use the webrsync only because they're behind a firewall that
48 > wouldn't allow normal rsync, but who don't care about the gpg signing
49 > security stuff, to use the same tool as the people who actually use
50 > webrsync for the security aspect, regardless of whether they could use
51 > normal rsync or not.
52 >
53
54 And to clarify, I believe this step is responsible for putting into place on
55 a Gentoo machine much of what's in /usr/portage, most specifically in the
56 app categorization directories. In the old days the Gentoo Install Guide
57 used to have us download the portage snapshots from a location such as
58
59 http://distfiles.gentoo.org/snapshots/
60
61 That's now been replaced by a call to emerge-webrsync so newbies
62 might not have that view. Additionally, even if we're downloading the
63 snapshot tarball it appears, at least on my system, it's deleted after
64 it's expanded. Or at least it's not showing up in a locate command.
65
66
67 > So that gets you a signed and verified tree. Correct so far.
68 >
69 > But as part of that tree, there are digest files for each package that
70 > verify the integrity of the ebuild as well as of the sources tarballs
71 > (distfiles).
72 >
73
74 Yep.
75
76 > Now it's important to grasp the difference between gpg signing and simple
77 > hash digests, here.
78 >
79 > Anybody with the appropriate tools (md5sum, for example, does md5 hashes,
80 > but there's sha and other hashes as well, and the portage tree uses
81 > several hash algorithms in case one is broken) can take a hash of a file,
82 > and provided it's exactly the same bit-for-bit file they should get
83 > exactly the same hash.
84 >
85 > In fact, that's how portage checks the hashes of both the ebuild files
86 > and the distfiles it uses, regardless of this webrsync-gpg stuff. The
87 > tree ships the hash values that the gentoo package maintainer took of the
88 > files in its digest files, and portage takes its own hash of the files
89 > and compares it to the hash value stored in the digest files. If they
90 > match, portage is happy. If they don't, depending on how strict you have
91 > portage set to be (FEATURES=strict), it will either warn about (without
92 > strict) or entirely refuse to merge that package (with strict), until
93 > either the digest is updated, or a new file matching the old digest is
94 > downloaded.
95 >
96 > So far so good, but while the hashes protect against accidental damage as
97 > the file was being downloaded, because anyone can take a hash of the
98 > file, without something stronger, if say one of the mirror operators was
99 > a bad guy, they could replace the files with hacked files and as long as
100 > they replaced the digest files with the new ones they created for the
101 > hacked files at the same time, portage wouldn't know.
102 >
103 > So while hashes/digests alone protect quite well from accidental damage,
104 > they can't protect, by themselves, from deliberate replacement of those
105 > files with malware infested copies.
106 >
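The digest check described above, and its limit, can be sketched in a few lines of Python. This is only an illustration, not how portage is actually implemented: the function names, the `strict` flag, and the sample file contents are all hypothetical, and the "manifest entry" is just a dictionary of hex digests.

```python
import hashlib

def file_digests(data: bytes, algorithms=("sha256", "sha512")) -> dict:
    """Return {algorithm: hex digest} for the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def check_against_manifest(data: bytes, recorded: dict, strict: bool = True) -> bool:
    """Compare freshly computed digests with the recorded ones.

    With strict=True a mismatch raises; otherwise it warns and returns False.
    """
    actual = file_digests(data, tuple(recorded))
    mismatched = [name for name in recorded if actual[name] != recorded[name]]
    if not mismatched:
        return True
    message = "digest mismatch: " + ", ".join(mismatched)
    if strict:
        raise ValueError(message)
    print("warning:", message)
    return False

# Hypothetical distfile and the digests a maintainer would have recorded.
distfile = b"pretend this is foo-1.0.tar.gz"
manifest_entry = file_digests(distfile)

# An intact download matches the recorded digests.
intact_ok = check_against_manifest(distfile, manifest_entry)

# A mirror that replaces BOTH the file and its digests also passes,
# because anyone can compute a hash.
hacked = b"pretend this is a trojaned foo-1.0.tar.gz"
forged_entry = file_digests(hacked)
forged_ok = check_against_manifest(hacked, forged_entry)
```

Both `intact_ok` and `forged_ok` come out true, which is the point: the digest comparison alone cannot tell a maintainer's digests from an attacker's.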
107 > Which is where the gpg signed tree snapshots come in. But before we can
108 > understand how they help, we need to understand how gpg signing differs
109 > from simple hashes.
110 >
111
112 Some years ago (1997/98) I purchased one of Bruce Schneier's books - looking
113 at Amazon I recollect "Applied Cryptography: Protocols, Algorithms, and
114 Source Code in C" - so I've been through a lot of this in the area of
115 semiconductor design. (5C Encryption model for 'protecting' movie
116 content. What a joke...)
117
118 > PGP, gpg, and various other public/private-pair key signing (and
119 > encryption) take advantage of a particular mathematical relationship
120 > property between the public and private keys. I'm not a cryptographer
121 > nor a mathematician, so I'm content to leave it at that rather handwavy
122 > assertion and not get into the details, but enough people I trust say the
123 > same thing about the details, and enough of our modern Internet banking
124 > and the like, depends upon the same idea, that I'm relatively confident
125 > in the general principle, at least.
126 >
127 > It works like this. People keep the private key from the pair private --
128 > if it gets out, they've lost the secret. But people publish the public
129 > half of the key. The relationship of the keys is such that people can't
130 > figure out the private key from the public key, but if you have the
131 > private key, you can sign stuff with it, and people with the public key
132 > can verify the signature and thus trust that it really was the person
133 > with that key that signed the content. Similarly, people can use the
134 > public key to encrypt something, and only the person with the private key
135 > will be able to decrypt it -- having the public key doesn't help.
136 >
137 > Actually, as I understand it signing is simply a combination of hashing
138 > and encryption, such that a hash of the content to be signed is taken,
139 > and then that hash is encrypted with the private key. Now anyone with
140 > the public key can "decrypt" the hash and verify the content with it,
141 > thereby verifying that the private key used to sign the content by
142 > encrypting the hash was the one used. If some other key had been used,
143 > attempting to decrypt the hash with an unmatched public key would simply
144 > produce gibberish, and the supposedly "decrypted" hash wouldn't be the
145 > hash produced when checking the content, thereby failing to verify that
146 > the signed content actually came from the person that it was claimed to
147 > have come from.
148 >
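As a rough illustration of the "hash, then encrypt the hash with the private key" description above, here is a toy textbook-RSA sketch in Python. It is emphatically not what gpg does internally: the key is built from two small well-known primes, there is no padding, and the hash is reduced to fit the tiny modulus. All names and the attacker's exponent are made up for the example.

```python
import hashlib

# Toy key from two small known primes. Real keys are vastly larger and real
# signature schemes add padding; this is for illustration only.
P, Q = 2**61 - 1, 2**31 - 1
N = P * Q
PUBLIC_EXPONENT = 65537
PRIVATE_EXPONENT = pow(PUBLIC_EXPONENT, -1, (P - 1) * (Q - 1))  # Python 3.8+

def content_hash(content: bytes) -> int:
    """Hash the content and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N

def sign(content: bytes, private_exponent: int = PRIVATE_EXPONENT) -> int:
    """'Encrypt' the hash of the content with the private key."""
    return pow(content_hash(content), private_exponent, N)

def verify(content: bytes, signature: int) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    return pow(signature, PUBLIC_EXPONENT, N) == content_hash(content)

snapshot = b"pretend this is a portage snapshot tarball"
signature = sign(snapshot)

genuine_ok = verify(snapshot, signature)              # matching key and content
tampered_ok = verify(snapshot + b" + malware", signature)  # content changed

# Someone without the private key can only guess an exponent (hypothetical value).
wrong_key_signature = sign(snapshot, private_exponent=0x1234567)
wrong_key_ok = verify(snapshot, wrong_key_signature)
```

Only `genuine_ok` should be true. Changing the content changes the hash, and signing with any other key yields a value that "decrypts" to something other than the hash.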
149
150 If I recall correctly the flow looks like:
151
152 File -> (Sender Private/Receiver Public) -> Encrypted File
153
154 Encrypted File -> (Sender Public/Receiver Private) -> File
155
156 and this should be safe, albeit Rich's comment early on was
157
158 "3. Have an army of the best cryptographers in the world, etc."
159
160 coupled with lots of compute power leaves me with little doubt it's
161 not a 100% thing...
162
163 >
164 > OK, we've now established that hashes simply verify that the content
165 > didn't get modified in transit, but they do NOT by themselves verify who
166 > SENT that content, so indeed, a man-in-the-middle could have replaced
167 > BOTH the content and the hash, and someone relying on just hashes
168 > couldn't tell the difference.
169 >
170 > And we've also established that a signature verifies that the content
171 > actually came from the person who had the private key matching the public
172 > key used to verify it, by mechanism of encrypting the hash of that
173 > content with the private key, so only by "decrypting" it with the
174 > matching public key, does the hash of the content match the one taken at
175 > the other end and encrypted with the private key.
176 >
177 > *NOW* we're equipped to see how the portage tree snapshot signing method
178 > actually allows us to verify distfiles as well. Because the tree
179 > includes digests that we can now verify came from our trusted source,
180 > gentoo, NOW those digests can be used to verify the distfiles, because
181 > the digests were part of the signed tree and nobody could tamper with
182 > that signed tree including those digests without detection.
183 >
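The chain of trust in the paragraph above (a signed tree carries the digests, and the digests cover the distfiles) can be sketched as follows. This is a simplified model under stated assumptions: the "snapshot" is a JSON mapping of file name to SHA-256 digest rather than a real tarball, the signature is toy textbook RSA without padding, and every name is hypothetical.

```python
import hashlib
import json

# Toy RSA key, illustration only (real keys are far larger and use padding).
P, Q, E = 2**61 - 1, 2**31 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, Python 3.8+

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def as_number(data: bytes) -> int:
    """Hash reduced to fit under the toy modulus."""
    return int(digest(data), 16) % N

def make_snapshot(distfiles: dict) -> bytes:
    """Serialize a Manifest-like mapping of file name -> digest."""
    manifest = {name: digest(body) for name, body in distfiles.items()}
    return json.dumps(manifest, sort_keys=True).encode()

def trusted_manifest(snapshot: bytes, signature: int) -> dict:
    """Return the manifest only if the snapshot signature verifies."""
    if pow(signature, E, N) != as_number(snapshot):
        raise ValueError("snapshot signature does not verify")
    return json.loads(snapshot)

# Signing side (the only party holding D).
distfiles = {"foo-1.0.tar.gz": b"genuine sources"}
snapshot = make_snapshot(distfiles)
signature = pow(as_number(snapshot), D, N)

# User side: verify the snapshot once, then rely on the digests inside it.
manifest = trusted_manifest(snapshot, signature)

# A mirror that swaps the sources must also rewrite the digests, which
# changes the snapshot and so invalidates the existing signature.
forged_snapshot = make_snapshot({"foo-1.0.tar.gz": b"hacked sources"})
```

Calling `trusted_manifest(forged_snapshot, signature)` raises, while the hacked sources checked against the genuine `manifest` fail the ordinary digest comparison. Either way the swap is noticed.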
184
185 Correct. Hashes for all that stuff are in the Manifest files and I don't create
186 my own Manifests ever.
187
188 > If our nefarious gentoo mirror operator tried to switch out the source
189 > tarballs AND the digests, he could do so for normal rsync users, and for
190 > webrsync users not doing gpg verification, without detection. But should
191 > he try that with someone that's using webrsync-gpg, he has no way to sign
192 > the tampered with tarball with the correct private key since he doesn't
193 > have it, and those using webrsync with FEATURES=webrsync-gpg would detect
194 > the tampered tarball as portage (via webrsync, via eix in your case)
195 > would reject that tarball as unverified.
196 >
197
198 Well, maybe yes, maybe no as per the comment above, but agreed in general.
199
200 > So the hash-digest method used to protect ordinary rsync users (and
201 > webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage,
202 > now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks
203 > as well, not because the digests themselves are different, but because we
204 > can now trust and verify that they came from a legitimate source.
205 >
206 > Tho it should be noted that "legitimate source" is defined as anyone
207 > having access to that private signing key. So should someone break in
208 > to the snapshotting server and steal that private key doing the signing,
209 > they now become a "legitimate source" as far as webrsync-gpg is concerned.
210 >
211
212 Yep.
213
214 >
215 > So where does that leave us in practice?
216 >
217 > Basically here:
218 >
219 > You're now verifying that the snapshot tarballs are coming from a source
220 > with the private signing key, and we're assuming that gentoo security
221 > hasn't been broken and thus that only gentoo's snapshot signing servers
222 > (and their admins, of course) have access to the private signing key,
223 > which in turn means we're assuming the machine with that signing key must
224 > be gentoo, and thus that the snapshotted tarballs are legit.
225 >
226 > But it's actually webrsync in combination with FEATURES=webrsync-gpg
227 > that's doing that verification.
228 >
229 > Once the verified tarball is actually unpacked on our system, portage
230 > operates just as it normally does, simply verifying the usual hash digests
231 > against the ebuilds and the distfiles /exactly/ as it normally would.
232 >
233
234 Understood.
235
236 > Repeating in different words to hopefully ensure it's understood:
237 >
238 > It's *ONLY* the fact that we have actually gpg-verified that snapshot
239 > tarball and thus the digests within it, that gives us any more security
240 > than an ordinary rsync user. After that's downloaded, verified and
241 > unpacked, portage operates exactly as it normally does.
242 >
243 >
244 > Meanwhile, part of that normal operation includes FEATURES=strict, if
245 > you've set it, which causes portage to refuse to merge the package if
246 > those digests don't match. But that part of things is just normal
247 > portage operation. Rsync users get it too -- they just don't have the
248 > additional assurance that those digest files actually came from gentoo
249 > (or at least from someone with gentoo's private signing key), that
250 > webrsync with FEATURES=webrsync-gpg provides.
251 >
252
253 Yep, I set that first before I got the gpg stuff working. I'll leave
254 it in place for now.
255
256
257 >
258 > (Meanwhile, one further personal note FWIW. You may think that all these
259 > long explanations take quite some time to type up, and you'd be correct.
260 > But don't make the mistake of thinking that I don't get a benefit from it
261 > myself. My dad was a teacher, and one of the things he used to say that
262 > I've found to be truer than true, is that the best way to /learn/
263 > something is to try to teach it to someone. That's exactly what I'm
264 > doing, and all the unexpected questions and corner cases that I'd have
265 > never thought about on my own, that people bring up and force me to think
266 > about in order to answer them, help me improve my own previously more
267 > handwavy and fuzzy "general concept" understanding as well. I'm much
268 > more confident in my own understanding of the general public/private key
269 > concepts, how gpg actually uses them and how its web-of-trust works, and
270 > more specifically, how portage can use that via webrsync-gpg to actually
271 > improve the gentooer's own security, than I ever was before.
272 >
273 > And it has been quite some time since I worked with gpg and saw it in
274 > interactive mode like that, too, and it turns out that in the intervening
275 > years, I've actually understood quite a bit more about how it all works
276 > than I did back then, thus my ability to dig that all up and present it
277 > here, while back a few years ago, I was just as clueless about how all
278 > that web-of-trust stuff worked, and made exactly the same mistake of
279 > "ultimately trusting" the distro's package-signing key, for exactly the
280 > same reasons. Turns out I absorbed rather more from all those security
281 > and encryption articles I've read over the years than I realized, but it
282 > actually took my replies right here in this thread to lay it all out
283 > logically so I too realized how much more I understand what's going on
284 > now, than I did back then.)
285 >
286 > So... Thanks for the thread! =:^)
287 >
288 > --
289 > Duncan - List replies preferred. No HTML msgs.
290 > "Every nonfree program has a lord, a master --
291 > and if you use the program, he is your master." Richard Stallman
292 >
293 >
