Hi Duncan,
Responding to one thing here, the rest in-line:

[QUOTE]
(Meanwhile, one further personal note FWIW. You may think that all these
long explanations take quite some time to type up, and you'd be correct.
But don't make the mistake of thinking that I don't get a benefit from it
myself. My dad was a teacher, and one of the things he used to say that
I've found to be truer than true, is that the best way to /learn/
something is to try to teach it to someone.
[/QUOTE]

I couldn't agree more and appreciate your efforts. Even if I might
already understand some of what you document, I'm sure there are
others who come later looking for answers, get a lot from these
conversations, solve their problems, and we never hear about it. Anyway,
a big thanks.

On Thu, Aug 7, 2014 at 2:18 PM, Duncan <1i5t5.duncan@×××.net> wrote:
> Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:
>
>> So that's all looking pretty good, as a first step. If it's a matter of
>> 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
>> However that's just (I think) the portage tree and not signed source
>> code, correct?
>
> [I just posted a reply to the gpg specific stuff.]
>
> Technically correct, but not really so in implementation. See below...
>
>> Now, is the idea that I have a validated portage snapshot at this point
>> and still have to actually get the code using the regular emerge which
>> will do the checking because I have:
>>
>> FEATURES="buildpkg strict webrsync-gpg"
>
> No... It doesn't work that way.
>
>> I don't see any evidence that emerge checked what it downloaded, but
>> maybe those checks are only done when I really build the code?
>
> Here's what happens.
>
> FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the
> snapshot-tarball that webrsync downloads. Without that, it'd still
> download it the same, but it wouldn't verify the signature. This allows
> people who use the webrsync only because they're behind a firewall that
> wouldn't allow normal rsync, but who don't care about the gpg signing
> security stuff, to use the same tool as the people who actually use
> webrsync for the security aspect, regardless of whether they could use
> normal rsync or not.
>

And to clarify, I believe this step is responsible for putting into place on
a Gentoo machine much of what's in /usr/portage, most specifically in the
app categorization directories. In the old days the Gentoo Install Guide
used to have us download the portage snapshots from a location such as

http://distfiles.gentoo.org/snapshots/

That's now been replaced by a call to emerge-webrsync, so newbies
might not have that view. Additionally, even though we're downloading the
snapshot tarball, it appears, at least on my system, that it's deleted after
it's expanded. Or at least it's not showing up in a locate command.


> So that gets you a signed and verified tree. Correct so far.
>
> But as part of that tree, there are digest files for each package that
> verify the integrity of the ebuild as well as of the sources tarballs
> (distfiles).
>

Yep.

> Now it's important to grasp the difference between gpg signing and simple
> hash digests, here.
>
> Anybody with the appropriate tools (md5sum, for example, does md5 hashes,
> but there's sha and other hashes as well, and the portage tree uses
> several hash algorithms in case one is broken) can take a hash of a file,
> and provided it's exactly the same bit-for-bit file they should get
> exactly the same hash.
>
> In fact, that's how portage checks the hashes of both the ebuild files
> and the distfiles it uses, regardless of this webrsync-gpg stuff. The
> tree ships the hash values that the gentoo package maintainer took of the
> files in its digest files, and portage takes its own hash of the files
> and compares it to the hash value stored in the digest files. If they
> match, portage is happy. If they don't, depending on how strict you have
> portage set to be (FEATURES=strict), it will either warn about (without
> strict) or entirely refuse to merge that package (with strict), until
> either the digest is updated, or a new file matching the old digest is
> downloaded.
>
> So far so good, but while the hashes protect against accidental damage as
> the file was being downloaded, because anyone can take a hash of the
> file, without something stronger, if say one of the mirror operators was
> a bad guy, they could replace the files with hacked files and as long as
> they replaced the digest files with the new ones they created for the
> hacked files at the same time, portage wouldn't know.
>
> So while hashes/digests alone protect quite well from accidental damage,
> they can't protect, by themselves, from deliberate replacement of those
> files with malware infested copies.
>
> Which is where the gpg signed tree snapshots come in. But before we can
> understand how they help, we need to understand how gpg signing differs
> from simple hashes.
>

Some years ago (1997/98) I purchased one of Bruce Schneier's books - looking
at Amazon I recollect "Applied Cryptography: Protocols, Algorithms, and
Source Code in C" - so I've been through a lot of this in the area of
semiconductor design. (5C Encryption model for 'protecting' movie content.
What a joke...)

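For anyone finding this thread later, the digest check quoted above can be sketched in a few lines of Python. This is only an illustration, not portage's actual code: the helper names (digests, check) and the sample byte strings are made up, and SHA256/SHA512 are just two example algorithms. It shows both halves of the argument: a digest catches accidental damage, but a mirror operator who swaps the file AND the digest passes the check.

```python
import hashlib

def digests(data: bytes, algorithms=("sha256", "sha512")) -> dict:
    """Hash the same bytes with several algorithms (hypothetical helper)."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def check(data: bytes, recorded: dict) -> bool:
    """Re-hash the data and compare against every recorded digest."""
    actual = digests(data, tuple(recorded))
    return all(actual[name] == value for name, value in recorded.items())

# The maintainer hashes the genuine file; the tree ships these values.
genuine = b"genuine source tarball contents"
tree_digests = digests(genuine)

# Accidental damage in transit changes the bytes, so the check fails.
damaged = genuine[:-1] + b"?"
print(check(genuine, tree_digests))   # True
print(check(damaged, tree_digests))   # False

# A malicious mirror replaces the file AND the digests together.
hacked = b"malware infested tarball contents"
forged_digests = digests(hacked)
print(check(hacked, forged_digests))  # True: hashes alone can't tell
```

The last line is the whole problem: nothing in a bare hash says who computed it.
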
> PGP, gpg, and various other public/private-pair key signing (and
> encryption) take advantage of a particular mathematical relationship
> property between the public and private keys. I'm not a cryptographer
> nor a mathematician, so I'm content to leave it at that rather handwavy
> assertion and not get into the details, but enough people I trust say the
> same thing about the details, and enough of our modern Internet banking
> and the like, depends upon the same idea, that I'm relatively confident
> in the general principle, at least.
>
> It works like this. People keep the private key from the pair private --
> if it gets out, they've lost the secret. But people publish the public
> half of the key. The relationship of the keys is such that people can't
> figure out the private key from the public key, but if you have the
> private key, you can sign stuff with it, and people with the public key
> can verify the signature and thus trust that it really was the person
> with that key that signed the content. Similarly, people can use the
> public key to encrypt something, and only the person with the private key
> will be able to decrypt it -- having the public key doesn't help.
>
> Actually, as I understand it signing is simply a combination of hashing
> and encryption, such that a hash of the content to be signed is taken,
> and then that hash is encrypted with the private key. Now anyone with
> the public key can "decrypt" the hash and verify the content with it,
> thereby verifying that the private key used to sign the content by
> encrypting the hash was the one used. If some other key had been used,
> attempting to decrypt the hash with an unmatched public key would simply
> produce gibberish, and the supposedly "decrypted" hash wouldn't be the
> hash produced when checking the content, thereby failing to verify that
> the signed content actually came from the person that it was claimed to
> have come from.
>

If I recall correctly the flow looks like:

File -> (Sender Private/Receiver Public) -> Encrypted File

Encrypted File -> (Sender Public/Receiver Private) -> File

and this should be safe, although Rich's comment early on,

"3. Have an army of the best cryptographers in the world, etc."

coupled with lots of compute power, leaves me with little doubt that it's
not a 100% thing...

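To make the hash-then-sign idea quoted above concrete, here is a toy sketch in Python. Big caveats: it uses textbook RSA with no padding and small, well-known Mersenne primes, which is NOT how gpg actually builds signatures and is not remotely secure. The function names (make_keypair, sign, verify) and the sample data are made up for the example, and pow(e, -1, phi) needs Python 3.8 or later. It only shows the shape of the idea: hash the content, transform the hash with the private key, and let anyone with the public key undo that and compare.

```python
import hashlib

def make_keypair(p: int, q: int, e: int = 65537):
    """Build a toy RSA key pair from two primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # private exponent, modular inverse of e
    return (n, e), (n, d)        # (public key, private key)

def content_hash(content: bytes, n: int) -> int:
    """SHA-256 of the content, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % n

def sign(content: bytes, private_key) -> int:
    """'Encrypt' the hash of the content with the private key."""
    n, d = private_key
    return pow(content_hash(content, n), d, n)

def verify(content: bytes, signature: int, public_key) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    n, e = public_key
    return pow(signature, e, n) == content_hash(content, n)

# Toy keys built from Mersenne primes; real keys are far larger.
gentoo_pub, gentoo_priv = make_keypair(2**61 - 1, 2**89 - 1)
attacker_pub, attacker_priv = make_keypair(2**107 - 1, 2**127 - 1)

snapshot = b"portage snapshot with Manifest digests inside"
signature = sign(snapshot, gentoo_priv)
print(verify(snapshot, signature, gentoo_pub))      # expected: True

# Tampered content no longer matches the genuine signature.
tampered = b"portage snapshot with HACKED digests inside"
print(verify(tampered, signature, gentoo_pub))      # expected: False

# Re-signing with some other private key doesn't help the attacker.
forged = sign(tampered, attacker_priv)
print(verify(tampered, forged, gentoo_pub))         # expected: False
```

The third case is the mirror operator from the quoted text: he can hash and sign whatever he likes, but not with the key whose public half we already hold.
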
>
> OK, we've now established that hashes simply verify that the content
> didn't get modified in transit, but they do NOT by themselves verify who
> SENT that content, so indeed, a man-in-the-middle could have replaced
> BOTH the content and the hash, and someone relying on just hashes
> couldn't tell the difference.
>
> And we've also established that a signature verifies that the content
> actually came from the person who had the private key matching the public
> key used to verify it, by mechanism of encrypting the hash of that
> content with the private key, so only by "decrypting" it with the
> matching public key, does the hash of the content match the one taken at
> the other end and encrypted with the private key.
>
> *NOW* we're equipped to see how the portage tree snapshot signing method
> actually allows us to verify distfiles as well. Because the tree
> includes digests that we can now verify came from our trusted source,
> gentoo, NOW those digests can be used to verify the distfiles, because
> the digests were part of the signed tree and nobody could tamper with
> that signed tree including those digests without detection.
>

Correct. Hashes for all that stuff are in the Manifest files and I don't create
my own Manifests ever.

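As a rough illustration of what one of those Manifest entries carries, here is a small Python sketch. It assumes the usual "DIST <filename> <size> <ALGO> <hash> ..." line layout; the sample file name and contents are invented, and the helper names (parse_dist_line, verify_distfile) are hypothetical, not anything portage itself exposes. Only SHA256 and SHA512 are used because Python's hashlib may not provide every algorithm a real Manifest lists.

```python
import hashlib

def parse_dist_line(line: str):
    """Split one 'DIST name size ALGO hash ...' line into its parts."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 == 0:
        raise ValueError("not a well-formed DIST line: %r" % line)
    name, size = fields[1], int(fields[2])
    pairs = fields[3:]
    # Remaining fields alternate: algorithm name, then hex digest.
    hashes = dict(zip(pairs[0::2], pairs[1::2]))
    return name, size, hashes

def verify_distfile(data: bytes, size: int, hashes: dict) -> bool:
    """Check the size first, then every listed hash."""
    if len(data) != size:
        return False
    for algo, expected in hashes.items():
        # hashlib.new raises ValueError for algorithms it doesn't know.
        if hashlib.new(algo.lower(), data).hexdigest() != expected:
            return False
    return True

# Build a sample line from made-up contents instead of hard-coding hashes.
data = b"example distfile contents"
line = "DIST example-1.0.tar.gz %d SHA256 %s SHA512 %s" % (
    len(data),
    hashlib.sha256(data).hexdigest(),
    hashlib.sha512(data).hexdigest(),
)

name, size, hashes = parse_dist_line(line)
print(name, size, sorted(hashes))
print(verify_distfile(data, size, hashes))          # True
print(verify_distfile(data + b"x", size, hashes))   # False
```

The point of the thread still applies: this check is only as trustworthy as the Manifest it reads from, which is what the signed snapshot is for.
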
> If our nefarious gentoo mirror operator tried to switch out the source
> tarballs AND the digests, he could do so for normal rsync users, and for
> webrsync users not doing gpg verification, without detection. But should
> he try that with someone that's using webrsync-gpg, he has no way to sign
> the tampered with tarball with the correct private key since he doesn't
> have it, and those using webrsync with FEATURES=webrsync-gpg would detect
> the tampered tarball as portage (via webrsync, via eix in your case)
> would reject that tarball as unverified.
>

Well, maybe yes, maybe no as per the comment above, but agreed in general.

> So the hash-digest method used to protect ordinary rsync users (and
> webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage,
> now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks
> as well, not because the digests themselves are different, but because we
> can now trust and verify that they came from a legitimate source.
>
> Tho it should be noted that "legitimate source" is defined as anyone
> having access to that private signing key. So should someone break in
> to the snapshotting server and steal that private key doing the signing,
> they now become a "legitimate source" as far as webrsync-gpg is concerned.
>

Yep.

>
> So where does that leave us in practice?
>
> Basically here:
>
> You're now verifying that the snapshot tarballs are coming from a source
> with the private signing key, and we're assuming that gentoo security
> hasn't been broken and thus that only gentoo's snapshot signing servers
> (and their admins, of course) have access to the private signing key,
> which in turn means we're assuming the machine with that signing key must
> be gentoo, and thus that the snapshotted tarballs are legit.
>
> But it's actually webrsync in combination with FEATURES=webrsync-gpg
> that's doing that verification.
>
> Once the verified tarball is actually unpacked on our system, portage
> operates just as it normally does, simply verifying the usual hash digests
> against the ebuilds and the distfiles /exactly/ as it normally would.
>

Understood.

> Repeating in different words to hopefully ensure it's understood:
>
> It's *ONLY* the fact that we have actually gpg-verified that snapshot
> tarball and thus the digests within it, that gives us any more security
> than an ordinary rsync user. After that's downloaded, verified and
> unpacked, portage operates exactly as it normally does.
>
>
> Meanwhile, part of that normal operation includes FEATURES=strict, if
> you've set it, which causes portage to refuse to merge the package if
> those digests don't match. But that part of things is just normal
> portage operation. Rsync users get it too -- they just don't have the
> additional assurance that those digest files actually came from gentoo
> (or at least from someone with gentoo's private signing key), that
> webrsync with FEATURES=webrsync-gpg provides.
>

Yep, I set that first before I got the gpg stuff working. I'll leave
it in place for now.

>
> (Meanwhile, one further personal note FWIW. You may think that all these
> long explanations take quite some time to type up, and you'd be correct.
> But don't make the mistake of thinking that I don't get a benefit from it
> myself. My dad was a teacher, and one of the things he used to say that
> I've found to be truer than true, is that the best way to /learn/
> something is to try to teach it to someone. That's exactly what I'm
> doing, and all the unexpected questions and corner cases that I'd have
> never thought about on my own, that people bring up and force me to think
> about in order to answer them, help me improve my own previously more
> handwavy and fuzzy "general concept" understanding as well. I'm much
> more confident in my own understanding of the general public/private key
> concepts, how gpg actually uses them and how its web-of-trust works, and
> more specifically, how portage can use that via webrsync-gpg to actually
> improve the gentooer's own security, than I ever was before.
>
> And it has been quite some time since I worked with gpg and saw it in
> interactive mode like that, too, and it turns out that in the intervening
> years, I've actually understood quite a bit more about how it all works
> than I did back then, thus my ability to dig that all up and present it
> here, while back a few years ago, I was just as clueless about how all
> that web-of-trust stuff worked, and made exactly the same mistake of
> "ultimately trusting" the distro's package-signing key, for exactly the
> same reasons. Turns out I absorbed rather more from all those security
> and encryption articles I've read over the years than I realized, but it
> actually took my replies right here in this thread to lay it all out
> logically so I too realized how much more I understand what's going on
> now, than I did back then.)
>
> So... Thanks for the thread! =:^)
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
>
>