Mark Knecht posted on Thu, 07 Aug 2014 11:16:23 -0700 as excerpted:

> So that's all looking pretty good, as a first step. If it's a matter of
> 3 1/2 minutes instead of 1-2 minutes then I can live with that part.
> However that's just (I think) the portage tree and not signed source
> code, correct?

[I just posted a reply to the gpg specific stuff.]

Technically correct, but not really so in implementation. See below...
 
> Now, is the idea that I have a validated portage snapshot at this point
> and still have to actually get the code using the regular emerge which
> will do the checking because I have:
>
> FEATURES="buildpkg strict webrsync-gpg"

No... It doesn't work that way.

> I don't see any evidence that emerge checked what it downloaded, but
> maybe those checks are only done when I really build the code?
 
Here's what happens.

FEATURES=webrsync-gpg simply tells the webrsync stuff to gpg-verify the
snapshot-tarball that webrsync downloads. Without that, it'd still
download it the same, but it wouldn't verify the signature. This allows
people who use webrsync only because they're behind a firewall that
wouldn't allow normal rsync, but who don't care about the gpg signing
security stuff, to use the same tool as the people who actually use
webrsync for the security aspect, regardless of whether they could use
normal rsync or not.
 
So that gets you a signed and verified tree. Correct so far.

But as part of that tree, there are digest files for each package that
verify the integrity of the ebuild as well as of the source tarballs
(distfiles).

Now it's important to grasp the difference between gpg signing and simple
hash digests, here.

Anybody with the appropriate tools (md5sum, for example, does md5 hashes,
but there are sha and other hashes as well, and the portage tree uses
several hash algorithms in case one is broken) can take a hash of a file,
and provided it's exactly the same bit-for-bit file they should get
exactly the same hash.
 
In fact, that's how portage checks the hashes of both the ebuild files
and the distfiles it uses, regardless of this webrsync-gpg stuff. The
tree ships, in its digest files, the hash values that the gentoo package
maintainer took of the files, and portage takes its own hash of the files
and compares it to the hash value stored in the digest files. If they
match, portage is happy. If they don't, depending on how strict you have
portage set to be (FEATURES=strict), it will either warn about it (without
strict) or entirely refuse to merge that package (with strict), until
either the digest is updated, or a new file matching the old digest is
downloaded.
 
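To make that compare-and-react logic concrete, here's a minimal Python
sketch. It is NOT portage's actual code: the function names, the strict
flag and the choice of two algorithms are all assumptions made up for
illustration. The idea is simply: take your own hashes, compare them to
the recorded ones, then either warn or refuse.

```python
import hashlib

# Illustrative choice of algorithms; not portage's actual list.
ALGORITHMS = ("sha256", "sha512")


def take_digests(data: bytes) -> dict:
    """Return one hex digest per algorithm for the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def check_file(data: bytes, recorded: dict, strict: bool) -> bool:
    """Compare freshly taken digests against the recorded ones.

    Returns True when every digest matches. On a mismatch, raise if
    strict is set (refuse to merge); otherwise warn and return False.
    """
    actual = take_digests(data)
    bad = [name for name in ALGORITHMS if actual[name] != recorded.get(name)]
    if not bad:
        return True
    message = "digest mismatch: " + ", ".join(bad)
    if strict:
        raise ValueError(message)
    print("warning:", message)
    return False


original = b"example distfile contents"
recorded = take_digests(original)       # what the maintainer shipped
damaged = original[:-1] + b"X"          # one byte changed in transit

print(check_file(original, recorded, strict=True))    # same bits, same hashes
print(check_file(damaged, recorded, strict=False))    # warns, returns False
```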
So far so good. But while the hashes protect against accidental damage as
the file was being downloaded, anyone can take a hash of a file. So
without something stronger, if say one of the mirror operators was a bad
guy, they could replace the files with hacked files, and as long as they
replaced the digest files with new ones they created for the hacked files
at the same time, portage wouldn't know.

So while hashes/digests alone protect quite well from accidental damage,
they can't protect, by themselves, from deliberate replacement of those
files with malware-infested copies.

Which is where the gpg signed tree snapshots come in. But before we can
understand how they help, we need to understand how gpg signing differs
from simple hashes.
 
PGP, gpg, and various other public/private key-pair signing (and
encryption) schemes take advantage of a particular mathematical
relationship between the public and private keys. I'm not a cryptographer
nor a mathematician, so I'm content to leave it at that rather handwavy
assertion and not get into the details, but enough people I trust say the
same thing about the details, and enough of our modern Internet banking
and the like depends upon the same idea, that I'm relatively confident
in the general principle, at least.

It works like this. People keep the private key from the pair private --
if it gets out, they've lost the secret. But people publish the public
half of the key. The relationship of the keys is such that people can't
figure out the private key from the public key, but if you have the
private key, you can sign stuff with it, and people with the public key
can verify the signature and thus trust that it really was the person
with that key that signed the content. Similarly, people can use the
public key to encrypt something, and only the person with the private key
will be able to decrypt it -- having the public key doesn't help.
 
Actually, as I understand it signing is simply a combination of hashing
and encryption, such that a hash of the content to be signed is taken,
and then that hash is encrypted with the private key. Now anyone with
the public key can "decrypt" the hash and verify the content with it,
thereby verifying that the matching private key was the one used to sign
the content. If some other key had been used, attempting to decrypt the
hash with an unmatched public key would simply produce gibberish, and the
supposedly "decrypted" hash wouldn't be the hash produced when checking
the content, thereby failing to verify that the signed content actually
came from the person that it was claimed to have come from.

 
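That hash-then-"encrypt" idea can be sketched with textbook RSA in a few
lines of Python. Treat this strictly as a toy under stated assumptions:
the key is built from two small well-known primes, there's no padding,
and the names (sign, verify, content_hash) are mine, not gpg's. Real gpg
signatures are considerably more involved, and not every signature
algorithm works exactly this way. It only shows the general principle:
the private exponent makes the signature, the public one checks it.

```python
import hashlib

# Toy key pair from two known Mersenne primes. Far too small for real
# use, and real signature schemes add padding that is omitted here.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def content_hash(content: bytes) -> int:
    """Hash the content and reduce it to a number smaller than N."""
    return int.from_bytes(hashlib.sha256(content).digest(), "big") % N


def sign(content: bytes, private_exponent: int) -> int:
    """'Encrypt' the hash of the content with the private key."""
    return pow(content_hash(content), private_exponent, N)


def verify(content: bytes, signature: int, public_exponent: int) -> bool:
    """'Decrypt' the signature with the public key and compare hashes."""
    return pow(signature, public_exponent, N) == content_hash(content)


message = b"portage snapshot"
signature = sign(message, D)

print(verify(message, signature, E))               # genuine content and key
print(verify(b"tampered snapshot", signature, E))  # content was changed
print(verify(message, sign(message, D + 2), E))    # exponent doesn't match E
```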
OK, we've now established that hashes simply verify that the content
didn't get modified in transit, but they do NOT by themselves verify who
SENT that content, so indeed, a man-in-the-middle could have replaced
BOTH the content and the hash, and someone relying on just hashes
couldn't tell the difference.

And we've also established that a signature verifies that the content
actually came from the person who had the private key matching the public
key used to verify it. The mechanism is encrypting the hash of that
content with the private key, so that only "decrypting" it with the
matching public key yields a hash that matches the one the verifier takes
of the content at their end.
 
*NOW* we're equipped to see how the portage tree snapshot signing method
actually allows us to verify distfiles as well. Because the tree
includes digests that we can now verify came from our trusted source,
gentoo, NOW those digests can be used to verify the distfiles, because
the digests were part of the signed tree and nobody could tamper with
that signed tree including those digests without detection.

If our nefarious gentoo mirror operator tried to switch out the source
tarballs AND the digests, he could do so for normal rsync users, and for
webrsync users not doing gpg verification, without detection. But should
he try that with someone that's using webrsync-gpg, he has no way to sign
the tampered-with tarball with the correct private key since he doesn't
have it, and those using webrsync with FEATURES=webrsync-gpg would detect
the tampered tarball as portage (via webrsync, via eix in your case)
would reject that tarball as unverified.
 
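Here's that mirror-operator scenario as a small Python sketch. Everything
in it is a stand-in I made up for illustration: the "tree" is just a JSON
blob holding one digest, foo-1.0.tar.gz is a hypothetical distfile name,
and the signature is the same sort of toy textbook RSA (small known
primes, no padding), not what gpg or portage really do. What it shows is
only the chain: the digest check alone is fooled when both the file and
the digest are swapped, while the signature over the tree is not.

```python
import hashlib
import json

# Toy signing key (two Mersenne primes, no padding): illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private half, held only by the signer


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def as_number(data: bytes) -> int:
    return int(digest(data), 16) % N


def sign(data: bytes) -> int:
    return pow(as_number(data), D, N)


def signature_ok(data: bytes, signature: int) -> bool:
    return pow(signature, E, N) == as_number(data)


def make_tree(distfile: bytes) -> bytes:
    """Stand-in 'tree': the digest of one (hypothetical) distfile."""
    return json.dumps({"foo-1.0.tar.gz": digest(distfile)}).encode()


def digest_ok(tree: bytes, distfile: bytes) -> bool:
    return json.loads(tree)["foo-1.0.tar.gz"] == digest(distfile)


# The legitimate source publishes a distfile, a tree, and a signature.
good_file = b"clean source"
good_tree = make_tree(good_file)
tree_signature = sign(good_tree)

# A hostile mirror swaps the distfile and regenerates the digest, but
# can't produce a matching signature without the private key.
bad_file = b"source with a backdoor"
bad_tree = make_tree(bad_file)

print(digest_ok(bad_tree, bad_file))            # digests alone are fooled
print(signature_ok(bad_tree, tree_signature))   # the signed check is not
print(signature_ok(good_tree, tree_signature)
      and digest_ok(good_tree, good_file))      # untampered chain verifies
```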
So the hash-digest method used to protect ordinary rsync users (and
webrsync users without webrsync-gpg turned on) from ACCIDENTAL damage,
now protects webrsync-gpg users from DELIBERATE man-in-the-middle attacks
as well, not because the digests themselves are different, but because we
can now trust and verify that they came from a legitimate source.

Tho it should be noted that "legitimate source" is defined as anyone
having access to that private signing key. So should someone break in
to the snapshotting server and steal the private key doing the signing,
they now become a "legitimate source" as far as webrsync-gpg is concerned.

 
So where does that leave us in practice?

Basically here:

You're now verifying that the snapshot tarballs are coming from a source
with the private signing key, and we're assuming that gentoo security
hasn't been broken and thus that only gentoo's snapshot signing servers
(and their admins, of course) have access to the private signing key,
which in turn means we're assuming the machine with that signing key must
be gentoo, and thus that the snapshotted tarballs are legit.

But it's actually webrsync in combination with FEATURES=webrsync-gpg
that's doing that verification.

Once the verified tarball is actually unpacked on our system, portage
operates just as it normally does, simply verifying the usual hash digests
against the ebuilds and the distfiles /exactly/ as it normally would.

Repeating in different words to hopefully ensure it's understood:

It's *ONLY* the fact that we have actually gpg-verified that snapshot
tarball and thus the digests within it, that gives us any more security
than an ordinary rsync user. After that's downloaded, verified and
unpacked, portage operates exactly as it normally does.

 
Meanwhile, part of that normal operation includes FEATURES=strict, if
you've set it, which causes portage to refuse to merge the package if
those digests don't match. But that part of things is just normal
portage operation. Rsync users get it too -- they just don't have the
additional assurance that those digest files actually came from gentoo
(or at least from someone with gentoo's private signing key), that
webrsync with FEATURES=webrsync-gpg provides.

 
(Meanwhile, one further personal note FWIW. You may think that all these
long explanations take quite some time to type up, and you'd be correct.
But don't make the mistake of thinking that I don't get a benefit from it
myself. My dad was a teacher, and one of the things he used to say that
I've found to be truer than true, is that the best way to /learn/
something is to try to teach it to someone. That's exactly what I'm
doing, and all the unexpected questions and corner cases that I'd have
never thought about on my own, that people bring up and force me to think
about in order to answer them, help me improve my own previously more
handwavy and fuzzy "general concept" understanding as well. I'm much
more confident in my own understanding of the general public/private key
concepts, how gpg actually uses them and how its web-of-trust works, and
more specifically, how portage can use that via webrsync-gpg to actually
improve the gentooer's own security, than I ever was before.
 
And it has been quite some time since I worked with gpg and saw it in
interactive mode like that, too, and it turns out that in the intervening
years, I've actually understood quite a bit more about how it all works
than I did back then, thus my ability to dig that all up and present it
here, while a few years ago, I was just as clueless about how all
that web-of-trust stuff worked, and made exactly the same mistake of
"ultimately trusting" the distro's package-signing key, for exactly the
same reasons. Turns out I absorbed rather more from all those security
and encryption articles I've read over the years than I realized, but it
actually took my replies right here in this thread to lay it all out
logically so I too realized how much more I understand what's going on
now, than I did back then.)

So... Thanks for the thread! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman