Hello all,

I flood you again with a looooong email. Apologies to all who don't
want to read so much, but this is a problem of rather high importance
that has not really been fixed, and the first discussions happened in
2003 as far as I can tell. Time to FIX IT!!!

The problem, in short, is how to handle the checksumming and signing of
Gentoo-provided files so that manipulation by external entities becomes
difficult. I expect many disagreements on the "best" strategy to
implement, but I hope that a sensible compromise will be reached so that
this can finally be implemented.

All the lazy people may stop reading here ;-)

Short overview:
The Problem
The Attacker
Defending
Policies and open problems


The Problem:
============

A malicious person could modify the files provided by Gentoo to
manipulate and take over the computers of Gentoo users. To avoid such
problems all files provided and used by Gentoo need to be identifiable
as "correct" - we need integrity checks.

An attacker should not be able to easily circumvent these checks. There
are some attacks that can't be prevented, so we also have to see the
practical limits of any scheme we define. For example, an attacker could
be a Gentoo dev with full access to all resources; stopping that person
will be more difficult (if not impossible) than stopping a random script
kiddie that hax0rs a distfile mirror with a 0-day exploit.

The files
=========

There are two groups of files at the moment that need to be secured:
- distfiles: The large archives of source code and binary blobs from
  which we install a package
- "the tree": metadata, ebuilds and patches containing all the
  information to manage the local software installation.

The default distribution methods are rsync for the tree and http/ftp for
distfiles. As there are too many users for a single server, the servers
are provided by external contributors and are not directly controlled by
Gentoo. In almost all cases a fallback to the original download location
of a file is provided.

The Attacker
============

Any security policy has to take into account how strong an attacker is.
For example, securing against your grandmother with checksums signed by
multiple independent persons is most likely overkill. A simple checksum
would most likely be enough there.
On the other end of the spectrum we have aliens that can crack any
encryption scheme in roughly two minutes; obviously we can't do anything
to really stop them.

Which attackers are then reasonable to consider?
- the script kiddie that takes over one single mirror
- a large multinational monopolist that tries to sabotage any potential
  competitors
- a mirror operator that has a bad day and manipulates files for fun
- a really strong hax0r that takes over the Gentoo CVS server
- a social hacker that takes a dev hostage and forces that dev to insert
  evil bad data

This is by far not a complete list; it should only help with figuring
out what can go wrong.

Now let's classify the attackers:
* local attacker ("your roommate") - nothing we can defend against;
  that is your responsibility.
* single compromised mirror - this can only be detected with checksums.
  If the checksums are distributed on a different path than the
  distfiles, a single compromised mirror has a very low impact, as the
  checksums won't match.
* compromised rsync mirror - now the checksums can be forged. The
  attacker will have to change the SRC_URI too so that only the
  compromised distfiles are transferred. Also changes in the ebuilds
  must be considered - a "rm -rf" in the right place in an ebuild will
  have a large impact and can't be caught with checksums (since those
  could be forged by the same attacker). We need signed checksums here.
* compromised developer - this is hard to detect, but once detected all
  files involved can be checked and corrected. The impact is very high
  and it is very difficult to avoid. (So we just assume that no dev will
  go berserk and look for low-impact methods that allow us to clean up
  if that ever happens.)

Note: a possible defense against rogue devs would be multi-signing, i.e.
having all commits checked by at least one other person. This does not
help much, as there can be collusion between devs, and the impact on all
devs is very high. It would effectively deadlock Gentoo and prevent any
useful progress.

Defense methods
===============

1) Checksums
A checksum (here: a cryptographic hash) is a one-way function that
returns a fixed-length identifier. The checksum is designed so that
changing one bit in the input totally changes the output (quite
simplified, but that's all that matters). Thus any change to a file
leads to a bad checksum, and finding a collision (two files with the
same checksum) is hard.
Some checksum algorithms have known weaknesses, so relying on a single
algorithm is not advised. For example, MD5 suffers from collision
attacks where one can generate two files with equal checksums (but no
practical way is known to find a matching second file for a given file).

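To make the idea concrete, here is a minimal Python sketch, assuming the
checksums are recorded with several algorithms and a file is accepted
only if all of them match. The algorithm selection and the names
ALGORITHMS, checksums() and verify() are made up for this illustration;
this is not how portage does it.

```python
import hashlib

# Assumption for this sketch: three algorithms that every Python
# hashlib build provides.
ALGORITHMS = ("sha1", "sha256", "sha512")


def checksums(data: bytes) -> dict:
    """Return a hex digest of `data` for every algorithm in ALGORITHMS."""
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


def verify(data: bytes, expected: dict) -> bool:
    """Accept `data` only if every recorded digest matches."""
    if not expected:
        return False  # no recorded checksum means nothing was verified
    return all(
        hashlib.new(name, data).hexdigest() == digest
        for name, digest in expected.items()
    )


original = b"pretend this is a distfile"
recorded = checksums(original)   # would be distributed via the tree

tampered = bytearray(original)
tampered[0] ^= 0x01              # flip a single bit
tampered = bytes(tampered)
```

Here verify(original, recorded) succeeds while verify(tampered, recorded)
fails, even though only one bit differs. A mirror that serves the
tampered file is caught as long as `recorded` arrives on a different
path.
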
2) Signatures
Using GPG it is possible to sign a file. The signature is similar to a
checksum, but it can only be created with a private key that is kept
secret. The public key allows anyone to verify this signature. Deducing
the private key from the public key is hard to do. (very simplified)
The public key is provided online, in a keyring (collection of keys) or
included in the downloadable media. If the public key is trusted it can
be used to verify that all files have a correct signature, effectively
saying that the files are exactly the same as the ones committed by a
dev.

Some readers may point out that this doesn't prevent a dev from
injecting "bad" files and signing them, but it does prevent tampering by
external parties.

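The private/public split can be shown with a toy example. The sketch
below is textbook RSA over a SHA256 digest with tiny, publicly known
primes. It is NOT what GPG does internally and must never be used for
real signing; all names and values (P, Q, N, E, D, sign(), verify())
are assumptions made up for this illustration.

```python
import hashlib

# Toy key material: two well-known Mersenne primes. Real keys are much
# larger and randomly generated; these values are for illustration only.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it so it fits below the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Creating a signature needs the private exponent D."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Checking a signature needs only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(data)


manifest = b"EBUILD foo-1.0.ebuild 123 SHA256 ..."   # placeholder content
signature = sign(manifest)
```

verify(manifest, signature) holds, while any modified content fails the
check. Someone who only knows N and E can run verify() but cannot
produce a new valid signature without D.
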
3) Manifest / Manifest2

This is an implementation of a checksum / signature scheme. It is
described in GLEP 44:

http://www.gentoo.org/proj/en/glep/glep-0044.html

Right now SHA1, SHA256 and RMD160 are the default checksum algorithms.

While Manifest2 should take care of all executable code in the tree, it
does not yet cover eclasses and profiles. As long as this is not taken
care of, any attacker can just override an eclass on the rsync mirror or
modify the profiles. This severely reduces the effectiveness of signing.

Any "good" solution should sign all data files in the tree, so I ask for
an extension of the Manifest2 protocol to include _every_ data file with
no exception.

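As a rough sketch of what "every data file" could mean, the following
Python walks a directory and emits one checksum line per file, eclasses
and profiles included. The line layout (type, path, size, then
algorithm/digest pairs) is only modeled loosely on GLEP 44; the "DATA"
file type, the function names and the demo files are hypothetical, and
RMD160 is left out because not every hashlib build offers it.

```python
import hashlib
import os
import tempfile

ALGORITHMS = ("sha1", "sha256")   # assumption: subset of the defaults


def manifest_line(root: str, path: str, filetype: str = "DATA") -> str:
    """Build one entry: type, relative path, size, algorithm/digest pairs."""
    with open(path, "rb") as handle:
        data = handle.read()
    relative = os.path.relpath(path, root).replace(os.sep, "/")
    fields = [filetype, relative, str(len(data))]
    for name in ALGORITHMS:
        fields += [name.upper(), hashlib.new(name, data).hexdigest()]
    return " ".join(fields)


def manifest_for_tree(root: str) -> list:
    """Return sorted entries for every regular file below `root`."""
    lines = []
    for directory, _subdirs, files in os.walk(root):
        for filename in files:
            lines.append(manifest_line(root, os.path.join(directory, filename)))
    return sorted(lines)


# Tiny stand-in tree with one eclass and one profile file.
with tempfile.TemporaryDirectory() as tree:
    demo_files = (
        ("eclass/demo.eclass", b"inherit nothing\n"),
        ("profiles/demo/make.defaults", b'USE="demo"\n'),
    )
    for relative, content in demo_files:
        target = os.path.join(tree, relative)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as handle:
            handle.write(content)
    entries = manifest_for_tree(tree)
```

Signing the resulting list once would then cover the eclass and profile
files as well, so overriding an eclass on a mirror would show up as a
checksum mismatch.
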
Key policies
============

To make signing relevant and verifiable, all devs should use the same
parameters - key length, key type, validity.
Once that is agreed upon, a key distribution strategy is needed so that
users can get the key(s) on a verifiable path.

Signing strategies
==================

Once there is an agreement on what files to sign with what kind of keys,
there remains the question of how to sign them. There are at least three
strategies:

Method "simple":
----------------

Use one central key that is kept on a secure box. Signing is done
automatically after a commit. The key distribution is simple since there
is only one key that has to be pushed.
The problem is security: a single point of failure and a single target
for compromise.

Method "complex":
-----------------

Let every dev sign the files he adds or modifies. A keyring is
maintained on Gentoo infrastructure and is distributed over multiple
paths.
Problems: Need support for multi-signing. If one file is added, the
Manifest should not be signed as a whole by the last editor alone; only
the change should be signed. At the same time it needs to be kept simple
and fast; signing each file on its own or keeping infinite history must
be avoided. Keyring management needs to be defined. Key revocation etc.
needs to be defined.

Method "hybrid":
----------------

Let every dev sign, add automatic server-side signing with a master key.
Gives you bits of both. Normal users can trust the master key.
Paranoid users can trust the dev keys.


Earlier Discussions:

http://article.gmane.org/gmane.linux.gentoo.devel/16876
2004.1 discussion

http://www.gentoo.org/proj/en/devrel/manager-meetings/logs/2004/20040531.txt
manager meeting

Some selected problems from there:

* Access Control Lists could be used so that only toolchain people can
  commit to glibc. Do we want that level of micromanagement? Does it
  offer any security benefits?

* key revocation may be impractical - what methods for handling retired
  devs and rogue devs are there?

* how to verify from an install CD?

* in tree or out of band? Storing the keys in the tree is easy, but a
  potential security problem

With this I hope to get the discussion started. There are many areas
where I am unsure what the best strategy is - every decision has obvious
disadvantages in security, code complexity or developer workload.
Any solution should try to keep the workload low while offering the
highest level of security that does not halt all progress.

I hope that the discussion can stay focused on the implementation
aspects. When you suggest something (for example multiple signatures)
please explain what it gains us (protection against single rogue devs)
and at what price (having everything signed by at least two persons).
That should make it easier to see if the workload impact of that idea is
worth it.

Take care,

Patrick

--
Stand still, and let the rest of the universe move