Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Tue, 05 Aug 2014 05:52:25
Message-Id: pan$d1f99$43cd96ae$721d7088$6ef9ba6d@cox.net
In Reply to: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) by Mark Knecht
1 Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted:
2
3 > As the line in that favorite song goes "Paranoia strikes deep"...
4
5 FWIW, while my lists sig is the proprietary-master quote from Richard
6 Stallman below, since the (anti-)patriot bill was passed in the reaction
7 to 9-11, my private email sig is a famous quote from Benjamin Franklin:
8
9 "They that can give up essential liberty to obtain a little
10 temporary safety, deserve neither liberty nor safety."
11
12 So "I'm with ya..."
13
14 > <NOTE>
15 > I am NOT trying to start ANY political discussion here. I hope no one
16 > will go too far down that path, at least here on this list. There are
17 > better places to do that.
18 >
19 > I am also NOT suggesting anything like what I ask next has happened,
20 > either here or elsewhere. It's just a question.
21 >
22 > Thanks in advance.
23 > </NOTE>
24 >
25 > I'm currently reading a new book by Glenn Greenwald called "No Place To
26 > Hide" which is about Greenwald's introduction to Edward Snowden and the
27 > release of all of the confidential NSA documents Snowden acquired. This
28 > got me wondering about Gentoo, or even just Linux in general. If the
29 > underlying issue in all of that Snowden stuff is that the NSA has the
30 > ability to intercept and hack into whatever they please, then how do I
31 > know that the source code I build on my Gentoo machines hasn't been
32 > modified by someone to provide access to my machine, networks, etc.?
33
34 These are good questions to ask, and to have some idea of the answers to,
35 as well.
36
37 Big picture, at some level, you pretty much have to accept that you
38 /don't/ know. However, there's /some/ level of security, tho honestly
39 a bit less on Gentoo than on some of the other distros (see below). It'd
40 still not be /entirely/ easy to subvert, at least widely (targeting an
41 individual downloader is another question), but it could be done.
42
43 > Essentially, what is the security model for all this source code and how
44 > do I verify that it hasn't been tampered with in some manner?
45 >
46 > 1) That the code I build is exactly as written and accepted by the OS
47 > community?
48
49 At a basic level, source and ebuild integrity, protecting both from
50 accidental corruption (where it's pretty good) and from deliberate
51 tampering (where it may or may not be considered "acceptable", but if
52 someone with the resources wanted to bad enough, they could subvert), is
53 what ebuild and sources digests are all about. The idea is that the
54 gentoo package maintainer creates hash digests of multiple types for both
55 the ebuild and the sources, such that should the copy that a gentoo user
56 gets not match the copy that a gentoo maintainer created, the package
57 manager (PM, normally portage), if configured to do so (mainly
58 FEATURES=strict, also see stricter and assume-digests, plus the webrsync-
59 gpg feature mentioned below) will error out and refuse to emerge that
60 package.
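
As a rough sketch of what such a digest check amounts to (this is not
portage's actual code): the line layout assumed here, "TYPE name size ALG
hex ALG hex ...", follows the style of a Manifest entry, and the function
names and the example file name are made up for illustration.

```python
import hashlib


def parse_manifest_line(line):
    """Split 'TYPE name size ALG1 hex1 ALG2 hex2 ...' into its parts."""
    fields = line.split()
    kind, name, size = fields[0], fields[1], int(fields[2])
    # Remaining fields alternate between algorithm name and hex digest.
    digests = dict(zip(fields[3::2], fields[4::2]))
    return kind, name, size, digests


def matches_manifest(data, size, digests):
    """True only if the size and every checkable digest match."""
    if len(data) != size:
        return False
    checked = 0
    for alg, expected in digests.items():
        if alg.lower() not in hashlib.algorithms_available:
            continue  # this Python build may lack some algorithms
        if hashlib.new(alg.lower(), data).hexdigest() != expected.lower():
            return False
        checked += 1
    # Refuse to pass a file if no digest could actually be checked.
    return checked > 0
```

Note that this only shows the file matches the digest line. It says
nothing about who wrote the digest line, which is the limit discussed
next.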
61
62 But there are serious limits to that protection. Here's a few points to
63 consider:
64
65 1) While the ebuilds and sources are digested, those digests do *NOT*
66 extend to the rest of the tree, the various files in the profile
67 directory, the various eclasses, etc. So in theory at least, someone
68 could mess with say the package.mask file in profiles, or one of the
69 eclasses, and could potentially get away with it. But see point #3 as
70 there's a (partial) workaround for the paranoid.
71
72 2) Meanwhile, since a plain hash digest (unlike a gpg signature) carries
73 no proof of origin, it primarily protects against accidental damage, not
74 deliberate compromise: digest verification shows that the file matches
75 the digest but not who created the digest in the first place. So there's
76 some risk that one or more gentoo rsync mirrors could be compromised or
77 be run by a bad actor in the first place. Should that occur, the bad
78 actor could attempt to replace BOTH the digested ebuild and/or sources
79 AND the digest files, updating the latter to reflect his compromised
80 version instead of the version originally digested by the gentoo
81 maintainer. Similarly, someone such as the NSA could at least in theory
82 do the same thing in transit, targeting a specific user's downloads while
83 leaving everyone else's downloads from the same mirror alone, so only the
84 target got the compromised version. While there's a reasonable chance
85 someone would catch a bad mirror, if a single downloader is specifically
86 targeted, unless they're specifically validating against other mirrors as
87 well and/or comparing digests (over a secure channel) against those
88 someone else downloaded, there's little chance they'd detect the
89 problem. So even digest-protected files aren't immune to compromise.
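
The cross-checking idea can be sketched like this, assuming you have
already obtained the digest of the same file from several independent
sources over channels you trust. The source names below are hypothetical,
and the function is only an illustration, not an existing tool.

```python
from collections import defaultdict


def group_by_digest(digests_by_source):
    """Map each distinct digest to the sorted list of sources reporting it."""
    groups = defaultdict(list)
    for source, digest in digests_by_source.items():
        groups[digest.lower()].append(source)
    return {digest: sorted(sources) for digest, sources in groups.items()}


def sources_agree(digests_by_source):
    """True only if at least two sources were compared and all agree."""
    groups = group_by_digest(digests_by_source)
    return len(digests_by_source) >= 2 and len(groups) == 1
```

For example, sources_agree({"mirror-a": "ab12", "mirror-b": "ab12",
"friend": "ab12"}) is True, while a single differing entry makes it
False. Any disagreement is a reason to stop and investigate; agreement
is only as strong as the independence of the sources.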
90
91 But as I said above, there's a (partial) workaround. See point #3.
92
93 3) While #1 applies to the tree in general when it is rsynced, gentoo
94 does have a somewhat higher security sync method for the paranoid and to
95 support users behind firewalls which don't pass rsync. Instead of
96 running emerge --sync, this method uses the emerge-webrsync tool, which
97 downloads the entire main gentoo tree as a gpg-signed tarball. If you
98 have FEATURES=webrsync-gpg set (see the make.conf manpage, FEATURES,
99 webrsync-gpg), portage will verify the gpg signature on this tarball.
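
For those who want to see what such a check looks like when done by hand,
here is a minimal sketch that wraps "gpg --verify" for a detached
signature. It is not what portage runs internally; the function names are
invented, and the keyring directory is an assumption (it should contain
only the gentoo snapshot signing key, which you must have obtained and
checked by some trusted route).

```python
import subprocess


def gpg_verify_command(tarball, signature, homedir=None):
    """Build the gpg command line for checking a detached signature."""
    cmd = ["gpg"]
    if homedir:
        # Use a dedicated keyring rather than the user's default one.
        cmd += ["--homedir", homedir]
    cmd += ["--verify", signature, tarball]
    return cmd


def signature_is_valid(tarball, signature, homedir=None):
    """Run gpg; a zero exit status means the signature verified."""
    result = subprocess.run(
        gpg_verify_command(tarball, signature, homedir),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A passing check only says the tarball was signed by a key in that
keyring, so the result is no better than the care taken over what went
into the keyring.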
100
101 The two caveats here are (1) that the webrsync tarball is generated only
102 once per day, while the main tree is synced every few minutes, so the
103 rsynced tree is going to be more current, and (2) that each snapshot is
104 the entire tree, not just the changes, so for those updating daily or
105 close to it, fetching the full tarball every day instead of just the
106 changes will be more network traffic. Tho I think the tarball is
107 compressed (I've never tried this method personally so can't say for
108 sure) while the rsync tree isn't, so if you're updating monthly, I'd
109 guess it's less traffic to get the tarball.
110
111 The tarball is gpg-signed which is more secure than simple hash digests,
112 but the signature covers the entire thing, not individual files, so the
113 granularity of the digests is better. Additionally, the tarball signing
114 is automated, so while a signature validation pretty well ensures that
115 the tarball did indeed come from gentoo, should someone compromise gentoo
116 infrastructure security and somehow get a bad file in place, the daily
117 snapshot process would blindly package up and sign the bad file along
118 with all the rest.
119
120 So sync-method bottom line, if you're paranoid or simply want additional
121 gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg,
122 instead of normal rsync-based emerge --sync. That pretty well ensures that
123 you're getting exactly the gentoo tree tarball gentoo built and signed,
124 which is certainly far more secure than normal rsync syncing, but because
125 the tarballing and signing is automated and covers the entire tree,
126 there's still the possibility that one or more files in that tarball are
127 compromised and that it hasn't been detected yet.
128
129 Meanwhile, I mentioned above that gentoo isn't as secure in this regard
130 as a number of other Linux distros. This is DEFINITELY the case for
131 normal rsync syncers, but even for webrsync-gpg syncers it remains the
132 case to some extent. Unfortunately, in practice it seems that isn't
133 likely to change in the near-term, and possibly not in the medium or
134 longer term either, unless some big gentoo compromise is detected and
135 makes the news. THEN we're likely to see changes.
136
137 Alternatively, when that big pie-in-the-sky main gentoo tree switch from
138 cvs (yes, still) to git eventually happens, the switch to full-signing
139 will be quite a bit easier, tho there will still be policies to enforce,
140 etc. But they've been talking about the switch to git for years, as
141 well, and... incrementally... drawing closer, including the fact that
142 major portions of gentoo are actually developed in git-based overlays
143 these days. But will the main tree ever actually switch to git? Who
144 knows? As of now it's still pie-in-the-sky, with no nailed down plans.
145 Perhaps at some point somebody and some gentoo council together will
146 decide it's time and move whatever mountains or molehills remain to get
147 it done, and at this point I think that's mostly what it'll take, perhaps
148 not, but unless that somebody steps up and makes that push come hell or
149 high water, assuming gentoo's still around by then, come 2025 we could
150 still be talking about doing it... someday...
151
152 Back to secure-by-policy gpg-signing...
153
154 The problem is that while we've known what must be done, and what other
155 distros have already done, for years, and while gentoo has made some
156 progress down the security road, in the absence of that ACTIVE KNOWN
157 COMPROMISE RIGHT NOW immediate threat, other things simply continue to be
158 higher priority, while REAL gentoo security continues to be back-burnered.
159
160 Basically, what must be done, thru all the way to policy enforcement and
161 refusing gentoo developer commits if they don't match policy, is enforce
162 a policy that every gentoo dev has a registered gpg key (AFAIK that much
163 is already the case), and that every commit they make is SIGNED by that
164 personal developer key, with gentoo-infra verification of those
165 signatures, rejecting any commit that doesn't verify.
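
The acceptance rule itself is simple enough to sketch. This is purely
illustrative and not gentoo-infra code: the Commit fields, the registry
mapping developer to key fingerprint, and the names are all hypothetical,
and the signature_valid flag stands in for a real gpg verification step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    author: str
    signer_fingerprint: Optional[str]  # None if the commit is unsigned
    signature_valid: bool              # outcome of the gpg check


def accept_commit(commit, registered_keys):
    """Accept only commits validly signed by the author's registered key."""
    expected = registered_keys.get(commit.author)
    if expected is None:
        return False  # author has no registered key
    if commit.signer_fingerprint is None or not commit.signature_valid:
        return False  # unsigned, or signature does not verify
    return commit.signer_fingerprint == expected
```

The hard part is not this check but the policy around it, as the
questions further down show.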
166
167 FWIW, there's GLEPs detailing most of this. They've just never been
168 fully implemented, tho incrementally, bits and pieces have been, over
169 time.
170
171 As I said, other distros have done this, generally when they HAD to, when
172 they had that compromise hitting the news. Tho I think a few distros
173 have implemented such a signed-no-exceptions policy when some OTHER
174 distro got hit. Gentoo hasn't had that happen yet, and while the
175 infrastructure is generally there to sign at least individual package
176 commits, and some devs actually do so (you can see the signed digests for
177 some packages, for instance), that hasn't been enforced tree-wide, and in
178 fact, there's a few relatively minor but still important policy questions
179 to resolve first, before such enforcement is actually activated.
180
181
182 Here's one such signing-policy question to consider. Currently, package
183 maintainer devs make changes to their ebuilds, and later, after a period
184 of testing, arch-devs keyword a particular ebuild stable for their arch.
185 Occasionally arch-devs may add a bit of conditional code that applies to
186 their arch only, as well.
187
188 Now consider this. Suppose a compromised package is detected after the
189 package has been keyworded stable. The last several signed commits to
190 that package were keywording only, while the commit introducing the
191 compromise was sometime earlier.
192
193 Question: Are those arch-devs that signed their keywording-only commits
194 responsible too, because they signed off on the package, meaning they now
195 have to inspect every package they keyword, checking for compromises that
196 might not be entirely obvious to them, or are they only responsible for
197 the keywording changes they actually committed, and aren't obligated to
198 actually inspect the rest of the ebuild they're now signing?
199
200 OK, so we say that they're only responsible for the keywording. Simple
201 enough. But what about this? Suppose they add an arch-conditional that
202 combined with earlier code in the package results in a compromise. But
203 the conditional code they added looks straightforward enough on its own,
204 and really does solve a problem on that arch, and without that code, the
205 original code looks innocently functional as well. But together, anyone
206 installing that package on that arch is now open to the world. Both devs
207 signed, the code of both devs is legit and looks innocent enough on its
208 own, but taken together, they result in a bad situation. Now it's not so
209 clear that an arch-dev shouldn't have to inspect and sign for the results
210 of the package after his commit, is it? Yet enforcing that as policy
211 will seriously slow-down arch stable keywording, and some archs can't
212 keep up as it is, so such a policy will be an effective death sentence
213 for them as a gentoo-stable supported arch.
214
215 Certainly there are answers to that sort of question, and various distros
216 have faced and come up with their own policy answers, often because in
217 the face of a REAL DISTRO COMPROMISE making the news, they've had no
218 other choice. To some extent, gentoo is lucky in that it hasn't been
219 faced with making those hard choices yet. But the fact is, all gentoo
220 users remain less safe than we could be, because those hard choices
221 haven't been made and enforced... because we've not been forced to do so.
222
223
224 Meanwhile, even were we to have done so, there's still the possibility
225 that upstream development might be compromised. Every year or two, some
226 upstream project or another makes news due to some compromise or
227 another. Sometimes vulnerable versions have been distributed for awhile,
228 and various distros have picked them up. In an upstream-compromise
229 situation like that, there's little a distro can do, with the exception
230 of going slow enough that their packages are all effectively outdated,
231 which also happens to be a relatively effective counter to this sort of
232 issue since if a several years old version changes it'll be detected
233 right away, and (one hopes) most compromises to a project server will be
234 detected within months at the longest, so anything a year or more old
235 should be relatively safe from this sort of issue, simply by virtue of
236 its age.
237
238 Obviously the people and enterprise distros willing to run years outdated
239 code do have that advantage, and that's a risk that people wishing to run
240 reasonably current code simply have to take as a result of that choice,
241 regardless of the distro they chose to get that current code from.
242
243
244 But even if you choose to run an old distro, so you aren't likely to be
245 hit by current upstream compromises, one that has and enforces a full
246 signing policy so every commit can be accounted for, and even if none of
247 those developers at either the distro or upstream levels deliberately
248 breaks the trust and goes bad, there's still the issue below...
249
250 > 2) That the compilers and interpreters don't do anything except build
251 > the code?
252
253 There's a paper, very famous in security circles, that effectively proves
254 that unless you can absolutely trust every single layer in the build
255 line, including the hardware layer (which means its sources) and the
256 compiler and tools used to build your operational tools, and the compiler
257 and tools used to build them, and... all the way back... you simply
258 cannot absolutely trust the results, period.
259
260 I never kept the link, but it seems the title actually stuck in memory
261 well enough for me to google it: "Reflections on Trusting Trust"
262 =:^) Here's the google link:
263
264 https://www.google.com/search?q=%22reflections+on+trusting+trust%22
265
266
267 That means that in order to absolutely prove the gcc (for example) on
268 our own systems, even if we can read and understand every line of gcc
269 source, we must absolutely prove the tools on the original installation
270 media and in the stage tarballs that we used to build our system. Which
271 means we must not only have the code to them and trust the builders, but
272 we must have the code and trust the builders of the tools they used, and
273 the builders and tools of those tools, and...
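
A toy model may make the point concrete. It is only an illustration of
the idea, not of how any real compiler works: a "compiler" here is a
Python function from source text to a "binary", ordinary binaries are
just strings describing their behaviour, and every name and string below
is invented.

```python
COMPILER_SRC = "compiler: translate every program faithfully"
LOGIN_SRC = "login: accept only the user's real password"
BACKDOOR = " (and also accept the attacker's password)"


def honest_compiler(source):
    """Does exactly what the clean compiler source says."""
    if source == COMPILER_SRC:
        return honest_compiler
    return source


def trojaned_compiler(source):
    """Built once by an attacker; the compiler source stays clean."""
    if source == COMPILER_SRC:
        # Recompiling the clean compiler source reproduces the trojan.
        return trojaned_compiler
    if source == LOGIN_SRC:
        # Compiling login adds a backdoor that appears in no source file.
        return source + BACKDOOR
    return source


def bootstrap(seed_compiler, generations):
    """Rebuild the compiler from clean source several times over."""
    compiler = seed_compiler
    for _ in range(generations):
        compiler = compiler(COMPILER_SRC)
    return compiler
```

However many times the compiler is rebuilt from clean, fully audited
source, bootstrap(trojaned_compiler, n)(LOGIN_SRC) still carries the
backdoor, while every other program compiles normally. Reading the
source tells you nothing about the seed binary you started from.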
274
275 Meanwhile, the same rule effectively applies to the hardware as well.
276 And while Richard Stallman may run a computer that is totally open source
277 hardware and firmware (down to the BIOS or equivalent), for which he has
278 all the schematics, etc, most of us run at least some semi-proprietary
279 hardware of /some/ sort. Which means even if we /could/ fully understand
280 the sources ourselves, without them and without that full understanding,
281 at that level, we simply have to trust... someone... basically, the
282 people who design and manufacture that hardware.
283
284 Thus, in practice, (nearly) everyone ends up drawing the line
285 /somewhere/. The Stallmans of the world draw it pretty strictly,
286 refusing to run anything which at minimum has replaceable firmware which
287 doesn't itself have sources available. (As Stallman defines it, if the
288 firmware is effectively burned in such that the manufacturer themselves
289 can't update it, then that's good enough for the line he draws. Tho that
290 leads to absurdities such as an OpenMOKO phone that at extra expense has
291 the firmware burned onto a separate chip such that it can't be replaced
292 by anyone, in order to be able to use hardware that would otherwise be
293 running firmware that the supplier refuses to open-source -- because the
294 extra expense to do it that way means the manufacturer can't replace the
295 firmware either, so it's on the OK side of Stallman's line.)
296
297 Meanwhile, I personally draw the line at what runs at the OS level on my
298 computer. That means I won't run proprietary graphics drivers or flash,
299 but I will and do load source-less firmware onto the Radeon-based
300 graphics hardware I do run, in order to use the freedomware kernel
301 drivers for the same hardware that I refuse to run the proprietary fglrx
302 drivers on.
303
304 Other people are fine running flash and/or proprietary graphics drivers,
305 but won't run a mostly-proprietary full OS such as MS Windows or Apple
306 OSX.
307
308 Still others prefer to run open source where it fits their needs, but
309 won't go out of their way to do so if proprietary works better for them,
310 and still others simply don't care either way, running whatever works
311 best regardless of the freedom or lack thereof of its sources.
312
313 Anyway, when it comes to hardware and compiler, in practice the best you
314 can do is run a FLOSS compiler such as gcc, while trusting the tools you
315 used to build the first ancestor, basically, the gcc and tools in the
316 stage tarballs, as well as whatever you booted (probably either a gentoo-
317 installer or another distro) in order to chroot into that unpacked
318 stage and build from there. Beyond that, well... good luck, but you're
319 still going to end up drawing the line /somewhere/.
320
321 > There's certainly lots of other issues about security, like protecting
322 > passwords, protecting physical access to the network and machines, root
323 > kits and the like, etc., but assuming none of that is in question (I
324 > don't have any reason to think the NSA has been in my home!) ;-) I'm
325 > looking for info on how the code is protected from the time it's signed
326 > off until it's built and running here.
327 >
328 > If someone knows of a good web site to read on this subject let me know.
329 > I've gone through my Linux life more or less like most everyone went
330 > through life 20 years ago, but paranoia strikes deep.
331
332 Indeed. Hope the above was helpful. I think it's a pretty accurate
333 picture from at least my own perspective, as someone who cares enough
334 about it to at least spend a not insignificant amount of time keeping up
335 on the current situation in this area, both for linux in general, and for
336 gentoo in particular.
337
338 --
339 Duncan - List replies preferred. No HTML msgs.
340 "Every nonfree program has a lord, a master --
341 and if you use the program, he is your master." Richard Stallman
