[gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) - gentoo-amd64

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-amd64@l.g.o
Subject:	[gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date:	Tue, 05 Aug 2014 05:52:25
Message-Id:	`pan$d1f99$43cd96ae$721d7088$6ef9ba6d@cox.net`
In Reply to:	[gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) by Mark Knecht

Mark Knecht posted on Mon, 04 Aug 2014 15:04:12 -0700 as excerpted:

> As the line in that favorite song goes "Paranoia strikes deep"...

FWIW, while my lists sig is the proprietary-master quote from Richard
Stallman below, since the (anti-)patriot bill was passed in the reaction
to 9-11, my private email sig is a famous quote from Benjamin Franklin:

"They that can give up essential liberty to obtain a little
temporary safety, deserve neither liberty nor safety."

So "I'm with ya..."

> <NOTE>
> I am NOT trying to start ANY political discussion here. I hope no one
> will go too far down that path, at least here on this list. There are
> better places to do that.
>
> I am also NOT suggesting anything like what I ask next has happened,
> either here or elsewhere. It's just a question.
>
> Thanks in advance.
> </NOTE>
>
> I'm currently reading a new book by Glen Greenwald called "No Place To
> Hide" which is about Greenwald's introduction to Edward Snowden and the
> release of all of the confidential NSA documents Snowden acquired. This
> got me wondering about Gentoo, or even just Linux in general. If the
> underlying issue in all of that Snowden stuff is that the NSA has the
> ability to intercept and hack into whatever they please, then how do I
> know that the source code I build on my Gentoo machines hasn't been
> modified by someone to provide access to my machine, networks, etc.?

These are good questions to ask, and to have some idea of the answers to,
as well.

Big picture, at some level, you pretty much have to accept that you
/don't/ know.  However, there's /some/ level of security, tho honestly
a bit less on Gentoo than on some of the other distros (see below).  It
still wouldn't be /entirely/ easy to subvert, at least widely (targeting
an individual downloader is another question), but it could be done.

> Essentially, what is the security model for all this source code and how
> do I verify that it hasn't been tampered with in some manner?
>
> 1) That the code I build is exactly as written and accepted by the OS
> community?

At a basic level, that's what the ebuild and sources digests are all
about: source and ebuild integrity, protecting both from accidental
corruption (where it's pretty good) and from deliberate tampering (where
it may or may not be considered "acceptable", but someone with the
resources who wanted to badly enough could subvert it).  The idea is that
the gentoo package maintainer creates hash digests of multiple types for
both the ebuild and the sources.  Should the copy that a gentoo user gets
not match the copy that the gentoo maintainer created, the package
manager (PM, normally portage), if configured to do so (mainly
FEATURES=strict, also see stricter and assume-digests, plus the
webrsync-gpg feature mentioned below), will error out and refuse to
emerge that package.
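
As a rough illustration of the digest idea (a sketch only, not portage's
actual code), the following Python computes several hash digests for a
blob of data and refuses to accept a copy unless every recorded digest
matches.  The choice of SHA-256 and SHA-512, the function names and the
sample data are all assumptions made for the example.

```python
import hashlib


def compute_digests(data: bytes, algorithms=("sha256", "sha512")) -> dict:
    """Return {algorithm name: hex digest} for the given bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def verify(data: bytes, recorded: dict) -> bool:
    """True only if at least one digest is recorded and all of them match."""
    if not recorded:
        return False
    actual = compute_digests(data, tuple(recorded))
    return all(actual[name] == value for name, value in recorded.items())


# Hypothetical maintainer-side step: record digests for a source file.
original = b"example source tarball contents"
manifest_entry = compute_digests(original)

# User-side step: an intact copy verifies, a copy with one extra byte fails.
print(verify(original, manifest_entry))            # True
print(verify(original + b"\x00", manifest_entry))  # False
```

Note that nothing here says who produced manifest_entry; the check only
ties the data to whatever digests happen to be on record.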

But there are serious limits to that protection.  Here are a few points
to consider:

1) While the ebuilds and sources are digested, those digests do *NOT*
extend to the rest of the tree: the various files in the profiles
directory, the various eclasses, etc.  So in theory at least, someone
could mess with say the package.mask file in profiles, or one of the
eclasses, and could potentially get away with it.  But see point #3 as
there's a (partial) workaround for the paranoid.

2) Meanwhile, a bare hash digest (unlike a gpg signature) says nothing
about who created it.  Digest verification checks that a file still
matches its digest, which primarily protects against accidental damage,
not so much deliberate compromise.  So there's some risk that one or more
gentoo rsync mirrors could be compromised, or be run by a bad actor in
the first place.  Should that occur, the bad actor could attempt to
replace BOTH the digested ebuild and/or sources AND the digest files,
updating the latter to reflect his compromised version instead of the
version originally digested by the gentoo maintainer.  Similarly, someone
such as the NSA could at least in theory do the same thing in transit,
targeting a specific user's downloads while leaving everyone else's
downloads from the same mirror alone, so only the target got the
compromised version.  While there's a reasonable chance someone would
catch a bad mirror, if a single downloader is specifically targeted,
there's little chance they'd detect the problem unless they're
specifically validating against other mirrors as well and/or comparing
digests (over a secure channel) against those someone else downloaded.
So even digest-protected files aren't immune to compromise.

But as I said above, there's a (partial) workaround.  See point #3.

3) While #1 applies to the tree in general when it is rsynced, gentoo
does have a somewhat higher security sync method for the paranoid and to
support users behind firewalls which don't pass rsync.  Instead of
running emerge --sync, this method uses the emerge-webrsync tool, which
downloads the entire main gentoo tree as a gpg-signed tarball.  If you
have FEATURES=webrsync-gpg set (see the make.conf manpage, FEATURES,
webrsync-gpg), portage will verify the gpg signature on this tarball.

The two caveats here are (1) that the webrsync tarball is generated only
once per day, while the main tree is synced every few minutes, so the
rsynced tree is going to be more current, and (2) that each snapshot is
the entire tree, not just the changes, so for those updating daily or
close to it, fetching the full tarball every day instead of just the
changes will be more network traffic.  Tho I think the tarball is
compressed (I've never tried this method personally so can't say for
sure) while the rsync tree isn't, so if you're updating monthly, I'd
guess it's less traffic to get the tarball.

The tarball is gpg-signed which is more secure than simple hash digests,
but the signature covers the entire thing, not individual files, so the
granularity of the digests is better.  Additionally, the tarball signing
is automated, so while a signature validation pretty well ensures that
the tarball did indeed come from gentoo, should someone compromise gentoo
infrastructure security and somehow get a bad file in place, the daily
snapshot process would blindly package up and sign the bad file along
with all the rest.
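
The difference between a bare digest and a signature, as raised in point
#2, can be sketched in a few lines of Python.  This is only a toy model:
the standard library's HMAC stands in for a gpg signature (real gpg
signing is asymmetric, so the verifier never holds the signing secret),
and the key, file contents and function names are hypothetical.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Unkeyed digest: anyone can recompute it for any data."""
    return hashlib.sha256(data).hexdigest()


def tag(data: bytes, key: bytes) -> str:
    """Keyed tag: can only be produced by someone holding the key."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


# Hypothetical secret, standing in for the signer's private key.
SIGNING_KEY = b"hypothetical-secret-held-only-by-the-signer"

# A bad mirror swaps the file and regenerates the unkeyed digest to match.
tampered = b"tampered ebuild"
forged_digest = digest(tampered)
# Without the key, the best the attacker can do is use a guessed key.
forged_tag = tag(tampered, b"attacker-guess")

# What the downloader sees when checking the tampered file:
digest_check_passes = digest(tampered) == forged_digest
tag_check_passes = hmac.compare_digest(tag(tampered, SIGNING_KEY), forged_tag)

print(digest_check_passes)  # True: the forged digest matches the forged file
print(tag_check_passes)     # False: the forged tag fails the keyed check
```

The keyed check fails here only because the attacker lacks the key.  As
the paragraph above notes, if the bad file is already in place when the
legitimate signer runs, it gets a perfectly valid signature.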

So sync-method bottom line, if you're paranoid or simply want additional
gpg-signed security, use emerge-webrsync along with FEATURES=webrsync-gpg,
instead of normal rsync-based emerge --sync.  That pretty well ensures
that you're getting exactly the gentoo tree tarball gentoo built and
signed, which is certainly far more secure than normal rsync syncing, but
because the tarballing and signing is automated and covers the entire
tree, there's still the possibility that one or more files in that
tarball are compromised and that it hasn't been detected yet.
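
For anyone wanting to double-check their own setup, here is a minimal
Python sketch that reads make.conf-style text and reports whether
webrsync-gpg appears in FEATURES.  It is a simplification under stated
assumptions: it only looks at the last literal FEATURES="..." assignment
in the text it is given, and it does not expand variables or consider
anything set outside that text.  The sample content is hypothetical.

```python
import shlex


def enabled_features(make_conf_text: str) -> set:
    """Return the names listed in the last literal FEATURES= assignment."""
    features = set()
    # shlex handles the shell-style quoting and skips '#' comments.
    for token in shlex.split(make_conf_text, comments=True):
        name, sep, value = token.partition("=")
        if sep and name == "FEATURES":
            features = set(value.split())
    return features


# Hypothetical make.conf excerpt.
sample = '''
# example settings
CFLAGS="-O2 -pipe"
FEATURES="strict webrsync-gpg"
'''

print("webrsync-gpg" in enabled_features(sample))  # True
```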

Meanwhile, I mentioned above that gentoo isn't as secure in this regard
as a number of other Linux distros.  This is DEFINITELY the case for
normal rsync syncers, but even for webrsync-gpg syncers it remains the
case to some extent.  Unfortunately, in practice it seems that isn't
likely to change in the near-term, and possibly not in the medium or
longer term either, unless some big gentoo compromise is detected and
makes the news.  THEN we're likely to see changes.

Alternatively, when that big pie-in-the-sky main gentoo tree switch from
cvs (yes, still) to git eventually happens, the switch to full-signing
will be quite a bit easier, tho there will still be policies to enforce,
etc.  But they've been talking about the switch to git for years as
well, and... incrementally... drawing closer; major portions of gentoo
are actually developed in git-based overlays these days.  But will the
main tree ever actually switch to git?  Who knows?  As of now it's still
pie-in-the-sky, with no nailed down plans.  Perhaps at some point
somebody and some gentoo council together will decide it's time and move
whatever mountains or molehills remain to get it done; at this point I
think that's mostly what it'll take.  Perhaps not.  But unless that
somebody steps up and makes that push come hell or high water, then come
2025, assuming gentoo's still around by then, we could still be talking
about doing it... someday...

Back to secure-by-policy gpg-signing...

The problem is that while we've known for years what must be done, and
what other distros have already done, and while gentoo has made some
progress down the security road, in the absence of that ACTIVE KNOWN
COMPROMISE RIGHT NOW immediate threat, other things simply continue to be
higher priority, while REAL gentoo security continues to be back-burnered.

Basically, what must be done is enforce a policy, all the way thru to
refusing gentoo developer commits that don't match it, that every gentoo
dev has a registered gpg key (AFAIK that much is already the case), and
that every commit they make is SIGNED by that personal developer key,
with gentoo-infra verifying those signatures and rejecting any commit
that doesn't verify.
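
The shape of such a gate can be sketched in Python.  This is a toy model
of the policy only, not anything gentoo-infra actually runs: the key
registry, the developer names and the signature_valid flag (assumed to
be the result of a real signature check done elsewhere) are all
hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    author: str
    key_id: str            # key the signature claims to be from, "" if unsigned
    signature_valid: bool  # assumed result of an external signature check


# Hypothetical registry: developer name -> registered key id.
REGISTERED_KEYS = {"alice": "KEY-A", "bob": "KEY-B"}


def accept(commit: Commit) -> tuple:
    """Return (accepted, reason) under a sign-everything policy."""
    registered = REGISTERED_KEYS.get(commit.author)
    if registered is None:
        return (False, "author has no registered key")
    if not commit.key_id:
        return (False, "commit is unsigned")
    if commit.key_id != registered:
        return (False, "signed with a key not registered to the author")
    if not commit.signature_valid:
        return (False, "signature does not verify")
    return (True, "ok")


print(accept(Commit("alice", "KEY-A", True)))  # (True, 'ok')
print(accept(Commit("bob", "", False)))        # (False, 'commit is unsigned')
```

The point of the "no exceptions" part is that every failing branch
rejects; there is no path that lets an unsigned commit thru.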

FWIW, there are GLEPs detailing most of this.  They've just never been
fully implemented, tho incrementally, bits and pieces have been, over
time.

As I said, other distros have done this, generally when they HAD to, when
they had that compromise hitting the news.  Tho I think a few distros
have implemented such a signed-no-exceptions policy when some OTHER
distro got hit.  Gentoo hasn't had that happen yet, and while the
infrastructure is generally there to sign at least individual package
commits, and some devs actually do so (you can see the signed digests for
some packages, for instance), that hasn't been enforced tree-wide, and in
fact, there are a few relatively minor but still important policy
questions to resolve first, before such enforcement is actually activated.

Here's one such signing-policy question to consider.  Currently, package
maintainer devs make changes to their ebuilds, and later, after a period
of testing, arch-devs keyword a particular ebuild stable for their arch.
Occasionally arch-devs may add a bit of conditional code that applies to
their arch only, as well.

Now consider this.  Suppose a compromised package is detected after the
package has been keyworded stable.  The last several signed commits to
that package were keywording only, while the commit introducing the
compromise was sometime earlier.

Question:  Are those arch-devs that signed their keywording-only commits
responsible too, because they signed off on the package, meaning they now
have to inspect every package they keyword, checking for compromises that
might not be entirely obvious to them, or are they only responsible for
the keywording changes they actually committed, and aren't obligated to
actually inspect the rest of the ebuild they're now signing?

OK, so we say that they're only responsible for the keywording.  Simple
enough.  But what about this?  Suppose they add an arch-conditional that
combined with earlier code in the package results in a compromise.  But
the conditional code they added looks straightforward enough on its own,
and really does solve a problem on that arch, and without that code, the
original code looks innocently functional as well.  But together, anyone
installing that package on that arch is now open to the world.  Both devs
signed, the code of both devs is legit and looks innocent enough on its
own, but taken together, they result in a bad situation.  Now it's not so
clear that an arch-dev shouldn't have to inspect and sign for the results
of the package after his commit, is it?  Yet enforcing that as policy
will seriously slow down arch stable keywording, and some archs can't
keep up as it is, so such a policy will be an effective death sentence
for them as a gentoo-stable supported arch.
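
The first scenario (keywording-only commits signed on top of an earlier
bad commit) can be made concrete with a toy history in Python.  The
commit records, signer names and line numbers are hypothetical; the only
point is that "who signed the file last" and "who last changed the
offending line" are different questions with different answers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedCommit:
    signer: str
    changed_lines: frozenset  # ebuild line numbers this commit modified


# Hypothetical history for one ebuild, oldest commit first.
history = [
    SignedCommit("maintainer", frozenset({10, 11, 12})),  # adds the bad code
    SignedCommit("arch-dev-1", frozenset({3})),           # KEYWORDS line only
    SignedCommit("arch-dev-2", frozenset({3})),           # KEYWORDS line only
]


def last_to_touch(line: int, commits) -> str:
    """Signer of the most recent commit that modified the given line."""
    for commit in reversed(commits):
        if line in commit.changed_lines:
            return commit.signer
    raise LookupError(f"line {line} was never modified")


latest_signer = history[-1].signer
author_of_bad_line = last_to_touch(11, history)

print(latest_signer)       # arch-dev-2
print(author_of_bad_line)  # maintainer
```

A per-line view like this handles the first scenario, but not the second
one above, where neither change is bad on its own and only the
combination is.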

Certainly there are answers to that sort of question, and various distros
have faced and come up with their own policy answers, often because in
the face of a REAL DISTRO COMPROMISE making the news, they've had no
other choice.  To some extent, gentoo is lucky in that it hasn't been
faced with making those hard choices yet.  But the fact is, all gentoo
users remain less safe than we could be, because those hard choices
haven't been made and enforced... because we've not been forced to do so.

Meanwhile, even were we to have done so, there's still the possibility
that upstream development might be compromised.  Every year or two, some
upstream project or another makes news due to some compromise or
another.  Sometimes vulnerable versions have been distributed for a
while, and various distros have picked them up.  In an upstream-compromise
situation like that, there's little a distro can do, with the exception
of going slow enough that their packages are all effectively outdated.
That happens to be a relatively effective counter to this sort of issue:
if a several years old version changes it'll be detected right away, and
(one hopes) most compromises to a project server will be detected within
months at the longest, so anything a year or more old should be
relatively safe from this sort of issue, simply by virtue of its age.

Obviously the people and enterprise distros willing to run years outdated
code do have that advantage, and exposure to upstream compromises is a
risk that people wishing to run reasonably current code simply have to
take as a result of that choice, regardless of the distro they chose to
get that current code from.

But even if you choose to run an old distro, so you aren't likely to be
hit by current upstream compromises, and one that has and enforces a full
signing policy, so every commit can be accounted for, and even if none of
the developers at either the distro or upstream levels deliberately
breaks the trust and goes bad, there's still the issue below...

> 2) That the compilers and interpreters don't do anything except build
> the code?

There's a paper, very famous in security circles, that effectively
demonstrates that unless you can absolutely trust every single layer in
the build line, including the hardware layer (which means its sources)
and the compiler and tools used to build your operational tools, and the
compiler and tools used to build them, and... all the way back... you
simply cannot absolutely trust the results, period.

I never kept the link, but it seems the title actually stuck in memory
well enough for me to google it: "Reflections on Trusting Trust" (Ken
Thompson's 1984 Turing Award lecture)
=:^)  Here's the google link:

https://www.google.com/search?q=%22reflections+on+trusting+trust%22

That means that in order to absolutely prove the gcc (for example) on
our own systems, even if we can read and understand every line of gcc
source, we must absolutely prove the tools on the original installation
media and in the stage tarballs that we used to build our system.  Which
means we must not only have the code to them and trust the builders, but
we must have the code and trust the builders of the tools they used, and
the builders and tools of those tools, and...
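
That regress can be pictured as walking a "built with" graph backwards
until the trail either reaches something you have decided to trust or
simply runs out.  The Python below is a toy model under that assumption;
the artifact names and the graph are hypothetical, not a description of
how any real stage tarball was built.

```python
# Hypothetical build graph: artifact -> tools that were used to build it.
BUILT_WITH = {
    "my-gcc": ["stage3-gcc"],
    "stage3-gcc": ["release-builder-gcc"],
    "release-builder-gcc": [],  # provenance unknown from here back
}


def unverified_ancestors(artifact, graph, trusted_roots):
    """Return the builders whose own provenance cannot be traced further."""
    pending, seen, unverified = [artifact], set(), []
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in trusted_roots:
            continue  # we chose to draw the line here, so stop descending
        builders = graph.get(current, [])
        if not builders:
            unverified.append(current)  # the trail runs out
        pending.extend(builders)
    return sorted(unverified)


# Trusting nothing, the chain bottoms out in a tool of unknown origin.
print(unverified_ancestors("my-gcc", BUILT_WITH, set()))
# ['release-builder-gcc']

# Declaring the stage tarball's gcc trusted makes the question go away,
# but only because of that declaration.
print(unverified_ancestors("my-gcc", BUILT_WITH, {"stage3-gcc"}))
# []
```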

Meanwhile, the same rule effectively applies to the hardware as well.
And while Richard Stallman may run a computer that is totally open source
hardware and firmware (down to the BIOS or equivalent), for which he has
all the schematics, etc., most of us run at least some semi-proprietary
hardware of /some/ sort.  Which means that even if we /could/ fully
understand the sources ourselves, we don't have them, and without them
and that full understanding, at that level we simply have to trust...
someone... basically, the people who design and manufacture that
hardware.

Thus, in practice, (nearly) everyone ends up drawing the line
/somewhere/.  The Stallmans of the world draw it pretty strictly,
refusing, at minimum, to run anything with replaceable firmware for which
sources aren't available.  (As Stallman defines it, if the firmware is
effectively burned in such that the manufacturer themselves can't update
it, then that's good enough for the line he draws.  Tho that leads to
absurdities such as an OpenMOKO phone that, at extra expense, has the
firmware burned onto a separate chip such that it can't be replaced by
anyone.  That's done in order to be able to use hardware whose supplier
refuses to open-source the firmware: because the manufacturer can't
replace the firmware either, it's on the OK side of Stallman's line.)

Meanwhile, I personally draw the line at what runs at the OS level on my
computer.  That means I won't run proprietary graphics drivers or flash,
but I will and do load source-less firmware onto the Radeon-based
graphics hardware I do run, in order to use the freedomware kernel
drivers for the same hardware that I refuse to run the proprietary fglrx
drivers on.

Other people are fine running flash and/or proprietary graphics drivers,
but won't run a mostly-proprietary full OS such as MS Windows or Apple
OSX.

Still others prefer to run open source where it fits their needs, but
won't go out of their way to do so if proprietary works better for them,
and still others simply don't care either way, running whatever works
best regardless of the freedom or lack thereof of its sources.

Anyway, when it comes to hardware and compiler, in practice the best you
can do is run a FLOSS compiler such as gcc, while trusting the tools you
used to build the first ancestor, basically, the gcc and tools in the
stage tarballs, as well as whatever you booted (probably either a
gentoo-installer or another distro) in order to chroot into that unpacked
stage and build from there.  Beyond that, well... good luck, but you're
still going to end up drawing the line /somewhere/.

> There's certainly lots of other issues about security, like protecting
> passwords, protecting physical access to the network and machines, root
> kits and the like, etc., but assuming none of that is in question (I
> don't have any reason to think the NSA has been in my home!) ;-) I'm
> looking for info on how the code is protected from the time it's signed
> off until it's built and running here.
>
> If someone knows of a good web site to read on this subject let me know.
> I've gone through my Linux life more or less like most everyone went
> through life 20 years ago, but paranoia strikes deep.

Indeed.  Hope the above was helpful.  I think it's a pretty accurate
picture from at least my own perspective, as someone who cares enough
about it to at least spend a not insignificant amount of time keeping up
on the current situation in this area, both for linux in general, and for
gentoo in particular.

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

211	will seriously slow down arch stable keywording, and some archs can't
212	keep up as it is, so such a policy will be an effective death sentence
213	for them as a gentoo-stable supported arch.
214
215	Certainly there are answers to that sort of question, and various distros
216	have faced and come up with their own policy answers, often because in
217	the face of a REAL DISTRO COMPROMISE making the news, they've had no
218	other choice. To some extent, gentoo is lucky in that it hasn't been
219	faced with making those hard choices yet. But the fact is, all gentoo
220	users remain less safe than we could be, because those hard choices
221	haven't been made and enforced... because we've not been forced to do so.
222
223
224	Meanwhile, even were we to have done so, there's still the possibility
225	that upstream development might be compromised. Every year or two, some
226	upstream project or another makes news due to some compromise or
227	another. Sometimes vulnerable versions have been distributed for a while,
228	and various distros have picked them up. In an upstream-compromise
229	situation like that, there's little a distro can do, with the exception
230	of going slow enough that their packages are all effectively outdated,
231	which also happens to be a relatively effective counter to this sort of
232	issue since if a several years old version changes it'll be detected
233	right away, and (one hopes) most compromises to a project server will be
234	detected within months at the longest, so anything a year or more old
235	should be relatively safe from this sort of issue, simply by virtue of
236	its age.
237
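The "an old version changing gets detected right away" part is just digest comparison, which can be sketched as below. This is a minimal example with made-up data; `digest_matches` is a hypothetical helper, not portage's own verification code. It only shows that a file altered after its digest was recorded no longer matches, and does nothing about a compromise that was already present when the digest was first recorded.

```python
import hashlib
import hmac


def digest_matches(data: bytes, recorded_hex: str,
                   algorithm: str = "sha512") -> bool:
    """Return True if data hashes to the digest recorded earlier."""
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(actual, recorded_hex.lower())


# Made-up "tarball" contents standing in for a real distfile.
original = b"example release tarball contents"
recorded = hashlib.sha512(original).hexdigest()   # recorded when first released
tampered = original + b" plus a backdoor"          # changed on the server later

print(digest_matches(original, recorded))
print(digest_matches(tampered, recorded))
```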
238	Obviously the people and enterprise distros willing to run years outdated
239	code do have that advantage, and that's a risk that people wishing to run
240	reasonably current code simply have to take as a result of that choice,
241	regardless of the distro they chose to get that current code from.
242
243
244	But even if you choose to run an old distro, so you aren't likely to be hit
245	by current upstream compromises, and one that has and enforces a full signing
246	policy so every commit can be accounted for, and even if none of those
247	developers at either the distro or upstream levels deliberately breaks
248	the trust and goes bad, there's still the issue below...
249
250	> 2) That the compilers and interpreters don't do anything except build
251	> the code?
252
253	There's a paper, very famous in security circles, that effectively proves
254	that unless you can absolutely trust every single layer in the build
255	line, including the hardware layer (which means its sources) and the
256	compiler and tools used to build your operational tools, and the compiler
257	and tools used to build them, and... all the way back... you simply
258	cannot absolutely trust the results, period.
259
260	I never kept the link, but it seems the title actually stuck in memory
261	well enough for me to google it: "Reflections on Trusting Trust"
262	=:^) Here's the google link:
263
264	https://www.google.com/search?q=%22reflections+on+trusting+trust%22
265
266
267	That means that in order to absolutely prove the gcc (for example) on
268	our own systems, even if we can read and understand every line of gcc
269	source, we must absolutely prove the tools on the original installation
270	media and in the stage tarballs that we used to build our system. Which
271	means we must not only have the code to them and trust the builders, but
272	we must have the code and trust the builders of the tools they used, and
273	the builders and tools of those tools, and...
274
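A toy sketch of why reading the source isn't enough, under heavy simplification: here "compiling" is just Python's `exec`, and all the names (`LOGIN_SOURCE`, `honest_compile`, `trojaned_compile`) are invented for the example. It only shows the first half of the idea, a build tool that quietly changes what it builds. The paper's second step, where the compiler re-inserts the trojan whenever it compiles a clean copy of itself, is left out.

```python
# The source an auditor would read: nothing suspicious in it.
LOGIN_SOURCE = (
    "def check_password(pw):\n"
    "    return pw == 'correct horse'\n"
)


def honest_compile(source: str) -> dict:
    """Toy 'compiler': turn source text into callable objects."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace


def trojaned_compile(source: str) -> dict:
    """Same interface, but recognizes the login code and adds a backdoor."""
    if "def check_password" in source:
        source = source.replace(
            "return pw == 'correct horse'",
            "return pw == 'correct horse' or pw == 'letmein'",
        )
    return honest_compile(source)


honest_login = honest_compile(LOGIN_SOURCE)["check_password"]
trojaned_login = trojaned_compile(LOGIN_SOURCE)["check_password"]

# The audited source never mentions the backdoor password...
print("letmein" in LOGIN_SOURCE)
# ...yet the result built by the trojaned tool accepts it.
print(honest_login("letmein"), trojaned_login("letmein"))
```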
275	Meanwhile, the same rule effectively applies to the hardware as well.
276	And while Richard Stallman may run a computer that is totally open source
277	hardware and firmware (down to the BIOS or equivalent), for which he has
278	all the schematics, etc, most of us run at least some semi-proprietary
279	hardware of /some/ sort. Which means even if we /could/ fully understand
280	the sources ourselves, without them and without that full understanding,
281	at that level, we simply have to trust... someone... basically, the
282	people who design and manufacture that hardware.
283
284	Thus, in practice, (nearly) everyone ends up drawing the line
285	/somewhere/. The Stallmans of the world draw it pretty strictly,
286	refusing to run anything which at minimum has replaceable firmware which
287	doesn't itself have sources available. (As Stallman defines it, if the
288	firmware is effectively burned in such that the manufacturer themselves
289	can't update it, then that's good enough for the line he draws. Tho that
290	leads to absurdities such as an OpenMOKO phone that at extra expense has
291	the firmware burned onto a separate chip such that it can't be replaced
292	by anyone, in order to be able to use hardware that would otherwise be
293	running firmware that the supplier refuses to open-source -- because the
294	extra expense to do it that way means the manufacturer can't replace the
295	firmware either, so it's on the OK side of Stallman's line.)
296
297	Meanwhile, I personally draw the line at what runs at the OS level on my
298	computer. That means I won't run proprietary graphics drivers or flash,
299	but I will and do load source-less firmware onto the Radeon-based
300	graphics hardware I do run, in order to use the freedomware kernel
301	drivers for the same hardware that I refuse to run the proprietary fglrx
302	drivers on.
303
304	Other people are fine running flash and/or proprietary graphics drivers,
305	but won't run a mostly-proprietary full OS such as MS Windows or Apple
306	OSX.
307
308	Still others prefer to run open source where it fits their needs, but
309	won't go out of their way to do so if proprietary works better for them,
310	and still others simply don't care either way, running whatever works
311	best regardless of the freedom or lack thereof of its sources.
312
313	Anyway, when it comes to hardware and compiler, in practice the best you
314	can do is run a FLOSS compiler such as gcc, while trusting the tools you
315	used to build the first ancestor, basically, the gcc and tools in the
316	stage tarballs, as well as whatever you booted (probably either a
317	gentoo-installer or another distro) in order to chroot into that unpacked
318	stage and build from there. Beyond that, well... good luck, but you're
319	still going to end up drawing the line /somewhere/.
320
321	> There's certainly lots of other issues about security, like protecting
322	> passwords, protecting physical access to the network and machines, root
323	> kits and the like, etc., but assuming none of that is in question (I
324	> don't have any reason to think the NSA has been in my home!) ;-) I'm
325	> looking for info on how the code is protected from the time it's signed
326	> off until it's built and running here.
327	>
328	> If someone knows of a good web site to read on this subject let me know.
329	> I've gone through my Linux life more or less like most everyone went
330	> through life 20 years ago, but paranoia strikes deep.
331
332	Indeed. Hope the above was helpful. I think it's a pretty accurate
333	picture from at least my own perspective, as someone who cares enough
334	about it to at least spend a not insignificant amount of time keeping up
335	on the current situation in this area, both for linux in general, and for
336	gentoo in particular.
337
338	--
339	Duncan - List replies preferred. No HTML msgs.
340	"Every nonfree program has a lord, a master --
341	and if you use the program, he is your master." Richard Stallman