Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?)
Date: Thu, 07 Aug 2014 17:20:19
Message-Id: pan$1802$ad31072b$2b88fdcb$9b9112da@cox.net
In Reply to: Re: [gentoo-amd64] "For What It's Worth" (or How do I know my Gentoo source code hasn't been messed with?) by Lie Ryan
Lie Ryan posted on Fri, 08 Aug 2014 02:06:14 +1000 as excerpted:

> With you having to compile thousands of stuffs if you build from stage
> 1, I doubt that you will be able to verify every single thing you
> compile and detect if something is actually doing sneaky stuff AND still
> have the time to enjoy your system. Also, even if you build from stage 1
> and manage to verify all the source code, you still need to download a
> precompiled compiler which could possibly inject the malicious code into
> the programs it compiles, and which can also inject itself if you try to
> compile another compiler from source. If there is a single software that
> is worth a gold mine to inject with malware to gain illicit access to
> all Linux system, then it would be gcc. Once you infect a compiler,
> you're invincible.

Actually, that brings up a good question. The art of compiling is
certainly somewhat magic to me, tho I guess I understand the concept
in a vague, handwavy way, but...

From my understanding, that's one reason why the gcc build is multi-stage
and uses simpler (and thus easier to audit) tools such as lex and bison
in its bootstrapping process. I'm not sure whether gcc actually
requires a previous gcc (or other full compiler) to build or not, but I
do know it goes to quite some lengths to bootstrap in multiple stages,
building things up from the simple to the complex as it goes and testing
each stage in the process so that if something goes wrong, there's some
idea /where/ it went wrong.

Clearly one major reason for that is proving functionality at each step,
so that if the process goes wrong there's somewhere to start looking for
why and how. But it certainly doesn't hurt in helping to prove, or at
least somewhat establish, the basic security situation either, tho as
we've already established, it's basically impossible to prove both the
hardware and the software back thru all the multiple generations.

Of course the simpler tools, lex, bison, etc, must have been built from
something, but because they /are/ simpler, they're also easier to audit
and to check for basic functionality, up to and including disassembly
and analysis of individual machine instructions for a fuller audit.

So anyway, to the gcc experts that know, and to non-gcc CS folks who have
actually built their own simple compilers and can at least address the
concept: is a previous gcc or other full compiler actually required to
build a new gcc, or does it sufficiently bootstrap itself from the more
basic tools such that, unlike most code, it doesn't need a full
compiler to build and reasonably optimize at all? That's a question I've
had brewing in the back of my mind for some time, and this seemed the
perfect opportunity to ask it. =:^)

Meanwhile, I suppose it must be possible at least at some level, else how
would new hardware archs come to be supported? Gotta start /somewhere/
on the toolchain, and "simpler" stuff like lex and bison can, I believe,
run on a previous arch, generating the basic executable building blocks
that ultimately become the first executable code actually run by the new
target arch.

And of course gcc has long been one of the most widely arch-supporting
compilers, precisely because it /is/ open source and /is/ designed to be
bootstrapped in stages like that. I guess clang/llvm is giving gcc some
competition in that area now, in part because it's more modern and
modular, and in part because unlike gcc it /can/ legally be taken private
and supplied to others without offering sources, and some companies are
evil that way. But gcc's the one with the long history in that area, and
given that history I'd guess it'll be some time before clang/llvm catches
up, even if it's getting most of the new platforms right now. Whether
that's actually the case, I've no idea.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
