Lie Ryan posted on Fri, 08 Aug 2014 02:06:14 +1000 as excerpted:

> With you having to compile thousands of stuffs if you build from stage
> 1, I doubt that you will be able to verify every single thing you
> compile and detect if something is actually doing sneaky stuff AND still
> have the time to enjoy your system. Also, even if you build from stage 1
> and manage to verify all the source code, you still need to download a
> precompiled compiler which could possibly inject the malicious code into
> the programs it compiles, and which can also inject itself if you try to
> compile another compiler from source. If there is a single software that
> is worth a gold mine to inject with malware to gain illicit access to
> all Linux system, then it would be gcc. Once you infect a compiler,
> you're invincible.

Actually, that brings up a good question.  The art of compiling is
certainly somewhat magic to me, though I guess I understand the concept
in a vague, handwavy way, but...

From my understanding, that's one reason why the gcc build is multi-stage
and uses simpler (and thus easier to audit) tools such as lex and bison
in its bootstrapping process.  I'm not sure whether gcc actually requires
a previous gcc (or other full compiler) to build, but I do know it goes
to quite some lengths to bootstrap in multiple stages, building things up
from the simple to the complex as it goes and testing each stage, so that
if something goes wrong, there's some idea /where/ it went wrong.
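
A toy model may make the staged build easier to picture.  This is only a
sketch in Python, not anything taken from gcc's build system: the
Compiler class, the build() and bootstrap() functions and the string
"fingerprints" are all invented for illustration.  It assumes that some
pre-existing host compiler builds stage 1, that stage 1 then rebuilds the
same sources as stage 2, that stage 2 rebuilds them again as stage 3, and
that compilation is deterministic.  Under those assumptions stages 2 and
3 should come out identical, which is the check a staged bootstrap can
make.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class Compiler:
    """A compiler binary in the toy model (hypothetical, not a gcc type)."""
    behaviour: str  # the code generator it implements, fixed by its source
    binary: str     # fingerprint of its own machine code, fixed by who built it


def build(source: str, builder: Compiler) -> Compiler:
    """Compile compiler `source` with `builder` and return the new binary.

    What the new compiler does comes from the source.  What its own bytes
    look like depends on the source and on the builder's code generator.
    """
    digest = hashlib.sha256(f"{source}|{builder.behaviour}".encode()).hexdigest()
    return Compiler(behaviour=source, binary=digest[:12])


def bootstrap(source: str, host: Compiler):
    """Build the same source three times, each stage building the next."""
    stage1 = build(source, host)    # built by the pre-existing compiler
    stage2 = build(source, stage1)  # built by the new compiler's own logic
    stage3 = build(source, stage2)  # rebuilt once more as a self-check
    return stage1, stage2, stage3


host = Compiler(behaviour="some-older-compiler", binary="preexisting")
s1, s2, s3 = bootstrap("new-gcc-source", host)

# Stage 1 differs from stage 2 because a different code generator built it.
print("stage1 == stage2:", s1.binary == s2.binary)
# Stages 2 and 3 were both built by the new code generator, so they match.
print("stage2 == stage3:", s2.binary == s3.binary)
```

Note what the comparison does and doesn't show: a mismatch between stages
2 and 3 points at a miscompile or a non-deterministic build, but a match
says nothing about whether the host compiler that started the chain was
trustworthy.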

Clearly one major reason for that is proving functionality at each step,
so that if the process goes wrong, there's some place to start looking
for why and how.  But it certainly doesn't hurt in helping to prove, or
at least somewhat establish, the basic security situation either, though
as we've already established, it's basically impossible to prove both the
hardware and the software back through all the multiple generations.

Of course the simpler tools (lex, bison, etc.) must have been built from
something, but because they /are/ simpler, they're also easier to audit
and to prove basically functional, up to and including disassembly and
analysis of individual machine instructions for a fuller audit.

So anyway, to the gcc experts who know, and to non-gcc CS folks who have
actually built their own simple compilers and can at least address the
concept: is a previous gcc or other full compiler actually required to
build a new gcc, or does it bootstrap itself sufficiently from the more
basic tools that, unlike most code, it doesn't need a full compiler to
build and reasonably optimize at all?  That's a question I've had brewing
in the back of my mind for some time, and this seemed the perfect
opportunity to ask it.  =:^)

Meanwhile, I suppose it must be possible at least at some level, else how
would new hardware archs come to be supported?  Gotta start /somewhere/
on the toolchain, and "simpler" stuff like lex and bison can, I believe,
run on a previous arch, generating the basic executable building blocks
that ultimately become the first executable code actually run by the new
target arch.
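
The usual name for starting on a previous arch is cross-compilation, and
the idea can be sketched as a toy model.  Again this is only an
illustration in Python: the Compiler class, the build_compiler() function
and the "oldarch"/"newarch" names are made up, and real toolchains have
many more moving parts (assembler, linker, C library).  The one rule the
model encodes is that a compiler's output runs on whatever arch that
compiler emits code for, not on the arch it runs on itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    """A compiler binary in the toy model (hypothetical names throughout)."""
    runs_on: str  # arch the compiler binary itself executes on
    targets: str  # arch it emits machine code for


def build_compiler(builder: Compiler, machine: str, new_target: str) -> Compiler:
    """Run `builder` on `machine` to compile compiler sources that have
    been configured to emit code for `new_target`."""
    if builder.runs_on != machine:
        raise ValueError(f"{builder} cannot execute on {machine}")
    # The resulting binary runs on the arch the builder generates code for.
    return Compiler(runs_on=builder.targets, targets=new_target)


# Step 0: an ordinary native compiler on the existing arch.
old_native = Compiler(runs_on="oldarch", targets="oldarch")

# Step 1: still on the old machine, build a cross compiler:
# it runs on oldarch but emits newarch code.
cross = build_compiler(old_native, machine="oldarch", new_target="newarch")

# Step 2: still on the old machine, use the cross compiler to build a
# compiler that both runs on and targets newarch.
new_native = build_compiler(cross, machine="oldarch", new_target="newarch")

print(cross)
print(new_native)
```

In this model every build step executes on the old machine; only the
final result is copied over to the new hardware, where it could then
rebuild itself natively.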

And of course gcc has long been one of the compilers supporting the
widest range of archs, precisely because it /is/ open source and /is/
designed to be bootstrapped in stages like that.  I guess clang/llvm is
giving gcc some competition in that area now, in part because it's more
modern and modular, and in part because, unlike gcc, it /can/ legally be
taken private and supplied to others without offering sources, and some
companies are evil that way.  But gcc's the one with the long history in
that area, and given that history I'd guess it'll be some time before
clang/llvm catches up, even if it's getting most of the new platforms
right now (I've no idea whether that's the case or not).

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman