The usefulness of the various -Ox optimization levels comes up every so
often.  This will likely be old news to Gentoo old hands, but it
should be useful for newbies, at least.  Anyway, I found this commentary
by Linus interesting:

http://permalink.gmane.org/gmane.linux.kernel/344440

The thread wanders quite a bit, but this subthread started with someone
posting a listing of their kernels ordered by size, back through 2.2.x,
noting that the size had grown from about half a MB back then to about
1.5 MB today.

Somebody then asked (among other things) whether the compiler used was
the same, listing the size of the /same/ kernel compiled with gcc-2.95.x
against the gcc-4.1.0 snapshot they are running.  The 4.1-compiled kernel
was MUCH larger.

Someone replied that the comparison wasn't fair -- if you don't ask gcc to
optimize for size, don't blame it for not doing so -- and proceeded to
compare kernels compiled with 4.1 and the normal options against kernels
built with the kernel's "embedded" option turned on, answering the new
"make oldconfig" prompts appropriately after the switch.  Among other
things, he added -Os to the kernel's gcc command line, and he probably
dropped symbols as well.  He didn't say what else he chose, but the
resulting core kernel (allnoconfig) ended up /very/ similar in size to the
old 2.5 kernel he compared against (I believe the first one to have
allnoconfig as an option).
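
If you want to repeat that kind of comparison on your own code rather
than take anyone's word for it, something like the following Python sketch
would do.  It is only an illustration: it assumes gcc is on your PATH, and
it uses the object file's on-disk size as a crude stand-in for code size
(the "size" tool from binutils would be more precise).  The helper names
(build_command, compare_sizes, report) are mine, not from the thread.

```python
import os
import subprocess
import tempfile

# Optimization levels to compare; -O2 is used as the baseline below.
LEVELS = ["-O2", "-Os", "-O3"]


def build_command(source, level, output, cc="gcc"):
    """Return the compiler command line for one optimization level."""
    return [cc, level, "-c", source, "-o", output]


def percent_change(base, other):
    """Size of `other` relative to `base`, as a signed percentage."""
    return 100.0 * (other - base) / base


def compare_sizes(source, levels=LEVELS, cc="gcc"):
    """Compile `source` once per level; return {level: object size in bytes}."""
    sizes = {}
    with tempfile.TemporaryDirectory() as tmp:
        for level in levels:
            out = os.path.join(tmp, "out" + level + ".o")
            # check=True raises CalledProcessError if the compile fails.
            subprocess.run(build_command(source, level, out, cc), check=True)
            sizes[level] = os.path.getsize(out)
    return sizes


def report(sizes, baseline="-O2"):
    """Format one line per level, relative to the baseline level."""
    base = sizes[baseline]
    return [
        "%-4s %8d bytes (%+.1f%% vs %s)"
        % (level, size, percent_change(base, size), baseline)
        for level, size in sizes.items()
    ]
```

Calling print("\n".join(report(compare_sizes("foo.c")))) on a source file
of your own (foo.c is a placeholder) prints one line per level.  No
expected numbers are given here, since they depend entirely on the code
and the gcc version.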

A few posts down, Linus then posted this comment, from the bottom of the
linked post above:

<quote>

And we should probably make -Os the default. Apparently Fedora
already does that by just forcibly hacking the Kconfig files.

With modern CPU's, instructions are almost "free". The real cost is in
cache misses, and that tends to be doubly true of system software that
tends to have a lot more cache misses than "normal" programs (because
people try hard to batch up system calls like write etc, so by the time
the kernel is called, the L1 cache is mostly flushed already - possibly
the L2 too. And interrupts may be in the "fast path", but they'd sure as
hell better not happen so often that they stay cached very well etc etc).

So -Os probably performs better in real life, and likely only performs
worse on micro-benchmarks. Sadly, micro-benchmarks are often very
instructive in many other ways.

</quote>

(Dave Jones later confirmed that Fedora does indeed normally build with
-Os.)

(If you want to view the entire thread, use the "Go to the topic" link
near the top left of the page.)
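
Linus's argument can be put in back-of-the-envelope form.  The toy model
below charges one cycle per instruction plus a fixed penalty for every
cache line of code that has to be fetched when the path runs cold.  Every
number in it is invented for illustration (64-byte lines, a 200-cycle
miss, and two hypothetical builds of the same function); none of it is
measured or taken from the thread.

```python
def path_cycles(code_bytes, instructions, cold,
                cpi=1.0, miss_penalty=200, line_bytes=64):
    """Rough cycle count for one pass over a code path.

    cold=True  -> every cache line of the code misses once (e.g. kernel
                  entry after userspace has evicted it).
    cold=False -> the code is already cached (a tight micro-benchmark loop).
    All defaults are illustrative assumptions, not measurements.
    """
    lines_touched = -(-code_bytes // line_bytes)  # ceiling division
    misses = lines_touched if cold else 0
    return instructions * cpi + misses * miss_penalty


# Two hypothetical builds of the same function (made-up figures):
O2_BUILD = {"code_bytes": 4096, "instructions": 900}   # bigger, fewer insns
OS_BUILD = {"code_bytes": 3072, "instructions": 1000}  # smaller, more insns

for label, cold in (("cold cache", True), ("warm cache", False)):
    o2 = path_cycles(cold=cold, **O2_BUILD)
    os_ = path_cycles(cold=cold, **OS_BUILD)
    print("%s: -O2 %.0f cycles, -Os %.0f cycles" % (label, o2, os_))
```

With these made-up inputs the -Os build wins when cold (10600 vs 13700
cycles) and loses when warm (1000 vs 900), which is the micro-benchmark
effect Linus describes.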
55 |
|
56 |
So... certainly for kernel and probably for glibc stuff (tho I believe |
57 |
Gentoo kills -Os on glibc compiles unless you hack out that portion of the |
58 |
ebuild, in your own overlay or whatever), -Os is likely to be the best |
59 |
choice. For most of userland, -Os may be best, but to a smaller degree, |
60 |
and performance should be similar with -O2, trading size for medium speed |
61 |
optimizations in what amounts to a wash. The exceptions would likely be |
62 |
media encoders and the like, where the working set is large and in a data |
63 |
streaming environment, and -O3 may make sense. In the general case, |
64 |
however, -O3 likely does NOT make sense, because it's almost always so |
65 |
expensive in size that the gains in speed over -O2 are far outweighed. |
66 |
|
67 |
IMO (and for the kernel anyway, Linus's as well)... |

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

-- 
gentoo-amd64@g.o mailing list