I'm curious how many of you have considered, and actually tried, using
-Os for your programs. It might even apply to parts of the kernel.

I'm not kidding. On half of my C++ programs, usually those that mostly
do heavy processing, there is no discernible difference between -O2 or
-O3 and -Os, except that the -Os executables are smaller. Every now and
then a benchmark will even turn out faster with -Os (though that tends
to be 'do something a lot of times in a row', which of course is mostly
a nonsense benchmark). I have yet to see a case in which -Os performs
noticeably worse; most of the time it's about the same, although that
might just be the sort of thing I use C for.

Code size in itself seems to almost completely balance out the extra
optimisations gcc can do (not that -Os does no optimisation at all),
especially for what in practice are my small, central object files.
That gave me a whole new perspective on how much more central the
cache is to speed these days. I think many more programs could
benefit, actually.

For the heck of it, I just compiled md5 with its standard -O3 and with
-Os. It made a tiny yet [...] of 150 to 200 ms in favour of -Os, on a
scale of 26 seconds. I used kcore, so caching should have had no
effect. (Actually, on a 100MB file in shm, well within cacheability,
-O3 was faster by about the same amount, on a 3 second scale; I did
this on a slower processor, of course.) But md5 isn't the best example,
as it's data crunching and not very complex code. I'd be amazed if it
weren't completely cached either way. The seven hundred bytes or so
saved in executable size (on a scale of 13K) aren't the most important
ever, except that they can be taken to be purely code, which is partly
the point.

So I'm even more interested now. It would amuse me if you would humour
me for a few minutes and compile and run a few of your own C programs
(ones that don't bottleneck on syscalls or I/O) with -Os, and see
whether it makes a difference, and if so in what direction and of what
magnitude.
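For anyone who wants a starting point, here is a minimal sketch of such
a comparison. Everything in it is an assumption made for illustration:
the file name bench.c, the FNV-1a hash used as a stand-in workload (it
is not md5), the buffer size and round count, and the BENCH_MAIN macro
that enables main(). One way to use it would be to build it twice, for
example with `gcc -O3 -DBENCH_MAIN bench.c -o bench-O3` and
`gcc -Os -DBENCH_MAIN bench.c -o bench-Os`, run each a few times, and
compare code sizes with `size bench-O3 bench-Os`.

```c
/* bench.c: hypothetical sketch for comparing -O3 and -Os builds. */
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Stand-in CPU-bound workload (NOT md5): 32-bit FNV-1a over a buffer. */
static uint32_t fnv1a(const unsigned char *buf, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= buf[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}

/* Milliseconds elapsed between two clock readings. */
static double elapsed_ms(struct timespec start, struct timespec end)
{
    return (end.tv_sec - start.tv_sec) * 1000.0
         + (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Candidate timing relative to a baseline, in percent.
   Negative means the candidate was faster. */
static double relative_diff_percent(double baseline_ms, double candidate_ms)
{
    assert(baseline_ms > 0.0);
    return (candidate_ms - baseline_ms) / baseline_ms * 100.0;
}

#ifdef BENCH_MAIN
int main(void)
{
    static unsigned char buf[1 << 20];  /* 1 MiB, an arbitrary choice */
    struct timespec t0, t1;
    uint32_t sum = 0;

    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)(i * 31u);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int round = 0; round < 200; round++) {
        buf[0] = (unsigned char)round;  /* keep each round's input distinct */
        sum += fnv1a(buf, sizeof buf);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Print the result so the compiler cannot discard the work. */
    printf("sum=%08x time=%.1f ms\n", (unsigned)sum, elapsed_ms(t0, t1));
    return 0;
}
#endif
```

Feeding the two printed times into relative_diff_percent() gives the
direction and magnitude in one number. Note that this is exactly the
'do something a lot of times in a row' kind of benchmark, so timings
from a real program are likely to say more.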

Just for kicks, I also compiled my kernel (2.6.7 ck) with -Os. It's a
shame it's hard to measure the speed difference, but it does knock
300K (~12%) off its size. I think I'll run it for a while and see what
happens.

--Bart Alewijnse

--
gentoo-performance@g.o mailing list