Gentoo Archives: gentoo-performance

From: Bart Alewijnse <scarfboy@×××××.com>
To: gentoo-performance@l.g.o
Subject: Re: [gentoo-performance] inline considered harmful
Date: Thu, 22 Jul 2004 14:35:15
In Reply to: Re: [gentoo-performance] inline considered harmful by "Ervin Németh"
I'm curious how many of you have considered and tried using -Os for
your programs; it might even apply to parts of the kernel.

I'm not kidding. On half of my C++ programs, usually those that deal
with mostly hard processing, there is no discernible difference
between -O2 or -O3 and -Os, except that the -Os executables are
smaller. Every now and then a benchmark will turn out faster with -Os
(though that tends to be 'do something a lot of times in a row', which
of course is mostly a nonsense benchmark). I have yet to see a case in
which -Os performs noticeably worse, and most of the time it's about
the same, although that might just be the sort of things I use C for.

That smaller code size in itself seems to almost completely balance
the extra optimisations gcc can do (not that -Os does no optimisation
at all), especially for what in practice are my small, central object
files. This gave me a whole new perspective on how central the cache
is to speed these days. I think many more programs could benefit,
actually.

For the heck of it, I just compiled md5 with its standard O3 and with
Os. It made a tiny difference of 150 to 200 ms in favour of Os, on a
scale of 26 seconds. I used kcore, so caching should have had no
effect. (Actually, on a 100MB file in shm, well within cacheability,
O3 was faster by about the same amount, on a 3 second scale; I did
this on a slower processor, 'course.) But md5 isn't the best example,
as it's data crunching, and not very complex code. I'd be amazed if it
weren't completely cached either way. The seven hundred bytes or so in
executable size difference (on a scale of 13K) saved there aren't the
most important ever, except that it can be taken to be purely in code,
which is partly the
So I'm even more interested now. It would amuse me if you would humour
me for a few minutes and compile and run a few of your own C programs
(that don't bottleneck on syscalls or io) with Os, see whether it
makes a difference, and in what direction and magnitude.
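
One way such a comparison could be run is sketched below. It is only
an illustration, not a rigorous benchmark: it assumes the same program
has already been built twice (for example with `gcc -O3` and with
`gcc -Os`), and the script name, binary names and helper functions are
all made up for the example. It runs each binary several times and
compares the median wall-clock time, which helps a little against
one-off noise when the difference is well under one percent.

```python
import statistics
import subprocess
import sys
import time

# Assumed builds (names are hypothetical):
#   gcc -O3 -o prog_O3 prog.c
#   gcc -Os -o prog_Os prog.c


def time_command(argv, runs=5):
    """Run argv `runs` times; return the wall-clock seconds of each run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard stdout so terminal speed does not pollute the timing.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        samples.append(time.perf_counter() - start)
    return samples


def relative_change(baseline, candidate):
    """Percent change of candidate versus baseline (negative = faster)."""
    return (candidate - baseline) / baseline * 100.0


def main(argv):
    if len(argv) < 3:
        print("usage: compare.py O3_BINARY OS_BINARY [ARGS...]")
        return
    o3_binary, os_binary, *args = argv[1:]
    t_o3 = statistics.median(time_command([o3_binary, *args]))
    t_os = statistics.median(time_command([os_binary, *args]))
    change = relative_change(t_o3, t_os)
    print(f"-O3: {t_o3:.3f}s  -Os: {t_os:.3f}s  change: {change:+.2f}%")


if __name__ == "__main__":
    main(sys.argv)
```

Invoked as, say, `python3 compare.py ./prog_O3 ./prog_Os somefile`, it
prints both medians and the percentage change; a negative change means
the -Os build was faster. For scale, 200 ms on a 26 second run is a
change of under one percent.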

Just for kicks, I just compiled my kernel (2.6.7 ck) with Os. It's a
shame it's hard to measure the speed difference, but it does knock
300K (~12%) off its size. I think I'll run it for a while, see what

--Bart Alewijnse
