Gentoo Archives: gentoo-performance

From: Paul de Vrieze <pauldv@g.o>
To: gentoo-performance@l.g.o
Subject: Re: [gentoo-performance] performance testing
Date: Thu, 10 May 2007 04:40:53
In Reply to: Re: [gentoo-performance] performance testing by Miguel Sousa Filipe
Miguel Sousa Filipe wrote:
> A good CFLAGS would be something not very agressive, something like:
> -march=<cpuType> -O3 or -O2 and at most -fomit-frame-pointer.
> (Scientific workloads can speedup considerably with: -ffast-math)

You must be careful with CFLAGS. -ffast-math should never be used globally; applications that benefit from it specify it themselves. The most realistic choice is -O2.

The remark about processors being optimized for i386 is a bit off the mark. What -march actually does is allow the compiler to use additional instructions that are faster (and available) on those processors. Especially mmx{,2} and sse{,2,3} are very beneficial to certain kinds of applications and can give big speedups. That said, many applications that are sensitive to this, such as mplayer, can detect the CPU at run time and select different code (often written in assembler) accordingly.

In general it is not wise to use -O3. Most distributions compile with -O2, so -O3 is less well tested and may expose bugs in both the compiler and (more importantly) the applications.

Besides the instructions that are available to the compiler (influenced by -march), there is the issue of instruction scheduling. To put it simply, all modern processors can process multiple instructions in parallel when they do not conflict. The precise architectures differ, and the ability to parallelize depends quite a lot on how the instructions are scheduled. This is influenced by the -mtune flag (set implicitly by -march). This optimization is most likely also present in a binary distribution; Gentoo, however, can optimize for the precise processor used.

Paul

--
gentoo-performance@g.o mailing list
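
For illustration, the conservative settings discussed in this message might look roughly like this in Portage's make.conf. This is only a sketch: athlon64 is a hypothetical CPU type picked as an example (substitute the -march value that matches your own processor), and reusing CFLAGS for CXXFLAGS is a common convention rather than something stated in this thread.

```shell
# Conservative global flags: -O2 plus the instruction set of the target CPU.
# "athlon64" is an assumed example value; -march also sets -mtune implicitly.
CFLAGS="-march=athlon64 -O2"

# Assumption: C++ code is built with the same flags as C code.
CXXFLAGS="${CFLAGS}"
```

Note that -ffast-math and -O3 are deliberately left out of the global flags, in line with the advice above.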