Gentoo Archives: gentoo-performance

From: Paul de Vrieze <pauldv@g.o>
To: gentoo-performance@l.g.o
Subject: Re: [gentoo-performance] performance testing
Date: Thu, 10 May 2007 04:40:53
Message-Id: 4642A19B.2030406@gentoo.org
In Reply to: Re: [gentoo-performance] performance testing by Miguel Sousa Filipe
1 Miguel Sousa Filipe wrote:
2 >
3 > A good CFLAGS would be something not very aggressive, something like:
4 > -march=<cpuType> -O3 or -O2 and at most -fomit-frame-pointer.
5 > (Scientific workloads can speed up considerably with -ffast-math.)
6
7 You must be careful with CFLAGS. -ffast-math should never be used
8 globally; applications that benefit from it specify it themselves. The
9 most realistic choice is -O2.
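
One reason for the caution about -ffast-math, going by GCC's documentation, is that it allows the compiler to reassociate floating-point arithmetic, and floating-point addition is not associative. The following is only a minimal sketch of that effect; the function names are made up for illustration and it is compiled without -ffast-math, so the two orderings are written out by hand:

```c
/* Add three doubles strictly left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Same three values, grouped the other way: a + (b + c).
 * A compiler that is allowed to reassociate may turn one
 * form into the other. */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0 the first form yields 1.0 and the second yields 0.0, because 1.0 is lost when it is added to -1e20. Code that relies on a particular evaluation order can therefore change its results once the compiler is free to regroup.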
10
11 The remark about processors being optimized for i386 is a bit off.
12 What -march actually does is allow the compiler to use additional
13 instructions that are available (and faster) on those processors. mmx{,2}
14 and sse{,2,3} in particular are very beneficial to certain kinds of
15 applications and can give big speedups. Keep in mind, however, that many
16 applications that are sensitive to this, such as mplayer, do run-time CPU
17 detection and select different code (often written in assembler) for the CPU they find.
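
As a rough sketch of what run-time CPU detection can look like (this is not how mplayer does it): the example below assumes GCC 4.8 or later, or a compatible clang, on x86, where the builtins __builtin_cpu_init and __builtin_cpu_supports are available. The names sum_generic, sum_sse2 and pick_sum are hypothetical, and the "SSE2" variant is just the same loop compiled for that target rather than hand-written assembler:

```c
#include <stddef.h>

typedef float (*sum_fn)(const float *v, size_t n);

/* Portable fallback: plain scalar loop, built with the global flags. */
static float sum_generic(const float *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Stand-in for an optimized variant: the same loop, but the compiler
 * may use SSE2 instructions for this one function only. */
__attribute__((target("sse2")))
static float sum_sse2(const float *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}
#endif

/* Choose an implementation once, based on the CPU we are running on. */
sum_fn pick_sum(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sum_sse2;
#endif
    return sum_generic;
}
```

A caller would fetch the function pointer once at startup and use it from then on. Such a binary uses the faster path on capable CPUs even when it was built with conservative global flags, which is why -march matters less for applications of this kind.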
18
19 In general it is not wise to use -O3. Most distributions compile with
20 -O2. -O3 therefore is less well tested and may expose bugs in both the
21 compiler and (more importantly) the applications.
22
23 Besides the instructions that are available to the compiler (influenced
24 by -march), there is the issue of instruction scheduling. Put simply,
25 all modern processors can process multiple instructions in parallel
26 when they do not conflict. The precise architecture differs between
27 processors, and how much can run in parallel depends heavily on how the
28 instructions are scheduled. This is influenced by
29 the -mtune flag (set implicitly by -march). This optimization is most
30 likely also present in a binary distribution. Gentoo, however, can
31 optimize for the precise processor used.
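
To make "instructions that do not conflict" a little more concrete, here is a small sketch with hypothetical function names. It shows the dependency structure only; whether the second form is actually faster depends on the processor and on what the compiler already does by itself, and nothing here was measured:

```c
#include <stddef.h>

/* One accumulator: each addition needs the result of the previous
 * one, so the additions form a single dependency chain. */
unsigned long sum_chained(const unsigned long *v, size_t n)
{
    unsigned long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two accumulators: the additions into a and b do not depend on
 * each other, so a processor may execute them in parallel. */
unsigned long sum_split(const unsigned long *v, size_t n)
{
    unsigned long a = 0, b = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)          /* odd element count: one value left over */
        a += v[i];
    return a + b;
}
```

Both functions return the same sum. The scheduling that -mtune influences is the compiler's side of this: ordering the instructions it emits so that independent ones sit where a particular processor can overlap them.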
32
33 Paul
34
35 --
36 gentoo-performance@g.o mailing list