Gentoo Archives: gentoo-dev

From: Ryan Hill <dirtyepic.sk@×××××.com>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Re: Recommended -march settings [was: Re: CFLAGS paragraph submission for the GWN]
Date: Sun, 15 Oct 2006 05:49:01
Message-Id: egshu4$gfp$1@sea.gmane.org
In Reply to: Re: Recommended -march settings [was: Re: [gentoo-dev] CFLAGS paragraph submission for the GWN] by Mike Frysinger
Mike Frysinger wrote:
> On Saturday 14 October 2006 04:49, Sebastian Bergmann wrote:

>> CFLAGS="-march=prescott -O2 -pipe -fomit-frame-pointer"

> here's a good reason why gentoo-wiki is not official ... this is wrong. the
> duo cpu's are not based on the pentium4 which is what the prescott is

That was put there by me. The thing is, while the Core CPUs have more
in common with the Pentium-M micro-architecture, -march=pentium-m
strongly favors generating x87 over SSE/SSE2 instructions, since on a
Pentium-M doing SSE/SSE2 was somewhere in the neighbourhood of 30%
slower. Core chips have improved decoding and use micro-op fusion to
combine up to four SSE instructions. Also, with code doing fp to int
conversion or single precision division, SSE scalar ops are a win on
Core (though not as big a win as on Netburst), since x87 instructions
have to write data to memory and read it again to reduce precision.

Based on that I've been doing benchmarks with GCC 4.1 and trunk, and I
usually find '-march=prescott -mfpmath=sse' to do a bit better than
'-march=pentium-m -mfpmath=sse -msse3'. Plain '-march=prescott' is
nearly identical to plain '-march=pentium-m' (for those ebuilds that
call strip-flags ;), though the latter is on average <=1% faster.
'-march=pentium-m -msse3' has actually been the worst performer, though
I have no idea why it's slower than just '-march=pentium-m'. To be
honest I don't really trust GCC's SSE3 support in its current state.

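For anyone who wants to try the two candidates side by side, here is a
rough sketch of how they might look in make.conf. The -O2, -pipe and
-fomit-frame-pointer flags are simply carried over from the line quoted
above, and the CXXFLAGS line is my own assumption, not something I
benchmarked separately. Treat it as a starting point rather than a
recommendation.

```shell
# Hypothetical make.conf fragment for a Core CPU.
# Keep exactly ONE CFLAGS line active; the other stays commented out.

# Option A: Prescott target, with SSE used for floating-point math.
CFLAGS="-march=prescott -mfpmath=sse -O2 -pipe -fomit-frame-pointer"

# Option B: Pentium-M target, with SSE3 and SSE math enabled explicitly.
# CFLAGS="-march=pentium-m -mfpmath=sse -msse3 -O2 -pipe -fomit-frame-pointer"

# Assumption: reuse the same flags for C++ code.
CXXFLAGS="${CFLAGS}"
```

Ebuilds that call strip-flags may drop -mfpmath=sse and -msse3, in
which case only the -march part of either option takes effect.
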
I've looked long and hard for an official answer to this, but no one
seems to be able to give a concrete reason why one is better than the
other, other than "it's based on the Pentium-M". It _is_, but it's
still a very different animal. Until I got a definite answer I thought
it'd be best to document both.


--de.
