Gentoo Archives: gentoo-user

From: Volker Armin Hemmann <volkerarmin@××××××××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] USE="mmx mmxext sse sse2 ssse3 3dnow 3dnowext"
Date: Fri, 29 May 2009 06:36:23
Message-Id: 200905290836.15586.volkerarmin@googlemail.com
In Reply to: Re: [gentoo-user] USE="mmx mmxext sse sse2 ssse3 3dnow 3dnowext" by Graham Murray
1 On Friday 29 May 2009, Graham Murray wrote:
2 > Stroller <stroller@××××××××××××××××××.uk> writes:
3 > > But, surely "-march=" also instructs gcc to support the additional
4 > > instructions. Suggest you re-read Daniel's post that I was replying
5 > > to.
6 > >
7 > > What's the difference between supporting the "certain set of
8 > > instructions" with "-march=" and doing so with USEs?
9 > >
10 > > Or doesn't "-march=" support additional "certain sets of
11 > > instructions". What does it do, then?
12 >
13 > I am not sure,
14 >
15 > $ gcc -Q --help=target -march=core2
16 > The following options are target specific:
17 > -m128bit-long-double [disabled]
18 > -m32 [enabled]
19 > -m3dnow [disabled]
20 > -m3dnowa [disabled]
21 > -m64 [disabled]
22 > -m80387 [enabled]
23 > -m96bit-long-double [enabled]
24 > -mabm [disabled]
25 > -maccumulate-outgoing-args [disabled]
26 > -maes [disabled]
27 > -malign-double [disabled]
28 > -malign-functions=
29 > -malign-jumps=
30 > -malign-loops=
31 > -malign-stringops [enabled]
32 > -march= core2
33 > -masm=
34 > -mavx [disabled]
35 > -mbranch-cost=
36 > -mcld [disabled]
37 > -mcmodel=
38 > -mcx16 [disabled]
39 > -mfancy-math-387 [enabled]
40 > -mfma [disabled]
41 > -mforce-drap [disabled]
42 > -mfp-ret-in-387 [enabled]
43 > -mfpmath=
44 > -mfused-madd [enabled]
45 > -mglibc [enabled]
46 > -mhard-float [enabled]
47 > -mieee-fp [enabled]
48 > -mincoming-stack-boundary=
49 > -minline-all-stringops [disabled]
50 > -minline-stringops-dynamically [disabled]
51 > -mintel-syntax [disabled]
52 > -mlarge-data-threshold=
53 > -mmmx [disabled]
54 > -mms-bitfields [disabled]
55 > -mno-align-stringops [disabled]
56 > -mno-fancy-math-387 [disabled]
57 > -mno-fused-madd [disabled]
58 > -mno-push-args [disabled]
59 > -mno-red-zone [disabled]
60 > -mno-sse4 [enabled]
61 > -momit-leaf-frame-pointer [disabled]
62 > -mpc
63 > -mpclmul [disabled]
64 > -mpopcnt [disabled]
65 > -mpreferred-stack-boundary=
66 > -mpush-args [enabled]
67 > -mrecip [disabled]
68 > -mred-zone [enabled]
69 > -mregparm=
70 > -mrtd [disabled]
71 > -msahf [disabled]
72 > -msoft-float [disabled]
73 > -msse [disabled]
74 > -msse2 [disabled]
75 > -msse2avx [disabled]
76 > -msse3 [disabled]
77 > -msse4 [disabled]
78 > -msse4.1 [disabled]
79 > -msse4.2 [disabled]
80 > -msse4a [disabled]
81 > -msse5 [disabled]
82 > -msseregparm [disabled]
83 > -mssse3 [disabled]
84 > -mstack-arg-probe [disabled]
85 > -mstackrealign [enabled]
86 > -mstringop-strategy=
87 > -mtls-dialect=
88 > -mtls-direct-seg-refs [enabled]
89 > -mtune=
90 > -muclibc [disabled]
91 > -mveclibabi=
92
93 and man gcc says:
94 --help={class|[^]qualifier}[,...]
95 Print (on the standard output) a description of the command line
96 options understood by the compiler that fit into all specified
97 classes and qualifiers. These are the supported classes:
98
99 target
100 This will display target-specific options. Unlike the --target-help
101 option however, target-specific options of the linker
102 and assembler will not be displayed. This is because those
103 tools do not currently support the extended --help= syntax.
104
105 Which leads to the conclusion that it only shows the options that can be set,
106 not how they are actually set (apart from a few that are enabled by the
107 architecture). If you leave out -march=core2, you will probably get the same
108 result.
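
One way to check that guess rather than assume it is to run the query twice and diff the two outputs. The following is only a rough sketch: the function name compare_target_help and its two arguments are made up for illustration, and it assumes a gcc on PATH that accepts -Q --help=target as in the quoted command.

```shell
# Hypothetical helper: diff the "-Q --help=target" output with and
# without -march=<cpu>. $1 = compiler command (default gcc),
# $2 = cpu type (default core2).
compare_target_help() {
    cc=${1:-gcc}
    march=${2:-core2}
    tmpdir=$(mktemp -d) || return 1
    "$cc" -Q --help=target > "$tmpdir/default" 2>/dev/null
    "$cc" -Q --help=target "-march=$march" > "$tmpdir/march" 2>/dev/null
    # diff prints the differing lines; if there are none, say so.
    if diff "$tmpdir/default" "$tmpdir/march"; then
        echo "no difference"
    fi
    rm -rf "$tmpdir"
}

# Only run the real comparison if a gcc is actually installed.
if command -v gcc >/dev/null 2>&1; then
    compare_target_help gcc core2
fi
```

If the only line that differs is the -march= one, that would support the reading above; if the -msse* lines change as well, it would not.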
109
110 For example:
111 -mfpmath=unit
112 Generate floating point arithmetics for selected unit unit. The
113 choices for unit are:
114
115 387 Use the standard 387 floating point coprocessor present
116 majority of chips and emulated otherwise. Code compiled with this
117 option will run almost everywhere. The temporary results are
118 computed in 80bit precision instead of precision specified by
119 the type resulting in slightly different results compared to
120 most of other chips. See -ffloat-store for more detailed
121 description.
122
123 This is the default choice for i386 compiler.
124
125 sse Use scalar floating point instructions present in the SSE
126 instruction set. This instruction set is supported by Pentium3
127 and newer chips, in the AMD line by Athlon-4, Athlon-xp and
128 Athlon-mp chips. The earlier version of SSE instruction set
129 supports only single precision arithmetics, thus the double and
130 extended precision arithmetics is still done using 387.
131 Later version, present only in Pentium4 and the future AMD
132 x86-64 chips supports double precision arithmetics too.
133
134 For the i386 compiler, you need to use -march=cpu-type, -msse
135 or -msse2 switches to enable SSE extensions and make this
136 option effective. For the x86-64 compiler, these extensions
137 are enabled by default.
138
139 The resulting code should be considerably faster in the
140 majority of cases and avoid the numerical instability problems of
141 387 code, but may break some existing code that expects
142 temporaries to be 80bit.
143
144 This is the default choice for the x86-64 compiler.
145
146 As you can see from the man page excerpt, if the help output showed the options
147 that are actually enabled, -mfpmath=sse should be listed. It isn't.
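
A different way to see what a given -march= really switches on, which does not rely on the --help output at all, is to ask the preprocessor for its predefined macros with -dM -E. This is just a sketch: the helper name isa_macros is made up, the macro pattern is my own selection and may miss some, and it assumes an x86 gcc on PATH. As far as I know, gcc defines __SSE_MATH__ or __SSE2_MATH__ when SSE floating point math is in effect, so those would be the ones to look for regarding -mfpmath.

```shell
# Hypothetical helper: keep only the instruction-set related macros
# from "gcc -dM -E" style input on stdin (lines like "#define __SSE2__ 1").
isa_macros() {
    awk '$1 == "#define" && $2 ~ /^__(MMX|SSE[0-9A-Z_]*|SSSE3|3dNOW(_A)?)__$/ { print $2 }' |
        LC_ALL=C sort
}

# Only run the real query if a gcc is actually installed.
if command -v gcc >/dev/null 2>&1; then
    gcc -march=core2 -dM -E -x c /dev/null | isa_macros
fi
```

Running the same pipeline without -march=core2 and comparing the two lists shows which instruction sets the switch adds on top of the compiler's defaults.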
148
149 ...