On Tue, Dec 9, 2008 at 1:23 PM, Sami Näätänen <sn.ml@××××××××××××.fi> wrote:
> So hi from an amd64 newbie. Not so newbie with Gentoo though. :)
>
> My system is an Intel quad core core2 with a 2.4 GHz clock speed coupled with
> 4GB of memory. No overclocking etc. Want this to be stable. :)
>
> I'm just curious what people use as their stable CFLAGS in amd64 Gentoo?
> (Sorry if this has been up lately, but I just switched to 64bit env so...)
>
>
> Here is mine and some explanation of why (And I use ~arch system with gcc 4.3)
>
> The flags are in the order they are used in my CFLAGS and CXXFLAGS.
>
> Gives a stable base
> -O2
>
> Want to optimize for my system, but don't want "native"
> -march=core2
>
> If some ebuilds filter -march, this will still optimize caching etc. for my system
> -mtune=core2
>
> Faster floating point math and better chance of vectorization
> -mfpmath=sse
>
> These because -march might get filtered
> -mmmx -msse -msse2 -msse3 -mssse3
>
> For loop vectorization
> -ftree-vectorize
>
> Just to get some idea of how many vectorized loops there will be.
> By the way, I was surprised by the number of "LOOP VECTORIZED" notes in the
> compile output. And I have only seen a couple of two versions
> -ftree-vectorizer-verbose=1
>
> Of course I don't want temp files :)
> -pipe
>
>
> I don't use any loop unrolling etc, because it would only add to the code size.
> I'm not so brave that I would dare to use -Os.
>
> So what are your experiences and reasoning behind what you do?
> Any benchmarks or so?
>
>
> PS. If you see the same post without this added postscript, just ignore it; it's
> the same post, but I forgot to change my default identity for this ML.
>

Dear Sami,

I have a Q9300 and used this:

CFLAGS="-march=nocona -O2 -pipe"
CXXFLAGS="${CFLAGS}"
USE="mmx sse sse2 <snip>"

The stable gcc version is 4.2.x. I switched to 4.3.2 by adding

>=sys-devel/gcc-4.3.2
>=sys-libs/glibc-2.7-r2

to /etc/portage/package.keywords. With 4.3.2 I use:

CFLAGS="-march=native -O2 -pipe"
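
If you are curious what -march=native actually resolves to on a given box, something like the following should show it. This is only a sketch: show_target_flags is a made-up helper name, the grep pattern is just one way to trim the output, and as far as I know the -Q --help=target query needs gcc 4.3 or newer.

```shell
#!/bin/sh
# Sketch: list what a given -march value turns on, according to the compiler.
# show_target_flags is a hypothetical helper name, not a Gentoo or GCC tool.
show_target_flags() {
    cc=${CC:-gcc}
    # Bail out if no such compiler is on PATH.
    command -v "$cc" >/dev/null 2>&1 || { echo "$cc not found" >&2; return 1; }
    # -Q --help=target prints the target-specific options and their state;
    # keep only the -march/-mtune lines and the SSE-family switches.
    "$cc" -march="$1" -Q --help=target 2>/dev/null |
        grep -E -- '-march=|-mtune=|-msse|-mssse3'
}

show_target_flags native || echo "could not query the compiler"
```

Running it once with "native" and once with "core2" and comparing the two listings is a quick way to check whether spelling out the individual -msse* flags buys anything.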

With only a small effort, you get most of the benefits. So fine-tuning
to the edge will give you issues to solve with only a very small
percentage of performance increase in return.
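
On the worry in the quoted post that an ebuild might filter -march: the rough effect can be sketched in plain shell. This is an illustration only, not Portage's flag-o-matic code; strip_flags is a hypothetical helper, and the flag list is the one from the quoted post (minus the verbose switch).

```shell
#!/bin/sh
# Simplified stand-in for an ebuild filtering a flag out of CFLAGS.
# strip_flags is a hypothetical helper, NOT the real Portage implementation.
strip_flags() {
    # $1: shell glob of flags to drop; remaining arguments: the flags
    pattern=$1
    shift
    out=""
    for flag in "$@"; do
        case $flag in
            $pattern) ;;                          # drop the matching flag
            *) out="${out:+$out }$flag" ;;        # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

CFLAGS="-O2 -march=core2 -mtune=core2 -mfpmath=sse -mmmx -msse -msse2 -msse3 -mssse3 -ftree-vectorize -pipe"

# Word splitting of $CFLAGS is intentional: one argument per flag.
FILTERED=$(strip_flags '-march=*' $CFLAGS)
echo "$FILTERED"
```

With -march gone, -mtune=core2 and the explicit -msse* switches are still in the string, which is the reasoning behind listing them separately. Whether that is worth the extra upkeep is another matter.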

My 2 cents..

Martin