Duncan wrote:
>
> Well, you say you want stable, but then say you use ~arch, so I see
> you're not too stick in the mud. =:^)
>
> Here's mine, for a dual Opteron 290:
>
> CFLAGS="-march=opteron-sse3 -pipe -O2 -frename-registers -fweb
> -fmerge-all-constants -fgcse-sm -fgcse-las -fgcse-after-reload
> -ftree-vectorize -fdirectives-only -freorder-blocks-and-partition
> -combine"
>
> CXXFLAGS="-march=opteron-sse3 -pipe -O2 -frename-registers -fweb
> -fmerge-all-constants -fgcse-sm -fgcse-las -fgcse-after-reload
> -ftree-vectorize -fdirectives-only"
>
> You can look them up in the gcc manpage, or look back a year or so when I
> explained most of them, altho that was a couple gcc versions ago and they
> weren't quite the same.
>
<SNIP>
|
Been there, done practically that, but it didn't make one quark of
difference overall, except for sending gcc into a coma now and then,
lengthening compile times and causing odd (but rare) bugs.
|
I tried to time several C programs of mine and found that plain -O1 |
worked substantially better than plain -O2. |
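
For anyone who wants to repeat that kind of comparison, here is a rough
sketch of a timing harness. It is not the script I used; the function
names, the -march=native default and the bench_* binary names are all
assumptions made for illustration. It builds one C file at several -O
levels and reports the fastest wall-clock run of each binary.

```python
import subprocess
import sys
import time


def build_command(source, opt_level, march="native", output="a.out"):
    """Return a gcc command line for one optimisation level (e.g. "O2").

    march="native" is an assumed default; substitute the right -march.
    """
    return ["gcc", f"-march={march}", "-pipe", f"-{opt_level}",
            source, "-o", output]


def best_wall_time(cmd, runs=5):
    """Run cmd `runs` times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(source, levels=("O1", "O2", "O3"), runs=5):
    """Build `source` once per level and time each resulting binary."""
    results = {}
    for level in levels:
        binary = f"./bench_{level}"  # hypothetical output name
        subprocess.run(build_command(source, level, output=binary), check=True)
        results[level] = best_wall_time([binary], runs)
    return results


def main(argv):
    # Expect exactly one argument: the C source file to benchmark.
    if len(argv) != 1:
        print("usage: bench.py program.c")
        return
    for level, seconds in compare(argv[0]).items():
        print(f"-{level}: {seconds:.3f} s")


if __name__ == "__main__":
    main(sys.argv[1:])
```

Taking the fastest of several runs damps noise from other processes, but
the result still only tells you about that one program on that one
machine.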
|
After that, I said sod it and went back to plain vanilla CFLAGS on a
recent gcc with the right -march. Works fine, with the same speed, faster
compiles and far fewer headaches on average.
|
In my experience, exotic CFLAGS can make a difference, but this varies
wildly from one part of a program to another, so unless one knows exactly
what he is doing, he is probably better off trusting the compiler to take
a sane path with -O2. Besides, portage has no option to compile just one
part of the code with different, non-default CFLAGS...
|
> But my basic strategy is this: Because memory is so much slower than
> cache on a modern processor, in general it should pay to optimize for
> size even if it costs a few CPU cycles once in awhile.
True, but he is asking about a P4, which was notorious for its long
pipeline and the headache that follows a cache miss, so for him -O2 or
even -O3 might be better in _some_ cases.
But even so, IMVHO it is simply not worth the time and effort to fiddle
with this; I'd use the golden default with the right -march here too and
be done with it.