Duncan wrote:
> Daniel Iliev <danny@××××××××.com> posted 451A110B.2080401@××××××××.com,
> excerpted below, on Wed, 27 Sep 2006 08:50:03 +0300:
>
>> So let me start with 2 newbie questions caused by my first impressions
>> from the x86_64 world:
>>
>> 1) I use CFLAGS="-march=athlon64 -mfpmath=sse -msse -msse2 -msse3
>> -m3dnow -mmmx -O3 -fomit-frame-pointer -pipe -fpic". Portage complains
>> with *red letters* about the fpic flag. Every time I emerge something it
>> says that "fpic breaks things", but I haven't met a single breakage so
>> far. Is that a bug? Actually there was an ebuild which could not be
>> compiled if mysql was compiled w/o "fpic". I'm not 100% sure but AFAIR
>> it was dev-perl/DBD-mysql.
>>
>> 2) I see too many flags that are disabled by the profile - the kind with
>> parentheses around them, like "(-3dnow)". Why? As I mentioned above
>> I enable some of these through my CFLAGS - e.g. (-mmx), (-mmxext),
>> (-sse) and (-sse2) and everything works perfectly.
>>
>
> It seems that you missed some of the Gentoo/AMD64 documentation.
> Many/most of your questions are answered there. Unfortunately, I'm not
> aware of a simple, easy-to-use list of everything in one spot, so it's
> reading a bit of documentation here, a bit more there, etc.
>
> The main Gentoo/AMD64 project page. (This would be the logical place for
> such a list, but it's more the project page; tho it links some of the
> docs, those links aren't as easy to find as they could be.)
> http://amd64.gentoo.org
>
> Gentoo/AMD64 FAQ:
> http://www.gentoo.org/doc/en/gentoo-amd64-faq.xml
>
> Gentoo/AMD64 HOWTOs. (There's one on -fPIC here, tho the explanation is
> a bit developer-centric.)
> http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml
>
> A brief direct answer to your questions follows:
>
> * The sse etc CFLAGS are arch dependent. Unlike x86, where the
> mmx/sse/other-extensions instructions were added as the arch matured, on
> amd64 they are part of the definition of the arch itself. All x86_64
> (amd64) CPUs will have mmx/sse/sse2, etc. Thus, -march=athlon64 already
> tells gcc these are available to use where it wants/needs to. The other
> flags therefore don't give gcc any information it doesn't already have.
>
> * -fomit-frame-pointer isn't needed on 64-bit amd64 either, as it's turned
> on for all -O levels on archs (including amd64) where doing so doesn't
> interfere with debugging. (See the gcc manpage, under -O optimization.)
> You may wish to continue to specify it for stuff that's compiled for
> 32-bit, however, including parts of gcc, a version of glibc, a version of
> the (portage) sandbox library, etc.
>
> * Generally speaking, -fPIC is required on amd64 for ALL LIBRARIES, but the
> ebuilds normally take care of it. Under certain circumstances (like
> unsupported CFLAGS), the configure scripts will turn it off by mistake; see
> the above mentioned -fPIC HOWTO link for details. The solution isn't to
> add it to your CFLAGS, however, as that means it will be used for
> executable applications as well as libraries, and /some/ applications /do/
> break with it. Not many, but some, and if it's in your CFLAGS, you WILL
> have bugs you file closed as INVALID or the like, due to CFLAG abuse. If
> there's something not working without it, then THAT'S a bug and should be
> filed as such (unless it's due to use of CFLAGS gcc doesn't support and
> warns about, thus triggering the configure script detection problem
> discussed above and in the HOWTO).
>
> * The profile "disabled" USE flags are simply hard-locked either on or
> off by the profile, so they aren't a USE flag option. It does NOT mean
> whatever the USE flag controls is actually disabled. Sometimes, as with the
> multilib USE flag, it can mean it's /enabled/. It just means that the
> profile is set up to control it, generally for a pretty good reason. In
> the particular cases you mention, the way Gentoo uses the SSE and similar
> USE flags is 32-bit specific, enabling 32-bit specific assembler code in
> the ebuild, for instance. As already mentioned, the AMD64 arch by
> definition already has these features activated, so no 64-bit USE flags
> are necessary, and enabling the 32-bit USE flags will cause breakage since
> it activates 32-bit specific code in many instances. Thus the amd64
> profiles have a /very/ good reason to hard-lock these USE flags "off". An
> example where a USE flag is hard-locked ON by a profile would be multilib.
> The normal AMD64 profiles are all multilib and thus lock this flag ON (tho
> it's still shown as disabled), while 64-bit-only profiles lock it OFF.
>
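Put together, the points in the list above suggest a much shorter make.conf. A minimal sketch, assuming the -march, -O3 and -pipe settings from the original question are kept as they were and that CXXFLAGS simply mirrors CFLAGS (that last part is my assumption, it isn't stated above):

```shell
# make.conf fragment (illustrative sketch only).
# -march=athlon64 stays; per the list above it already tells gcc that
# mmx/sse/sse2 are available, so the separate -m* flags are dropped.
# -fomit-frame-pointer is dropped because the -O levels enable it on amd64.
# -fpic is dropped: the ebuilds add -fPIC for libraries where it's needed.
CFLAGS="-march=athlon64 -O3 -pipe"
# Assumption: C++ code is built with the same flags as C code.
CXXFLAGS="${CFLAGS}"
```

The -O level is left exactly as in the question; this sketch only removes the flags the list describes as redundant or harmful.
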
> A couple of other notes:
>
> Portage now supports per-package CFLAGS and certain other variables as
> controlled by the environment (as long as they are used in an ebuild.sh
> phase, not the python phase, since execution is via a bashrc hook).
> Create /etc/portage/env/<category> as a directory, populated with package
> or package-version files. The contents of these files will be sourced
> into the ebuild.sh execution environment for every phase that uses
> ebuild.sh. CFLAGS and similar variables found in these files REPLACE
> (that is, they don't add to, they replace entirely) the default make.conf
> CFLAGS. You can use this mechanism to set specific CFLAGS for specific
> packages, and could thus set -fomit-frame-pointer and other 32-bit x86
> specific CFLAGS here if desired, avoiding them in your regular make.conf.
>
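As a rough illustration of the per-package mechanism described above, this is what one such file might contain. The flag values are placeholders of mine, not anything taken from this thread, and the file would live at /etc/portage/env/<category>/<package> with the real category and package names filled in:

```shell
# Contents of /etc/portage/env/<category>/<package> (hypothetical example).
# These values REPLACE the make.conf ones for this package, so the full
# flag set is repeated rather than only the extra flag being listed.
CFLAGS="-march=athlon64 -O2 -pipe -fomit-frame-pointer"
# Assumption: CXXFLAGS should follow CFLAGS for this package.
CXXFLAGS="${CFLAGS}"
```

Since the file is sourced as shell, plain variable assignments like these should be all it needs.
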
> You may wish to read a bit of the archives for this list, in particular,
> the recent threads on gcc 4.1.1 CFLAGS, where I discuss mine.
> Specifically, it's likely -O3 is actually /worse/ performing in many
> instances than -O2 or even -Os (my choice). The reasoning is this: CPU
> cycles are fairly cheap in a modern processor, while the expense of
> waiting on main memory in the case of a cache miss is MUCH HIGHER, due to
> the fact that main memory is clocked so much slower than cache. Smaller
> code fits in cache better and is thus often faster than larger code, even
> when the smaller code isn't as theoretically CPU cycle efficient. While
> there will certainly be some applications where -O3 is beneficial, I
> believe if you do actual comparisons, you will find -O2 or -Os faster on a
> system-wide basis.
>
> Of course, it's up to you, and much virtual ink has been spilled
> discussing this issue; that's just my take on things. If you've actually
> done speed comparisons on AMD64 or can point to some, I'd certainly be
> interested, as I've honestly not cared enough about it to do my own, but
> that's my general take in the absence of specific hard data to the
> contrary.
>
> Rather than optimizing for CPU cycles (-O3), I choose to optimize for
> better register usage (-frename-registers etc., registers being at full
> CPU speed and therefore faster even than L1 cache), size (-Os, disabling
> loop unrolling), whole and multiple unit optimization (-funit-at-a-time,
> -combine) and hot/cold partitioning (-freorder-blocks-and-partition, tho
> it can't be used on C++ code, etc). A few of my flags fail on a very few
> specific packages, another use for the package specific CFLAGS stuff above.
>
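For anyone who wants to see the size-oriented approach as a concrete setting, here is a sketch of a make.conf fragment built from the flags named above. It is assembled from the description, not copied from Duncan's actual make.conf, and the flag names are those of the gcc 4.1 series under discussion, so check the manpage of the gcc you actually run before borrowing any of them. Leaving -combine out of CXXFLAGS is my assumption (I understand it to be C-only in that series).

```shell
# make.conf fragment (sketch for gcc 4.1-era toolchains; not a tested recommendation).
# -Os: optimize for size; -frename-registers: register usage;
# -funit-at-a-time and -combine: whole/multiple unit optimization;
# -freorder-blocks-and-partition: hot/cold partitioning.
CFLAGS="-march=athlon64 -Os -pipe -frename-registers -funit-at-a-time -combine -freorder-blocks-and-partition"
# C++ flags omit -freorder-blocks-and-partition, which the text says can't be
# used on C++ code. Dropping -combine as well is an assumption.
CXXFLAGS="-march=athlon64 -Os -pipe -frename-registers -funit-at-a-time"
```

Packages that fail to build with a set like this are candidates for a per-package override under /etc/portage/env.
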
Very detailed answer! Thank you!

Yes, you are right. I missed the "AMD64 HowTo" documentation. I found
only the FAQ, via Google, which was the easiest (fastest) way to get
some answers. ;-)

Thank you all.

--
gentoo-amd64@g.o mailing list