Gentoo Archives: gentoo-amd64

From: Daniel Iliev <danny@××××××××.com>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Re: First Impressions
Date: Wed, 27 Sep 2006 10:36:12
Message-Id: 451A539F.3040602@ilievnet.com
In Reply to: [gentoo-amd64] Re: First Impressions by Duncan <1i5t5.duncan@cox.net>
1 Duncan wrote:
2 > Daniel Iliev <danny@××××××××.com> posted 451A110B.2080401@××××××××.com,
3 > excerpted below, on Wed, 27 Sep 2006 08:50:03 +0300:
4 >
5 >
6 >> So let me start with 2 newbie questions caused by my first impressions
7 >> from the x86_64 world:
8 >>
9 >> 1) I use CFLAGS="-march=athlon64 -mfpmath=sse -msse -msse2 -msse3
10 >> -m3dnow -mmmx -O3 -fomit-frame-pointer -pipe -fpic". Portage complains
11 >> with *red letters* about the fpic flag. Every time I emerge something it
12 >> says that "fpic breaks things", but I haven't met a single breakage so
13 >> far. Is that a bug? Actually there was an ebuild which could not be
14 >> compiled if mysql was compiled w/o "fpic". I'm not 100% sure but AFAIR
15 >> it was dev-perl/DBD-mysql.
16 >>
17 >> 2) I see too many flags that are disabled by the profile - the kind with
18 >> the parenthesis around them, like "(-3dnow)". Why? As I mentioned above
19 >> I enable some of these through my CFLAGS - e.g. (-mmx), (-mmxext),
20 >> (-sse) and (-sse2) and everything works perfectly.
21 >>
22 >
23 > It seems that you missed some of the Gentoo/AMD64 documentation.
24 > Many/most of your questions are answered there. Unfortunately, I'm not
25 > aware of a simple easy to use list of everything in one spot, so it's
26 > reading a bit of documentation here, a bit more there, etc.
27 >
28 > The main Gentoo/AMD64 project page. (This would be the logical place for
29 > such a list, but it's mostly a project page. It links some of the docs,
30 > tho those links aren't as easy to find as they could be.)
31 > http://amd64.gentoo.org
32 >
33 > Gentoo/AMD64 FAQ:
34 > http://www.gentoo.org/doc/en/gentoo-amd64-faq.xml
35 >
36 > Gentoo/AMD64 HOWTOs. (There's one on -fPIC here, tho the explanation is
37 > a bit developer-centric.)
38 > http://www.gentoo.org/proj/en/base/amd64/howtos/index.xml
39 >
40 > A brief direct answer to your questions follows:
41 >
42 > * The sse etc CFLAGS are arch dependent. Unlike x86 where the
43 > mmx/sse/other-extensions instructions were added as the arch matured, on
44 > amd64, they are part of the definition of the arch itself. All x86_64
45 > (amd64) CPUs will have mmx/sse/sse2, etc. Thus, -march=athlon64 already
46 > tells gcc these are available to use where it wants/needs to, and the
47 > others add nothing new (except -msse3: SSE3 isn't part of the base arch).
48 >
49 > * -fomit-frame-pointer isn't needed on 64-bit amd64 either, as it's turned
50 > on for all -O levels on archs (including amd64) where doing so doesn't
51 > interfere with debugging. (See the gcc manpage, under -O optimization.)
52 > You may wish to continue to specify it for stuff that's compiled for
53 > 32-bit, however, including parts of gcc, a version of glibc, a version of
54 > the (portage) sandbox library, etc.
55 >
56 > * Generally speaking, -fPIC is required on amd64 for ALL LIBRARIES but the
57 > ebuilds normally take care of it. Under certain circumstances (like
58 > unsupported CFLAGS), the configure scripts will turn it off by mistake (see
59 > the above-mentioned -fPIC HOWTO link for details). But the solution isn't
60 > to add it to your CFLAGS, as that means it will be used for executable
61 > applications as well as libraries, and /some/ applications /do/ break with
62 > it. Not many, but some, and if it's in your CFLAGS, you WILL have bugs
63 > you file closed as INVALID or the like, due to CFLAG abuse. If there's
64 > something not working without it, then THAT'S a bug and should be filed as
65 > such (unless it's due to use of CFLAGS gcc doesn't support and warns
66 > about, thus triggering the configure script detection problem discussed
67 > above and in the HOWTO).
68 >
69 > * The profile "disabled" USE flags are simply hard-locked either on or
70 > off by the profile, so aren't a USE flag option. It does NOT mean whatever
71 > the USE flag controls is actually disabled. Sometimes, as with the
72 > multilib USE flag, it can mean it's /enabled/. It just means that the
73 > profile is set up to control it, generally for a pretty good reason. In
74 > the particular cases you mention, the way Gentoo uses the SSE and similar
75 > USE flags is 32-bit specific, enabling 32-bit specific assembler code in
76 > the ebuild, for instance. As already mentioned, the AMD64 arch by
77 > definition already has these features activated, so no 64-bit USE flags
78 > are necessary, and enabling the 32-bit USE flags will cause breakage since
79 > it activates 32-bit specific code in many instances. Thus the amd64
80 > profiles have a /very/ good reason to hard-lock these USE flags "off". An
81 > example where a USE flag is hard-locked ON by a profile would be multilib.
82 > The normal AMD64 profiles are all multilib and thus lock this flag ON (tho
83 > it's still shown as disabled), while 64-bit-only profiles lock it OFF.
84 >
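Put together, the first three points above reduce the original poster's CFLAGS to something like the following make.conf fragment. This is a minimal sketch, not a line taken from the thread: the -O3 level is simply carried over from the original question, the CXXFLAGS line is the usual convention rather than something stated above, and -msse3 is dropped only on the assumption that it isn't wanted (GCC does not imply it with -march=athlon64).

```shell
# /etc/portage/make.conf (fragment, illustrative values only)
# -msse, -msse2, -mmmx and -m3dnow are already implied by -march=athlon64.
# -fomit-frame-pointer is already enabled at -O levels on amd64 (see the gcc manpage).
# -fpic is left to the ebuilds instead of being forced on every package.
CFLAGS="-march=athlon64 -O3 -pipe"
CXXFLAGS="${CFLAGS}"
```

Which -O level to use is a separate question and is left untouched here.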
85 > A couple of other notes:
86 >
87 > Portage now supports per-package CFLAGS and certain other variables as
88 > controlled by the environment (as long as they are used in an ebuild.sh
89 > phase, not the python phase, since execution is via a bashrc hook).
90 > Create /etc/portage/env/<category> as a directory, populated with package
91 > or package-version files. The contents of these files will be sourced
92 > into the ebuild.sh execution environment for every phase that uses
93 > ebuild.sh. CFLAGS and similar variables as found in these files REPLACE
94 > (that is, they don't add to, they replace entirely) the default make.conf
95 > CFLAGS. You can use this mechanism to specify specific CFLAGS for
96 > specific packages, and could thus set -fomit-frame-pointer and other
97 > 32-bit x86 specific CFLAGS here if desired, avoiding them in your regular
98 > make.conf.
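
The "replace, not add to" behaviour described above can be sketched in plain shell. This is only an illustration of the sourcing semantics under stated assumptions: a temporary file stands in for /etc/portage/env/<category>/<package>, and the flag values are made up for the example. It reflects the mechanism as described in this message, so check the documentation of your Portage version before relying on it.

```shell
# Global default, as it would come from make.conf (illustrative values).
CFLAGS="-march=athlon64 -O2 -pipe"

# Stand-in for a per-package file under /etc/portage/env/<category>/.
# A temp file is used so the sketch runs without touching /etc.
pkg_env="$(mktemp)"
cat > "$pkg_env" <<'EOF'
CFLAGS="-march=athlon64 -O2 -pipe -fomit-frame-pointer"
EOF

# Sourcing the file overwrites the variable; nothing is appended
# to the make.conf value automatically.
. "$pkg_env"
rm -f "$pkg_env"

echo "$CFLAGS"
```

Because the value is replaced wholesale, a per-package file has to repeat every global flag it still wants, as the example does with -march, -O2 and -pipe.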
99 >
100 > You may wish to read a bit of the archives for this list, in particular,
101 > the recent threads on gcc 4.1.1 CFLAGS, where I discuss mine.
102 > Specifically, it's likely -O3 actually performs /worse/ in many
103 > instances than -O2 or even -Os (my choice). The reasoning is this: CPU
104 > cycles are fairly cheap in a modern processor, while the expense of
105 > waiting on main memory in the case of a cache miss is MUCH HIGHER, due to
106 > the fact that main memory is clocked so much slower than cache. Smaller
107 > code fits in cache better and is thus often faster than larger code, even
108 > when the smaller code isn't as theoretically CPU cycle efficient. While
109 > there will certainly be some applications where -O3 is beneficial, I
110 > believe if you do actual comparisons, you will find -O2 or -Os faster on a
111 > system-wide basis. Of course, it's up to you and much virtual ink has
112 > been spilled discussing this issue, but that's just my take on things. If
113 > you've actually done speed comparisons on AMD64 or can point to some, I'd
114 > certainly be interested, as I've honestly not cared enough about it to do
115 > my own, but that's my general take in the absence of specific hard data to
116 > the contrary. Rather than optimizing for CPU cycles (-O3), I choose to
117 > optimize for better register usage (registers being at full CPU speed,
118 > therefore faster even than L1 cache: -frename-registers, etc.), size
119 > (-Os, disabling loop unrolling), whole and multiple unit optimization
120 > (-funit-at-a-time, -combine) and hot/cold partitioning
121 > (-freorder-blocks-and-partition, tho it can't be used on C++ code, etc). A
122 > few of my flags fail on a very few specific packages, another use for the
123 > package specific CFLAGS stuff above.
124 >
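For orientation, the flags named in the paragraph above would combine into a CFLAGS line roughly like the one below. This is a reconstruction from the flags mentioned in the message, not Duncan's actual make.conf (which is not quoted here). The -march=athlon64 and -pipe entries are assumptions carried over from the earlier part of the thread, and the flags date from the gcc 4.1.1 era, so some may not exist in later GCC releases.

```shell
# Sketch only: size-oriented CFLAGS assembled from the flags named above.
# -Os                              optimize for size (no loop unrolling)
# -frename-registers               better register usage
# -funit-at-a-time -combine        whole and multiple unit optimization
# -freorder-blocks-and-partition   hot/cold partitioning
CFLAGS="-march=athlon64 -Os -pipe -frename-registers -funit-at-a-time -combine -freorder-blocks-and-partition"

# The message notes -freorder-blocks-and-partition can't be used on C++ code,
# so a CXXFLAGS line would need to leave at least that flag out.
```

Packages that fail with one of these flags are candidates for a per-package override.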
125 >
126 Very detailed answer! Thank you!
127
128 Yes, you are right. I have missed the "AMD64 HowTo" documentation. I
129 found only the FAQ via Google.
130 It was the easiest (fastest) way to get some answers. ;-)
131
132 Thank you all.