Beso <givemesugarr@...> posted
d257c3560802060159q3a5b0334gd40b5d8b25aa348f@..., excerpted
below, on Wed, 06 Feb 2008 09:59:29 +0000:
>> CFLAGS="-march=athlon64 -O2 -pipe"
>
>
> i suggest this flags instead of yours usually they're a little more
> processor based and are not experimental ones as usually they say about
> them:
> CFLAGS="-Os -march=k8 -mno-tls-direct-seg-refs -mmmx -msse3 -pipe
> -fomit-frame-pointer"
-fomit-frame-pointer is the default for -O and above on amd64/x86_64, as
it doesn't interfere with debugging here as it does on 32-bit x86, so
omitting it saves on complexity without changing the compiled result. Of
course, it's still useful for 32-bit, but there are only a very few
programs that portage compiles as 32-bit on amd64 anyway (and even there,
CFLAGS aren't really used except for sandbox, because gcc does its own
bootstrapping and glibc is considered critical enough that it basically
sets its own very strict CFLAGS in the ebuild).
I used to use -Os, for reasons explained in detail in a post to the list
some time ago. Briefly, cache size restrictions tend to matter more
than pure cycle optimization on modern processors, so -Os generally made
more sense to me than -O2. However, with gcc-4.2, several optimizations
that used to be in -Os only made it into -O2, and -O2 now includes a
couple of cache optimizers that raw size optimization misses, so it's not
as critical now. Given that -O2 tends to be the recommended level for
Gentoo and is more widely tested than -Os, I've switched back to it.
-mmmx is included in the default for -march=k8, so that should be
superfluous. However, -msse3 is NOT included, as the original amd chips
didn't have it. Newer amd CPUs DO include sse3 (look for pni in the
flags line(s) of /proc/cpuinfo), so unlike -mmmx it's worth specifying
-msse3 if your CPU has it.
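For anyone who wants to script that check, here's a rough sketch. The
helper name has_cpu_flag is my own invention, and the optional second
argument is only there so it can be pointed at a saved copy of cpuinfo
instead of the live file.

```shell
# Hypothetical helper: succeed if the named flag appears on a "flags" line.
# $1 = flag to look for (e.g. pni), $2 = cpuinfo file (default /proc/cpuinfo)
has_cpu_flag() {
    grep -sE '^flags[[:space:]]*:' "${2:-/proc/cpuinfo}" | grep -qw -- "$1"
}

if has_cpu_flag pni; then
    echo "pni found: -msse3 is worth adding"
else
    echo "no pni flag found: leave -msse3 out"
fi
```

The -w on the second grep matches whole words only, so a flag name that
merely contains "pni" as a substring wouldn't give a false positive.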
What about -mno-tls-direct-seg-refs? Why do you use that? The gcc
manpage doesn't have enough info to convince me it's useful, and in fact
implies the reverse, since it makes direct seg refs the default for
glibc systems, and presumably there's a reason for that default. I'd
therefore love to see a discussion of the flag with enough information
to justify a better informed decision, as yours presumably is given your
depth of knowledge and activity on this list.
FWIW, here's my CFLAGS. CXXFLAGS are similar, minus -combine (mentioned
below) and -freorder-blocks-and-partition (for similar reasons).
-combine and -ftree-vectorize still cause occasional problems, which I
avoid on a case by case basis as I encounter them, with an appropriate
entry in /etc/portage/env/<cat>/<pkg>, but in general, these flags have
worked well for me for some time. I can explain why I use each one if
anyone wants to know.
CFLAGS="-march=k8 -msse3 -O2 -pipe -frename-registers -fweb
-ftree-vectorize -freorder-blocks-and-partition -combine -fgcse-sm
-fgcse-las -fgcse-after-reload -fmerge-all-constants"
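For those who haven't used the per-package override: as far as I know
the file under /etc/portage/env/<cat>/<pkg> is sourced as shell in the
ebuild environment, so a troublesome flag can be filtered out there. A
minimal sketch, assuming a hypothetical package that chokes on
-ftree-vectorize and -combine. The first assignment is only a shortened
stand-in so the snippet runs on its own; in the real file CFLAGS would
already be set from make.conf.

```shell
# Stand-in for the value inherited from make.conf (shortened here).
CFLAGS="-march=k8 -msse3 -O2 -pipe -ftree-vectorize -combine"

# What would go in the per-package file: rebuild CFLAGS without the
# flags this particular package can't handle.
filtered=""
for flag in $CFLAGS; do
    case "$flag" in
        -ftree-vectorize|-combine) ;;                  # drop these two
        *) filtered="${filtered:+$filtered }$flag" ;;  # keep everything else
    esac
done
CFLAGS="$filtered"

echo "$CFLAGS"
```

Filtering word by word rather than pattern-replacing the whole string
avoids accidentally chopping a longer flag that happens to contain the
same text.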
>> CXXFLAGS="-march=athlon64 -O2 -pipe"
>
> this is the same and is faster to write and to have it updated.
> CXXFLAGS="${CFLAGS}"
The exception would be if your CFLAGS include things like -combine, which
doesn't work so well on C++ and so shouldn't be in CXXFLAGS. (For -combine
specifically, gcc ignores it in such cases but spits out a warning, which
various configure scripts interpret as a problem, causing them to screw
up, so it's best to leave it out of CXXFLAGS entirely.)
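One way to keep the two in sync without copying -combine across is to
put the shared flags in a helper variable. This is only a sketch:
COMMON_FLAGS is a name I've made up, not something portage itself reads,
and the flag list is shortened. As far as I know make.conf handles
simple ${VAR} expansion like this but not full bash substitution.

```shell
# COMMON_FLAGS is a hypothetical helper name holding the shared flags.
COMMON_FLAGS="-march=k8 -msse3 -O2 -pipe"

# C gets the shared flags plus the C-only extra.
CFLAGS="${COMMON_FLAGS} -combine"

# C++ gets only the shared flags, so -combine never reaches g++.
CXXFLAGS="${COMMON_FLAGS}"
```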
> also add:
> LDFLAGS="-Wl,--as-needed,-O1 -Wl,--enable-new-dtags -Wl,--sort-common
> -s"
Could you point me to documentation on LDFLAGS in general, or at least
--enable-new-dtags and --sort-common? I use --as-needed already
( documentation at http://www.gentoo.org/proj/en/qa/asneeded.xml ), as
well as -z,now (security thing as mentioned in portage's QA warnings if
you have them enabled for setuid/setgid applications, but I find it
useful in general, and no, it doesn't simply counteract --as-needed), but
have yet to find a resource even close to as helpful for LDFLAGS in
general as "man gcc" is for CFLAGS, so I'm left asking about them one at
a time as I see them, and I've not seen those yet. Again, a discussion
with enough info to make an informed decision would be invaluable! I
don't even know what the options are at this point, and that's
frustrating!
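For reference, the two linker options I do use would be spelled like
this in make.conf. It's a sketch of the form only; -Wl, is simply how
gcc hands an option through to the linker, with commas standing in for
spaces.

```shell
# --as-needed and -z now are the two options mentioned above.
LDFLAGS="-Wl,--as-needed -Wl,-z,now"
```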
>> FEATURES="ccache distlocks fixpackages metadata-transfer parallel-fetch
>> sandbox sfperms strict unmerge-orphans userfetch"
>
> if you experience collision problems add collision-protect to the
> features. you should have some better protection between packages and
> should avoid packages from writing files owned by other packages. this
> could provoke some more emerge errors than before because there are
> quite some packages that overwrite files
Agreed. It's worth mentioning COLLISION_IGNORE, which you can set in
make.conf to ignore specific known collisions, especially ones known to
be "safe": files such as icons, where it really doesn't matter if a
couple of packages are fighting over them. (To use an example from the
recent KDE4-svn overlay work, upstream was changing icons fast enough to
make it not worth filing bugs with Gentoo over.) Differing config files,
OTOH, you probably want to know about!
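In make.conf that combination might look something like the following.
Treat it as a sketch: the FEATURES line is the one quoted above with
collision-protect appended, and the icon directory is a made-up example
path, not a recommendation.

```shell
# The FEATURES line quoted above, with collision-protect added at the end.
FEATURES="ccache distlocks fixpackages metadata-transfer parallel-fetch sandbox sfperms strict unmerge-orphans userfetch collision-protect"

# Hypothetical example: don't complain about collisions under this directory.
COLLISION_IGNORE="/usr/share/icons"
```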
>> MAKEOPTS="-j3"
>
>
> set MAKEOPTS="-j3" to MAKEOPTS="-j5 -s ". if you experience compile
> problem decrease the j number to 1.
Note that there are occasional parallel make issues with -jX, where X>1
or is omitted (-j by itself indicates unlimited jobs). These don't
occur frequently at -j2, as it's so heavily used by Gentooers everywhere
that issues occurring there tend to have been eliminated already. However,
as the number of jobs increases, so does the likelihood of running into
parallel make issues. Make errors indicating file or directory not found
are the classic example, so any time you see one of those, try
MAKEOPTS=-j1 emerge <whatever> and you'll likely find it gone. If so,
file a bug with [parallel make] as well as the package name in the
subject line (assuming there's not yet one filed), and help get it fixed!
=8^)
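That retry is easy enough to wrap up if you do it often. A rough sketch
follows; retry_serial is a hypothetical name of my own, and all it does
is rerun whatever command it was given with MAKEOPTS=-j1 in the
environment if the first attempt fails.

```shell
# Hypothetical wrapper: run a build command as given; if it fails, rerun it
# once with MAKEOPTS=-j1 to see whether parallel make was the culprit.
retry_serial() {
    if "$@"; then
        return 0
    fi
    echo "build failed; retrying with MAKEOPTS=-j1" >&2
    if MAKEOPTS="-j1" "$@"; then
        echo "built at -j1: likely a parallel make bug, worth filing" >&2
        return 0
    fi
    # Failed both ways, so the problem is probably something else.
    return 1
}

# Usage, with the same placeholder as above:
# retry_serial emerge <whatever>
```

A failure at -j1 as well suggests the problem isn't parallel make at
all, which is why the wrapper reports failure in that case rather than
retrying again.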
FWIW, I run MAKEOPTS="-j20 -l12" (up to 20 jobs, but don't start any more
if the load is above 12) here. Dual-dual-core Opteron 290s, 8 gig RAM,
PORTAGE_TMPDIR pointing at a tmpfs so all those temp files don't hit
disk. With PORTAGE_NICENESS=19, and with kernel 2.6.24's new per-user
scheduling turned on (and with Hz=300, voluntary preemption, so it's not
a special real-time kernel or anything by any stretch), I can still
listen to streaming audio without pauses and even run a moderately sized
visualization window (amarok with the scrolling voice-print analyzer
going in its own window) without undue jerkiness. There's a bit, but
under a load of 3 per core and with no real-time preemption, one might
expect that. If I'm feeling adventurous and don't have anything else
critical going on, I'll occasionally try -j by itself, just to see how
high I can run the load average. A 400-500 load average is doable with
kernel compiles, for instance, and nicely entertaining! =8^) It still
amazes me how well the system copes with even that, and it was even more
amazing back with dual-single-cores (242s)!
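Pulled together, the relevant make.conf lines would look roughly like
this. The MAKEOPTS and PORTAGE_NICENESS values are the ones given above;
the PORTAGE_TMPDIR path is purely an assumption for illustration, so
point it at wherever your tmpfs is actually mounted.

```shell
# Up to 20 jobs, but start no new ones while the load average is above 12.
MAKEOPTS="-j20 -l12"

# Run builds at the lowest scheduling priority.
PORTAGE_NICENESS=19

# Assumption for illustration: /tmp is the tmpfs mount on this machine.
PORTAGE_TMPDIR="/tmp"
```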
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
--
gentoo-amd64@g.o mailing list