Gentoo Archives: gentoo-dev

From: Andrew Savchenko <bircoph@×××××.com>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Re: LTO use in the tree
Date: Mon, 28 Apr 2014 21:46:35
Message-Id: 20140429014609.6ef75d4c308e6e9f7e67d058@gmail.com
In Reply to: Re: [gentoo-dev] Re: LTO use in the tree by Rich Freeman
1 Hello,
2
3 On Sun, 27 Apr 2014 07:23:11 -0400 Rich Freeman wrote:
4 > And yet, in the same paragraph you mention -O3, which is
5 > tantamount to just setting a flag and walking away. That turns
6 > on 14 things you probably don't really need.
7
8 Why 14 things? According to the gcc-4.8.2 manual, -O3 enables the
9 following on top of -O2:
10 -finline-functions, -funswitch-loops, -fpredictive-commoning,
11 -fgcse-after-reload, -ftree-vectorize, -fvect-cost-model,
12 -ftree-partial-pre, -fipa-cp-clone.
13 Some of these options trigger others, but these 8 flags are
14 sufficient to mimic -O3 completely.
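
The flag list above can be checked against a locally installed compiler. The following is a minimal sketch, assuming a GCC that supports `-Q --help=optimizers` (documented in the GCC manual); the function names `enabled_flags` and `o_level_diff` and the `CC` override are hypothetical helpers introduced here for illustration, and the exact output will differ between GCC versions.

```shell
# Print the names of optimizer flags marked [enabled] in
# `gcc -Q --help=optimizers` output read from stdin (hypothetical helper).
enabled_flags() {
    LC_ALL=C awk '$2 == "[enabled]" { print $1 }' | LC_ALL=C sort
}

# Print flags that are enabled at level $2 but not at level $1.
# CC is an assumed override for the compiler binary; defaults to gcc.
o_level_diff() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    "${CC:-gcc}" -Q "$1" --help=optimizers | enabled_flags > "$tmp_a"
    "${CC:-gcc}" -Q "$2" --help=optimizers | enabled_flags > "$tmp_b"
    # comm -13 keeps only the lines unique to the second file.
    LC_ALL=C comm -13 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

Running `o_level_diff -O2 -O3` should then list what -O3 adds for that particular compiler, which may not match the gcc-4.8.2 list exactly.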
15
16 In my experience, only three of them are harmful:
17 -finline-functions and -fipa-cp-clone bloat code size significantly,
18 hurting performance due to more CPU cache misses.
19 -ftree-vectorize may be used on amd64 (the performance change is in
20 the range of -3% to +5%), but is a complete menace on x86: a lot of
21 ICEs, a lot of segfaults due to stack misalignment, and even some
22 code that builds and runs but is miscompiled. While some (but not
23 all) stack alignment issues may be fixed with -mstackrealign, doing
24 so turns the performance gain into a net loss.
25
26 All other -O3 options have either no effect or give measurable
27 performance improvements in the range of several percent.
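
As a rough illustration of what this selection could look like in practice, here is a sketch of a make.conf fragment that starts from -O2 and adds only the -O3 flags not named as harmful above. It assumes gcc-4.8.x flag names as listed in this message; -fvect-cost-model is left out on the assumption that it only tunes the vectorizer, which is not enabled here.

```shell
# Sketch only: -O2 plus the -O3 flags not reported as harmful above.
# Left out: -finline-functions, -fipa-cp-clone (code bloat),
# -ftree-vectorize (unsafe on x86), -fvect-cost-model (vectorizer tuning).
CFLAGS="-O2 -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-partial-pre"
CXXFLAGS="${CFLAGS}"
```

On amd64, -ftree-vectorize could be appended to this list, given the -3% to +5% range reported above.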
28
29 Tests were run on multimedia packages (mplayer, ffmpeg, x264)
30 and scientific ones (root, pythia, geant, blas libs).
31
32 Best regards,
33 Andrew Savchenko