Gentoo Archives: gentoo-dev

From: John Nilsson <john@×××××××.nu>
To: "Robin H. Johnson" <robbat2@g.o>
Cc: John Nilsson <john@×××××××.nu>, Gentoo Developers <gentoo-dev@g.o>
Subject: Re: [gentoo-dev] CFLAGS moved to ebuilds.
Date: Wed, 10 Dec 2003 18:38:31
Message-Id: 20031211003834.GC28666@newkid
In Reply to: Re: [gentoo-dev] CFLAGS moved to ebuilds. by "Robin H. Johnson"
1 On 12/10/03 02:15:02, Robin H. Johnson wrote:
2 > On Tue, Dec 09, 2003 at 11:17:23PM +0100, John Nilsson wrote:
3 > > >That's the express purpose that genflags was created for, to
4 > provide
5 > > >users with a known good set of high-performance CFLAGS so they
6 > didn't
7 > > >need to mess around with it too much.
8 > > Still there is no room for improvement when dealing with system
9 > wide
10 >
11 > > optimization.
12 > Your wording here is unclear, I think you mean to say that there IS
13 > room for
14 > improvement over system wide constant CFLAGS?
15
16 English is not my native language. Writing this GLEP made me realize I
17 have problems expressing my thoughts in ANY language =). If this GLEP
18 is going to survive, comments on formulations and wordings (and
19 spelling) are greatly appreciated.
20
21 Yes, I meant that it is next to impossible to find a system-wide
22 optimization beyond "-march=<arch> -O2 -pipe" that fits the majority of
23 users.
24
25
26 > > >strip-flags to remove problematic flags on a per ebuild basis is
27 > the
28 > > >best solution. I do agree that unstable gcc settings are a big
29 > > >problem, eg in a recent bug it turned out the submitter's system
30 > (an
31 > > >older Pentium I) couldn't handle -O3 without flaking out. Reduce
32 > it
33 > > >to -O2 and the box went fine (both for compiling and already
34 > compiled
35 > > >packages).
36 > > This is a bug in GCC. While a workaround may be a quick solution
37 > for
38 >
39 > > Gentoo, one shouldn't base the whole system on bugs.
40 > No it isn't a bug in GCC, it's a bug with the user's specific
41 > hardware.
42 > I have an older Pentium I system that runs just fine with -O3 and the
43 > user's specified CFLAGS. I didn't force everybody to use -O2, I just
44 > got
45 > that user to change his own system down to -O2.
46
47 I misunderstood you. Still, this is a bug in a specific CPU. You can't
48 guarantee stability in any case if the hardware is broken.
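As a rough illustration of the per-ebuild flag filtering discussed in the quote above, here is a small Python sketch that removes problematic flags from a CFLAGS string and lowers -O3 to -O2. It is not how Portage or strip-flags is implemented; the function names `filter_flags` and `downgrade_o3` and the example flag strings are assumptions made purely for illustration.

```python
def filter_flags(cflags, unwanted):
    """Return cflags with every flag listed in `unwanted` removed.

    Hypothetical helper for illustration only; not Portage code.
    """
    blocked = set(unwanted)
    kept = [flag for flag in cflags.split() if flag not in blocked]
    return " ".join(kept)


def downgrade_o3(cflags):
    """Replace -O3 with -O2, mirroring the workaround from the bug report."""
    return " ".join("-O2" if flag == "-O3" else flag for flag in cflags.split())


if __name__ == "__main__":
    # Example user setting (made up for this sketch).
    user_cflags = "-march=pentium -O3 -pipe -funroll-all-loops"
    print(filter_flags(downgrade_o3(user_cflags), ["-funroll-all-loops"]))
```

The point of keeping this per package is that the blocked list can differ for each ebuild while the user's global setting stays untouched.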
49
50 > > >again, genflags was created for this. I've considered a sequel to
51 > > >genflags based on the genetic optimization of compiler flags as
52 > > >mentioned on Slashdot a while ago, but for lack of time, I'm not
53 > even
54 > > >looking at doing it now.
55 > > You might want to check:
56 > > http://www.coyotegulch.com/potential/gccga/gccga.html
57 > This is the original item I was referencing, but you still run into
58 > the
59 > problem that you need to run things on a system basis to get
60 > effective
61 > results.
62
63 Yeah, I had the page open when I read your mail so I thought I'd spare
64 you the trouble of looking it up =)
65
66 > http://www.coyotegulch.com/acovea/index.html is the rest of the
67 > article,
68 >
69 > > http://www.rocklinux.net/packages/ccbench.html
70 > This basically brute forces the genetic algorithms, with absolutely
71 > no
72 > thought as to the net effects on the results of the given flags, eg,
73 > on
74 > my home server (an AthlonXP 2400+), it returns these results:
75 > gcc -O3 -march=athlon -fomit-frame-pointer -funroll-loops
76 > -frerun-loop-opt -funroll-all-loops -fschedule-insns
77 >
78 > Of that, '-frerun-loop-opt' and '-fschedule-insns' are redundant as
79 > they
80 > are implied by -O3.
81 >
82 > -fomit-frame-pointer and I can't debug code properly anymore, and if
83 > I
84 > try to use -funroll-all-loops to compile mysql, even with its
85 > --with-low-memory option, gcc wants 600 MB of memory to compile its
86 > sql_yacc.cc.
87
88 I had the same reaction. ccbench was what made me realize that any kind
89 of system-wide optimization is only guesswork (and often bad guesswork).
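To make the redundancy point from the quote concrete, here is a minimal Python sketch that drops flags already implied by an optimization level. The table contains only the two flags the quoted message names as implied by -O3; it is an illustrative assumption, not a complete or version-accurate list for any gcc release, and `drop_redundant` is a made-up name.

```python
# Flags the quoted message describes as already implied by -O3.
# Assumption: illustrative table only, not checked against a gcc version.
IMPLIED_BY = {
    "-O3": {"-frerun-loop-opt", "-fschedule-insns"},
}


def drop_redundant(flags):
    """Remove flags that another flag on the same line already implies."""
    tokens = flags.split()
    implied = set()
    for token in tokens:
        implied |= IMPLIED_BY.get(token, set())
    return " ".join(token for token in tokens if token not in implied)


if __name__ == "__main__":
    # The ccbench result quoted in the message above.
    ccbench_result = (
        "-O3 -march=athlon -fomit-frame-pointer -funroll-loops "
        "-frerun-loop-opt -funroll-all-loops -fschedule-insns"
    )
    print(drop_redundant(ccbench_result))
```

This only cleans up the flag line; it says nothing about whether the remaining flags are a good idea for a given package.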
90
91
92
93 > > I meant by evolution: the process of users submitting patches to
94 > improve
95 > > individual ebuilds.
96 > What improves the performance of a given application on one machine
97 > does
98 > NOT necessarily improve it on another machine.
99
100 True, but you would be in a much better position to test that fact than
101 we are now.
102
103
104 > Read the gcc manpage and see:
105 > -fprofile-arcs
106 > -fbranch-probabilities
107 > (also read http://gcc.gnu.org/news/profiledriven.html)
108 >
109 > Just adding these to ccbench doubles the amount of time taken to
110 > test (as you must compile with -fprofile-arcs, run, compile with
111 > -fbranch-probabilities, run again). It also provides some extremely
112 > interesting and varying results. The bubblesort test for example,
113 > improves between +15% and +300% depending on the other compiler
114 > flags.
115 > Towers of Hanoi goes from -20% to +50%.
116 >
117 > If users submitted _good_ non-interactive testcases for every ebuild,
118 > it
119 > wouldn't be difficult to apply -fprofile-arcs/branch-probabilities and
120 > or
121 > acovea to most packages at all, apart from the massive increase in
122 > compile time.
123
124 Couldn't one save the profile data in the portage tree once a generic
125 use case was found?
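The two-pass procedure described in the quote (compile with -fprofile-arcs, run, compile with -fbranch-probabilities, run again) could be sketched like this in Python. The function only builds the command lines and does not execute them; `pgo_steps`, the file names and the base flags are hypothetical, and a real ebuild would need a representative, non-interactive test case for the run steps.

```python
def pgo_steps(source, binary, base_flags, run_args=()):
    """Return the four commands of a two-pass profile-driven build.

    Hypothetical sketch: compile instrumented, run, recompile using the
    collected profile, run again. Nothing is executed here.
    """
    compile_cmd = ["gcc"] + list(base_flags)
    run_cmd = ["./" + binary] + list(run_args)
    return [
        compile_cmd + ["-fprofile-arcs", "-o", binary, source],
        list(run_cmd),
        compile_cmd + ["-fbranch-probabilities", "-o", binary, source],
        list(run_cmd),
    ]


if __name__ == "__main__":
    # File names and flags below are made up for the example.
    for step in pgo_steps("bench.c", "bench", ["-O2", "-pipe"]):
        print(" ".join(step))
```

If profile data from the first run were shipped with the tree, the first two steps could in principle be skipped on the user's machine, which is what the question above is getting at.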
126
127
128 > > >Stable and high-performance is an per-system definition, as
129 > evidenced
130 > > >by the bug I mentioned with -O3.
131 > > And should as such be fixed... in gcc. If gcc can't optimize correctly
132 >
133 > > knowing the cache size of the cpu, gcc is broken. Fix gcc.
134 > Again, it isn't a gcc bug, it's an issue with a specific machine (not
135 > even a class of systems or cpus).
136 >
137 > Let's take a tangent on this whole issue for a moment. Ignoring the
138 > implementation concerns, the end goal of your GLEP is this:
139 > The basic gain you want, is for the support of per-package CFLAG
140 > modifications (inside the ebuilds), for the purpose of performance
141 > optimization.
142 >
143 > Do I have this correct?
144
145 Yes, please ignore the implementation details; they were only provided as
146 an alternative example scenario, very open for discussion =)
147
148 The goal is not the speed as such, but the testability of it. I want to
149 move from the current situation, where you have absolutely no knowledge
150 of the optimization results, to a situation where you would actually be
151 able to give evidence of improvements or regressions.
152
153 Reusability of cflags if you wish =)
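To show what "evidence of improvements" might look like, here is a small Python sketch that compares measured run times for different flag sets against a baseline. Everything in it is an assumption for illustration: the function names, the use of the median, and the timings in the example, which are invented and not real measurements.

```python
from statistics import median


def percent_improvement(baseline_times, candidate_times):
    """Speed change of candidate versus baseline, in percent.

    Positive means the candidate ran faster; negative means slower.
    """
    base = median(baseline_times)
    cand = median(candidate_times)
    return (base / cand - 1.0) * 100.0


def rank_flag_sets(results, baseline):
    """Sort flag sets by improvement over the baseline, best first."""
    base_times = results[baseline]
    scored = {
        flags: percent_improvement(base_times, times)
        for flags, times in results.items()
        if flags != baseline
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Invented timings in seconds, only to show the shape of the data.
    timings = {
        "-O2": [2.0, 2.1, 2.0],
        "-O3": [1.9, 2.0, 1.9],
        "-O3 -funroll-all-loops": [2.4, 2.5, 2.4],
    }
    for flags, change in rank_flag_sets(timings, "-O2"):
        print(f"{flags}: {change:+.1f}%")
```

Results like these, submitted together with the test case that produced them, are the kind of thing that could be reviewed and reused per ebuild.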
154
155
156
157 /John