Gentoo Archives: gentoo-dev

From:	John Nilsson <john@×××××××.nu>
To:	"Robin H. Johnson" <robbat2@g.o>
Cc:	John Nilsson <john@×××××××.nu>, Gentoo Developers <gentoo-dev@g.o>
Subject:	Re: [gentoo-dev] CFLAGS moved to ebuilds.
Date:	Wed, 10 Dec 2003 18:38:31
Message-Id:	`20031211003834.GC28666@newkid`
In Reply to:	Re: [gentoo-dev] CFLAGS moved to ebuilds. by "Robin H. Johnson"

1	On 12/10/03 02:15:02, Robin H. Johnson wrote:
2	> On Tue, Dec 09, 2003 at 11:17:23PM +0100, John Nilsson wrote:
3	> > >That's the express purpose that genflags was created for, to
4	> provide
5	> > >users with a known good set of high-performance CFLAGS so they
6	> didn't
7	> > >need to mess around with it too much.
8	> > Still there is no room for improvement when dealing with system
9	> wide
10	>
11	> > optimization.
12	> Your wording here is unclear, I think you mean to say that there IS
13	> room for
14	> improvement over system wide constant CFLAGS?
15
16	English is not my native language. Writing this GLEP made me realize I
17	have problems expressing my thoughts in ANY language =). If this GLEP
18	is going to survive, comments on formulations and wordings (and
19	spelling) are greatly appreciated.
20
21	Yes, I meant that it is next to impossible to find a system-wide
22	optimization beyond "-march=<arch> -O2 -pipe" that fits the majority of
23	users.
24
25
26	> > >strip-flags to remove problematic flags on a per ebuild basis is
27	> the
28	> > >best solution. I do agree that unstable gcc settings are a big
29	> > >problem, eg in a recent bug it turned out the submitter's system
30	> (an
31	> > >older Pentium I) couldn't handle -O3 without flaking out. Reduce
32	> it
33	> > >to -O2 and the box went fine (both for compiling and already
34	> compiled
35	> > >packages).
36	> > This is a bug in GCC. While a workaround may be a quick solution
37	> for
38	>
39	> > Gentoo, one shouldn't base the whole system on bugs.
40	> No it isn't a bug in GCC, it's a bug with the user's specific
41	> hardware.
42	> I have an older Pentium I system that runs just fine with -O3 and the
43	> user's specified CFLAGS. I didn't force everybody to use -O2, I just
44	> got
45	> that user to change his own system down to -O2.
46
47	I misunderstood you. Still, this is a bug in a specific CPU. You can't
48	guarantee stability in any case if the hardware is broken.
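The per-ebuild flag stripping discussed in the quoted text can be pictured with a small sketch. This is only an illustration in Python, not how Portage implements it: the helper `filter_flags` and the example rule are hypothetical, chosen to mirror the two cases mentioned in this thread (stepping -O3 down to -O2, and dropping -funroll-all-loops).

```python
def filter_flags(cflags, remove=(), replace=None):
    """Return cflags with every flag in `remove` dropped and `replace` substitutions applied."""
    replace = replace or {}
    result = []
    for flag in cflags.split():
        if flag in remove:
            continue  # problematic flag for this package: leave it out
        result.append(replace.get(flag, flag))
    return " ".join(result)


# Assumed system-wide setting, for illustration only.
global_cflags = "-march=athlon -O3 -pipe -funroll-all-loops"

# Hypothetical rule for one package: drop the memory-hungry flag, use -O2 instead of -O3.
print(filter_flags(global_cflags,
                   remove={"-funroll-all-loops"},
                   replace={"-O3": "-O2"}))
```

The system-wide setting stays untouched; only the flags handed to the one package change, which is the property both sides of the discussion seem to want.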
49
50	> > >again, genflags was created for this. I've considered a sequel to
51	> > >genflags based on the genetic optimization of compiler flags as
52	> > >mentioned on Slashdot a while ago, but for lack of time, i'm not
53	> even
54	> > >looking at doing it now.
55	> > You might want to check:
56	> > http://www.coyotegulch.com/potential/gccga/gccga.html
57	> This is the original item I was referencing, but you still run into
58	> the
59	> problem that you need to run things on a system basis to get
60	> effective
61	> results.
62
63	Yeah, I had the page open when I read your mail, so I thought I'd spare
64	you the trouble of looking it up =)
65
66	> http://www.coyotegulch.com/acovea/index.html is the rest of the
67	> article,
68	>
69	> > http://www.rocklinux.net/packages/ccbench.html
70	> This basically brute forces the genetic algorithms, with absolutely
71	> no
72	> thought as to the net effects on the results of the given flags, eg,
73	> on
74	> my home server (an AthlonXP 2400+), it returns these results:
75	> gcc -O3 -march=athlon -fomit-frame-pointer -funroll-loops
76	> -frerun-loop-opt -funroll-all-loops -fschedule-insns
77	>
78	> Of that, '-frerun-loop-opt' and '-fschedule-insns' are redundant as
79	> they
80	> are implied by -O3.
81	>
82	> -fomit-frame-pointer and I can't debug code properly anymore, and if
83	> I
84	> try to use -funroll-all-loops to compile mysql, even with its
85	> --with-low-memory option, gcc wants 600mb of memory to compile its
86	> sql_yacc.cc.
87
88	I had the same reaction. ccbench was what made me realize that any kind
89	of system-wide optimization is only guesswork (and often bad guesswork).
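The redundancy pointed out in the quoted ccbench result can be checked mechanically. The sketch below assumes a lookup table of flags implied by an optimization level; the only entries are the two flags the quoted message names as implied by -O3. The real set depends on the GCC version and target, so the table is an assumption, and `drop_implied` is a hypothetical helper.

```python
# Assumption: taken from the quoted message, not from a specific GCC release.
IMPLIED_BY = {
    "-O3": {"-frerun-loop-opt", "-fschedule-insns"},
}


def drop_implied(flags):
    """Remove flags that are already switched on by another flag in the list."""
    implied = set()
    for flag in flags:
        implied |= IMPLIED_BY.get(flag, set())
    return [flag for flag in flags if flag not in implied]


# The flag set reported in the quoted message.
ccbench_result = [
    "-O3", "-march=athlon", "-fomit-frame-pointer", "-funroll-loops",
    "-frerun-loop-opt", "-funroll-all-loops", "-fschedule-insns",
]
print(" ".join(drop_implied(ccbench_result)))
```

This only removes duplicates. It says nothing about whether the remaining flags are a good idea, which is the actual complaint about brute-force results.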
90
91
92
93	> > I meant by evolution: the process of users submiting patches to
94	> improve
95	> > individual ebuilds.
96	> What improves the performance of a given application on one machine
97	> does
98	> NOT necessarily improve it on another machine.
99
100	True, but you would be in a much better position to test that fact than
101	we are now.
102
103
104	> Read the gcc manpage and see:
105	> -fprofile-arcs
106	> -fbranch-probabilities
107	> (also read http://gcc.gnu.org/news/profiledriven.html)
108	>
109	> Just adding these to ccbench doubles the amount of time taken to
110	> test (as you must compile with -fprofile-arcs, run, compile with
111	> -fbranch-probabilities, run again). It also provides some extremely
112	> interesting and varying results. The bubblesort test for example,
113	> improves between +15% and +300% depending on the other compiler
114	> flags.
115	> Towers of Hanoi goes from -20% to +50%.
116	>
117	> If users submitted _good_ non-interactive testcases for every ebuild,
118	> it
119	> wouldn't be difficult to apply -fprofile-arcs/branch-probabilities and
120	> or
121	> acovea to most packages at all, apart from the massive increase in
122	> compile time.
123
124	Couldn't one save the profile data in the Portage tree once a generic
125	use case was found?
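The two-pass cycle described in the quoted text (compile with -fprofile-arcs, run, compile with -fbranch-probabilities, run again) can be written down as a list of commands. The sketch below only builds the command lines and does not run gcc; the file names and base flags are made up for illustration, and `profile_build_steps` is a hypothetical helper.

```python
def profile_build_steps(source, binary, base_flags):
    """Return the four commands of the two-pass profile cycle as argument lists."""
    # Pass 1: instrumented build, which records arc counts when the binary runs.
    instrumented = ["gcc", *base_flags, "-fprofile-arcs", "-o", binary, source]
    # Pass 2: rebuild using the recorded counts to guide optimization.
    optimized = ["gcc", *base_flags, "-fbranch-probabilities", "-o", binary, source]
    run = ["./" + binary]
    return [instrumented, run, optimized, run]


# Hypothetical test program and flags.
for step in profile_build_steps("bubblesort.c", "bubblesort", ["-O2", "-march=athlon"]):
    print(" ".join(step))
```

Shipping saved profile data would amount to skipping the first two steps on the user's machine. Whether data recorded on one system is valid for a build with different flags or a different compiler version is an open question here, not something this sketch answers.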
126
127
128	> > >Stable and high-performance is an per-system definition, as
129	> evidenced
130	> > >by the bug I mentioned with -O3.
131	> > And should as such be fixed... in gcc. If gcc can't optimize correctly
132	>
133	> > knowing the cache size of the cpu, gcc is broken. Fix gcc.
134	> Again, it isn't a gcc bug, it's an issue with a specific machine (not
135	> even a class of systems or cpus).
136	>
137	> Let's take a tangent on this whole issue for a moment. Ignoring the
138	> implementation concerns, the end goal of your GLEP is this:
139	> The basic gain you want, is for the support of per-package CFLAG
140	> modifications (inside the ebuilds), for the purpose of performance
141	> optimization.
142	>
143	> Do I have this correct?
144
145	Yes, please ignore the implementation details. They were only provided as an
146	alternative example scenario and are very open for discussion =)
147
148	The goal is not the speed as such, but the testability of it. I want to
149	move from the current situation, where you have absolutely no knowledge
150	of the optimization results, to a situation where you would actually be
151	able to give evidence of improvements or the reverse.
152
153	Reusability of CFLAGS, if you wish =)
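What "evidence of improvements or the reverse" might look like in practice: time the same test case under each flag set and report the change against a baseline. The sketch below is one possible way to express that. The timings are invented for illustration and are not measurements from this thread, and `percent_change` is a hypothetical helper.

```python
def percent_change(baseline_seconds, candidate_seconds):
    """Speed change relative to the baseline; positive means the candidate is faster."""
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0


# Made-up run times in seconds for one test case, keyed by the flags used.
timings = {
    "-O2": 10.0,
    "-O3 -funroll-loops": 8.0,
    "-O3 -funroll-all-loops": 12.5,
}

baseline = timings["-O2"]
for flags, seconds in timings.items():
    print(f"{flags}: {percent_change(baseline, seconds):+.1f}%")
```

A table like this, recorded per package and per machine, is the kind of record that could be attached to a patch proposing package-specific flags.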
154
155
156
157	/John
