Re: [gentoo-dev] "Distro Day (Measuring the benefits of the Gentoo approach)" - gentoo-dev

From:	Chris Gianelloni <wolf31o2@g.o>
To:	billk@×××××××××.au
Cc:	frlinux@g.o, gentoo-dev ML <gentoo-dev@g.o>
Subject:	Re: [gentoo-dev] "Distro Day (Measuring the benefits of the Gentoo approach)"
Date:	Thu, 14 Aug 2003 12:27:01
Message-Id:	`1060864255.19236.147.camel@vertigo`
In Reply to:	Re: [gentoo-dev] "Distro Day (Measuring the benefits of the Gentoo approach)" by William Kenworthy

On Wed, 2003-08-13 at 18:49, William Kenworthy wrote:
> I'll stick my hand up and say I was the person who installed gentoo for
> this test.  For those who made the previous posts (mostly crap, and who
> dont seem to have read the article very well - though it could have been
> more informative), perhaps a few facts may help:
>
> 1. was fully bootstrapped and compiled as stage 1/2/3 on the machine -
> not a binary install

Great.  I read the article and found no mention of the USE flags
employed.  Honestly, I think you should have posted everything you
changed.

> 2. gentoo-sources 2.4.20 was used - Mandrake came with a newer kernel
> than gentoo's reccomended one (still does), debian was a dogs breakfast
> because stable is so old.  We actually tried to put the gentoo kernel on
> mandrake/debian when tracking down the ide cable prob, but got too hard
> - not the way some posts tried to imply)

Were preemption and low latency turned on?  Was the kernel compiled with
the >gcc31 selection for the CPU?  Better yet, why not post the .config
from the 3 kernels?
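
As a rough sketch of why the three .config files would help: with them
posted, anyone could list exactly which kernel options differ.  The
function names and the sample option values below are illustrative
assumptions, not taken from the article or from any of the tested
machines.

```python
def parse_kernel_config(text):
    """Return {option: value} for every CONFIG_ line in a kernel .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            # Disabled options appear as comments: "# CONFIG_FOO is not set"
            name = line[2:-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_kernel_configs(configs):
    """Map each option that differs between configs to its per-config value.

    `configs` maps a label (for example a distribution name) to the text
    of its .config.  Options missing from a config are reported as None.
    """
    parsed = {label: parse_kernel_config(text)
              for label, text in configs.items()}
    all_options = set().union(*parsed.values())
    differences = {}
    for option in sorted(all_options):
        values = {label: opts.get(option) for label, opts in parsed.items()}
        if len(set(values.values())) > 1:
            differences[option] = values
    return differences


# Made-up sample contents, only to show the shape of the output.
sample = {
    "gentoo": "CONFIG_SMP=y\nCONFIG_PREEMPT=y\n",
    "debian": "CONFIG_SMP=y\n# CONFIG_PREEMPT is not set\n",
}
print(diff_kernel_configs(sample))
```

Options that are identical everywhere are left out, so the output stays
short enough to paste into a reply.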

> 3. optimisations were EXACTLY as recommended by both the make.conf
> entries, which were supported by the cflags from the forum for this cpu:
> a 2G celery (P4 based core)  I am not sure now, but I believe I ran
> prelink as well (to match mandrake) - need to find and check the notes.
> 4. Gnumerics problems have been identified and come down to the
> particular version - is fixed in the upcoming stable release even before
> this was found, but the project was unaware that what they believed was
> a slightly slower mod in this version, could be so bad on particular
> data sets - i.e., 30 odd mins in 1.0.13, but is less that 30s on 1.0.19
> on my laptop

I hope you only used optimizations listed in the forums for the actual
version of GCC you were running.  From the sound of it, you did not,
since you used pentium3 even though the pentium4 problems were fixed in
the most recent stable GCC.  You also should definitely have used a
"default" Gentoo install with no changes made, so that the default
profile setup was what got tested.  You could have researched your
optimizations from the GCC documentation rather than taking the word of
a bunch of "armchair compiler experts" on the forums.  No offense meant
to anyone, but you mention below that you do much scientific work, yet
you followed a very poor scientific model and documented your research
poorly for this article, which is why it has been torn apart so
thoroughly.  Had you given out all of the information, even if it were
simply links to the files from within the article, it would have given
your article much more credibility.

> There seems to be quite a few myths about this test and people upset
> that months were not spent tuning gentoo and every effort made to
> cripple the competition! (one person even suggested the faulty ide cable
> should have been left in the debian box, as that was the way it was
> delivered!)  Read the article, and if you need extra information to
> reproduce it, email me or or the author (Indy).  It is reproducable - if
> you can obtain the same hardware - I would be very interested if someone
> has this and the time to really go into the why these results occurred
> in more detail than I had the chance to.

The same machine should have been used for the testing, rather than
three machines.  This alone is reason enough to discount your data.
Three different machines WILL have three different levels of
performance.

> and why was this the result?  Daniel Robbins suggested on this list that
> gentoo-sources may be the problem, but tests on another machine (we had
> the trial machines for only a couple of days, all of which time was used
> to build gentoo right up until I ctrl-c'd the OO build so we could do
> the tests before handing the hardware back) showed that turning off
> pre-empt and low-latency had zero effect, but changing to an open-mosix
> kernel 2.4.20  was ~10% slower (no thread export).  It seemed to come

I agree with Daniel on some of this.  The default Gentoo kernel is not
the fastest out there; it is the most feature-rich, to meet the various
needs of our user base.  I do agree that this kernel should have been
used rather than any other.  Also, preempt and low-latency improve
interactivity, not raw performance.  Their effects are not easily
quantifiable.  If you want to test them, I suggest you look into ConTest
(http://members.optusnet.com.au/ckolivas/kernel/), which was designed
for testing this sort of thing.

> down to the fact we used -O3 instead of -O2 (think spider might have
> suggested this ?)- in effect over-optimised, and we didnt have a chance
> to correct. From my perspective, most of the "he should have used ...

No, you definitely "should have used" -O2 rather than -O3.  Also,
-fomit-frame-pointer and -mfpmath=sse would have given dramatic
improvements.  I'm not going to go into any other optimizations because
the rest are essentially very specific to the hardware/software being
used.  I think these are the only "sensible" extra defaults that can be
used on a machine with SSE.

> may actually have made performance even worse! And besides the time
> issue, these were supposedly the safe, reccomended flags so we went with
> them.  Please note that even Mandrake made only a slight gain on debian,
> so 386.586/686 does not make a lot of difference in real world tasks
> (the original aim of the tests) - the tests did tasks that particular

386, 586 and 686 make little difference compared to 386, 586 and
pentium4, which is how the comparison should have been run.

> people used linux for in their day-to-day work - no special tests, so no
> special bias.  Yes, I could choose tests that make gentoo shine, or
> debian, or windowsXP.  But I dont do those tests every day, whilst that
> spreadsheet was/is used as part of my normal work.  And its the same
> with the other tests.

I actually agreed with most of your tests.  You had a hard time, being
so time-constrained.  Honestly, were I in your position, I would not
have made this report at all unless I had a MUCH longer time to test
things.  You should look into the kinds of testing that many of the
hardware sites out there use.  They tend to take WEEKS on a single
article.  It doesn't take their full attention that entire time.  After
all, there's only so much interaction you need to do when running a
script which performs hundreds of actions and logs results to a file.
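
To make the idea of such a script concrete, here is a minimal sketch,
assuming each test can be started as a command line.  The function
names, the CSV layout and the choice of reporting the fastest run are
all assumptions made for illustration; this is not the harness any
particular hardware site uses.

```python
import csv
import subprocess
import sys
import time


def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True aborts the benchmark if the command itself fails.
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def run_benchmarks(tasks, log_path, runs=3):
    """Time every task, append one CSV row per run, return the best times.

    `tasks` maps a task name to the argv list that starts it.
    """
    best = {}
    with open(log_path, "a", newline="") as log:
        writer = csv.writer(log)
        for name, argv in tasks.items():
            timings = time_command(argv, runs)
            for run_number, seconds in enumerate(timings, start=1):
                writer.writerow([name, run_number, f"{seconds:.6f}"])
            best[name] = min(timings)
    return best


# Placeholder workload: start the Python interpreter and do nothing.
# A real run would list the actual test commands here instead.
placeholder_tasks = {"noop": [sys.executable, "-c", "pass"]}
```

Calling run_benchmarks(placeholder_tasks, "results.csv") would append
three rows for the placeholder task and return its fastest time.
Logging every run rather than only an average keeps the raw data
available for anyone who wants to check the conclusions.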

> So how many gentoo systems out there have every possible optimisation in
> the book, and are actually running slower than ideal?  This is a real

I use quite a few optimizations, which I benchmarked on my machine with
my application/data set; the combination is the fastest I was able to
come up with.  I have actually turned OFF quite a few of the
optimizations recommended by many of the "armchair compiler experts" out
there because they either provided little to no improvement or actually
decreased performance.  I really don't care if something is 0.001%
faster if it takes 400% as long to compile, especially as a developer
who compiles quite a bit of stuff several times over.

> problem, and I will be interested in how the cflags projects around
> handle this, as most seem to aim at setting the maximum possible flags:
> not actually tune the system for the ones that work best/most stably.  A
> live benchmark test might be more appropriate.

I agree 100% here.

> Most posts on irc and lists have settled down to "he doesnt know what
> he's doing" (I do), or the tests were unfair to gentoo (they werent, but
> then the same criteria were met by all 3 systems, but with some question
> marks over debian because of its mix - some packages had to be compiled
> locally, not binary) - but the thrust of the article was not that gentoo
> was a dud, but that this was the result within the criteria and time we
> were given, not what we expected, so we need to find out why.  Also note
> that this was not intentionally a debian/mandrake/gentto distro test.

Not being able to tune Gentoo essentially means you did not participate
in the "Gentoo Approach" but rather kludged it together fairly untuned
and pitted it against a tuned binary installation and Debian.

> We may be getting a P4 hyperthreaded system to play with, but under
> different rules, where I get to do a bit of tuning first.  I have
> already built the core system on another machine using gcc-3.2.3,
> "-march=pentium4 -O3 -pipe -fomit-frame-pointer"  I note that the
> pentium4 warning still appears in make.conf, though I believe it no
> longer applies to this gcc.

It does not apply to the newest stable GCC, so you are correct.

> A while ago I emailed this list and asked for information on tests and
> settings for HT P4's, without a reply.  So again, has anyone done any
> tests on a HT P4 and is willing to support the flags they chose as being
> "the best"?  In particular, does -ffast-math give a measurable gain?

There is not much to do for HT, as Linux treats it as an SMP machine.
All you really do is enable SMP and make sure you use ACPI in the
kernel.  The default Gentoo kernel does not have many of the HT
scheduling changes which have gone into the 2.6_test kernels.  There
are backports for these, but I would consider that going a bit
overboard: hand-patching your kernel sources would yield better results
on all three systems, so the kernels should be left alone.  After all,
you're wanting to test the results of the three systems, not of your
hand-made kernel.  If you were to decide to use another kernel, I would
say to use the latest vanilla kernel and possibly the latest 2.6_test
kernel on each distribution, using the exact same .config, to see how
much difference the kernel makes in performance.  You should not use
-ffast-math in anything as a default, as it can cause math errors which
should not be introduced into a stable system.
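
A small illustration of the kind of math error meant here, under stated
assumptions: current GCC documentation describes -ffast-math as allowing
floating-point operations to be reassociated, among other things.  The
sketch below is plain Python, not code built with GCC, and the function
names are made up for the example.  It only shows that the same three
numbers added in a different order give a different answer; whether a
given GCC version actually reorders a given expression is not something
it demonstrates.

```python
def sum_as_written(a, b, c):
    """Evaluate (a + b) + c, the order written in the source."""
    return (a + b) + c


def sum_reassociated(a, b, c):
    """Evaluate a + (b + c), an order a reassociating compiler may pick."""
    return a + (b + c)


big, small = 1e17, 1.0

# (1e17 - 1e17) + 1.0: the large values cancel first, so 1.0 survives.
print(sum_as_written(big, -big, small))    # 1.0

# 1e17 + (-1e17 + 1.0): 1.0 is lost when added to -1e17, leaving 0.0.
print(sum_reassociated(big, -big, small))  # 0.0
```

For a spreadsheet or scientific workload the difference between 1.0 and
0.0 is not a rounding detail, which is the reason for keeping such flags
out of the defaults.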

> Most of my machines have been built as scientific stations, so accuracy
> is more important than ultimate speed, so this is one I have never
> tested.  I am not interested in the -O9 -max-everything kiddies who have
> been so vocal, but reasoned choices.

The -O9 kiddies are the "armchair compiler experts" I spoke of earlier.
They have no real knowledge of compilers or optimizations, but have
"heard from a friend" or "read on a forum" about it, so they think they
know it all.  I will gladly admit that I know little about compilers,
but I have taken the time to do actual benchmarks on my system to test
my various theories and have chosen what I feel to be the best
combinations for my own needs.

--
Chris Gianelloni
Developer, Gentoo Linux
