Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: Re: Re: Wow! KDE 3.5.1 & Xorg 7.0 w/ Composite
Date: Thu, 09 Feb 2006 00:20:28
Message-Id: pan.2006.02.09.00.17.14.495666@cox.net
In Reply to: Re: [gentoo-amd64] Re: Re: Wow! KDE 3.5.1 & Xorg 7.0 w/ Composite by Simon Stelling
Simon Stelling posted <43EA568D.6020307@g.o>, excerpted below, on
Wed, 08 Feb 2006 21:37:33 +0100:

> Duncan wrote:

>> I should really create a page listing all the little Gentoo admin scripts
>> I've come up with and how I use them. I'm sure a few folks anyway would
>> likely find them useful.
>>
>> The idea behind most of them is to create shortcuts to having to type in
>> long emerge lines, with all sorts of arbitrary command line parameters.
>> The majority of these fall into two categories, ea* and ep*, short for
>> emerge --ask <additional parameters> and emerge --pretend ... . Thus, I
>> have epworld and eaworld, the pretend and ask versions of emerge -NuDv
>> world, epsys and easys, the same for system, eplog <package>, emerge
>> --pretend --log --verbose (package name to be added to the command line so
>> eplog gcc, for instance, to see the changes between my current and the new
>> version of gcc), eptree <package>, to use the tree output, etc.
>
> Interesting. But why do you use scripts and not simple aliases? Every time you
> launch your script the HD performs a seek (which is very expensive in time),
> copies the script into memory and then forks a whole bash process to execute a
> one-liner. Using alias, which is a bash built-in, wouldn't fork a process and
> therefore be much faster.

My thinking, which is possibly incorrect (your input appreciated), is that
file-based scripts get pulled into cache the first time they are executed,
and will remain there (with a gig of memory) pretty much until I'm done
doing my upgrades. At the same time, they are simply in cache, not
something in bash's memory, so if the memory is needed, it will be
reclaimed. As well, after I'm done and on to other tasks, the cached
commands will eventually be replaced by other data, if need be.

Aliases (and bash functions) are held in memory. That's not as flexible
as cache in terms of being knocked out of memory if the memory is needed
by other things. Sure, that memory may be flushed to disk-based swap, but
that's disk-based the same as the actual script files I'm using, so
reading it back into main memory if it's faulted out will take something
comparable to the time it'd take to read in the script file again anyway.
That's little gain, with the additional overhead and therefore loss of
having to manage the temp-copy in swapped memory, if it comes to that.
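
For concreteness, here's roughly what one of those wrappers looks like as
a standalone script versus the alias form you suggest. The flags are the
-NuDv set spelled out long-form; the file name and location are just
illustrative, not my exact setup:

    #!/bin/bash
    # ~/bin/eaworld (illustrative): ask-mode world update
    exec emerge --ask --newuse --update --deep --verbose world

    # the alias equivalent, e.g. in ~/.bashrc:
    alias eaworld='emerge --ask --newuse --update --deep --verbose world'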

Actually, there are some details here that may affect things. I don't
know enough about the following factors to be able to evaluate how they
balance out, but the real reason I chose individual scripts is below.

One, here anyway, tho not on most systems, I'm running four SATA disks in
RAID. The swap is actually not on the RAID, as the kernel manages it like
RAID on its own, provided all four swap areas are set to the same priority
(they are), which means swap is running on the equivalent of
four-way-striped RAID-0. Meanwhile, the scripts, as part of my main
system, are on RAID-6 for redundancy, so with the same four disks backing
the RAID-6 as the swap, I've only effectively two-way-striped storage
there, the other two disk stripes being parity. Thus, retrieval from the
4-way-striped swap should in theory be more efficient than from the
2-way-striped regular storage. OTOH, the granularity of the stripe
in either case, against the size of the one- or two-line script, likely
means that it'll be pulled from a single stripe (at the speed of
reading from a single disk, tho there are parallelizing opportunities
not available on a single disk). It's also likely that the swap will be
more optimally managed for fast retrieval than the location on the regular
filesystem is. Balanced against that we have the overhead of maintaining
the swap tracking.
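
For reference, the equal-priority bit is just a pri= option on each swap
line in /etc/fstab; the device names below are illustrative, not my
actual partitions:

    /dev/sda2   none   swap   sw,pri=1   0 0
    /dev/sdb2   none   swap   sw,pri=1   0 0
    /dev/sdc2   none   swap   sw,pri=1   0 0
    /dev/sdd2   none   swap   sw,pri=1   0 0

With the priorities equal, the kernel round-robins pages across all four
areas, which is what gives the RAID-0-like striping effect.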

That's assuming it would swap that out to the dedicated swap in the first
place. I'm not familiar with Linux's VM, but given that the aliases and
functions would be file-based in either case, it's possible it would
simply drop the data from main memory, relying on the fact that the
data is clean file-backed data and could be read in directly from the
files again, if necessary, rather than bothering with actually creating a
temporary copy of the /same/ data in swap, taking time to do so when it
could just read it back in from the file.

Another aspect is the effect of data vs. metadata caching. Again, I'm not
familiar with how Linux manages this, and indeed, it may differ between
filesystems, but the idea is that if the file metadata is still cached,
even if the file itself isn't, getting the data back in is a single disk
seek and read. If the location metadata has been flushed as well (or on
the initial read), it takes multiple seeks and reads instead, walking the
logical directory structure and fetching each directory table in the
hierarchy until the entry that actually holds the file's location is
reached, before the file itself can be read. (Back several years ago on
MSWormOS, one of the first things I always did after a reinstall was set
the system to the server profile, which kept a far larger metadata cache,
on the theory that the metadata was usually smaller than the data and,
for dirs, sharable among many data files, so I'd rather spend cache
memory on metadata than data. The other choices were the default desktop
profile, and a laptop profile with a much smaller metadata cache. I
originally learned about these as a result of reading about a bug in the
original 95 as shipped, which swapped some entries in the registry and
therefore cached FAR less metadata than it should have. I don't know
where these tweaks are located on Linux, or how to go about adjusting
them safely.)
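
If the knob is what I think it is, it's vm.vfs_cache_pressure, which
biases reclaim between the dentry/inode (metadata) caches and the page
cache. Something like the following, tho I haven't experimented with it
myself, so take the value as an example only:

    # lower than the default of 100 = keep metadata caches around longer
    echo 50 > /proc/sys/vm/vfs_cache_pressure
    # or persistently, in /etc/sysctl.conf:
    vm.vfs_cache_pressure = 50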

Basically, therefore, I don't believe aliases to be a big positive, and
possibly somewhat of a negative, as opposed to scripts, because the
scripts will be cached in most cases after initial use anyway, yet they
have the advantage of not having to be maintained or tracked in memory
when I'm doing other tasks and the system needs that cache.

Given that I don't believe it's a big positive, I prefer the
administrative convenience and maintainability of separate scripts.

There /is/ a third alternative, one I came across recently, that I think
is a good idea. If you'd comment, perhaps it would help me sort out the
implications.

The idea, simply put, is "bash command theming": single scripts that can
be invoked to "theme" a command prompt for the tasks at hand. I didn't
read the entire article I saw covering this, but skimmed it enough to get
the gist. A single invokable script for each set of tasks, say perl
programming, bash programming, working with portage, etc., that would
set up a specific set of aliases and functions for that task. Invoking
the script with the "off" parameter would erase that set of aliases and
bash functions, thereby recovering the memory, and do any related cleanup
like resetting the path if necessary to exclude any task-specific
commands. Taking this a step further, a variable could be set up listing
the theme or themes currently active, which the theme-setup script could
read so it automatically deactivates the previous theme while switching
to the new one. One could even share functionality between themes,
sourcing common files that check which theme is active and adjust their
behavior accordingly.
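
As a rough sketch of the idea (the name porttheme and the specific
aliases are made up for illustration; note that such a script has to be
sourced, or written as a shell function, so the aliases land in the
current shell rather than a throwaway subshell):

    # sourced as:  . porttheme on    or    . porttheme off
    case "$1" in
        on)
            alias eaworld='emerge --ask -NuDv world'
            alias epworld='emerge --pretend -NuDv world'
            ACTIVE_THEME=portage
            ;;
        off)
            unalias eaworld epworld 2>/dev/null
            unset ACTIVE_THEME
            ;;
    esac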

This alias and function theming wouldn't be quite as modular (tho with
sourcing it could be) as the individual scripts, but would maintain the
performance advantages (if any) of the alias/function idea, while at the
same time allowing the memory reclamation of the cached-script option. It
sounds really good, but I'm not yet convinced the benefits would be worth
the additional effort of setting up those themes, since the solution I
have works.

One VERY NICE benefit of the themes idea is that it would directly
address any namespace pollution concerns. It has a direct appeal to
programmers and anyone else who's ever had to deal with such issues, for
that reason alone. One single command on the path to invoke the theme,
possibly even an eselect-like command shared among themes, with
everything else off-path and out of the namespace unless that theme is
invoked! /VERY/ appealing indeed. OTOH, there are those who'll never
remember the theme they have active at the moment, and be constantly
confused. For these folks, it'd be a nightmare!

> man emerge:
> --oneshot (-1)
>
> IIRC --oneshot has a short form since 2.0.52 was released.

Learn new things every day. Thanks! I remember how pleased I was to have
--newuse, and even more so when I discovered -N, so very nice!
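
For the record, then, the long and short spellings side by side (libfoo
is just a stand-in package name):

    emerge --oneshot --ask libfoo
    emerge -1a libfoo            # same thing, short form
    emerge -avNuD world          # -N being the --newuse short form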

>> ... Deep breath... <g>
>>
>> All that as a preliminary explanation to this: Along with the above, I
>> have a set of efetch functions, that invoke the -f form, so just do the
>> fetch, not the actual compile and merge, and esyn (there's already an
>> esync function in something or other I have merged so I just call it
>> esyn), which does emerge sync, then updates the esearch db, then
>> automatically fetches all the packages that an eaworld would want to
>> update, so they are ready for me to merge at my leisure.
>
> I'm a bit confused now. You use *functions* to do that? Or do you mean
> scripts? By the way: with alias you could name your custom "script"
> esync because it doesn't place a file on the harddisk.

Scripts. I was using "functions" in the generic sense here. I did
realize before I sent it that the word had a dual meaning, but figured it
wasn't important enough a distinction to go back and correct, or explain.
Unfortunately, every time I decide to skip something like that, I get
called on it, which doesn't help my posts get any shorter! =8^)
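
To spell it out, the esyn script boils down to roughly the following
sequence (a sketch of what I described above, not the literal file;
eupdatedb is the esearch database updater):

    #!/bin/bash
    # sync the tree, refresh the esearch index, then pre-fetch
    # everything an eaworld run would want to update
    emerge --sync && \
    eupdatedb && \
    emerge --fetchonly --newuse --update --deep world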

>> I choose -Os, optimize for size, because a modern CPU and the various
>> cache levels are FAR faster than main memory.
>
> Given the fact that two CPUs, only differing in L2 Cache size, have
> nearly the same performance, I doubt that the performance increase is
> very big. Some interesting figures:
>
> Athlon64 something (forgot what, but shouldn't matter anyway) with 1 MB
> L2-cache is 4% faster than an Athlon64 of the same frequency but with only 512kB
> L2-cache. The bigger the cache sizes you compare get, the smaller the
> performance increase. Since you run a dual Opteron system with 1 MB L2
> cache per CPU I tend to say that the actual performance increase you
> experience is about 3%. But then I didn't take into account that -Os
> leaves out a few optimizations which would be included by -O2, the
> default optimization level, which actually makes the code a bit slower
> when compared to -O2. So, the performance increase you really experience
> shrinks to about 0-2%. I'd tend to proclaim that -O2 is even faster for
> most of the code, but that's only my feeling.

Interesting, indeed. I'd counter that it likely has to do with how many
tasks are being juggled as well, plus the number of kernel/user context
switches, of course. I wonder under what load, and with what task-type,
the above 4% difference was measured.

Of course, the definitive way to end the argument would be to do some
profiling and get some hard numbers, but I don't think either you or I
consider it an important enough factor in our lives to go to /that/ sort
of trouble. <g>

> Beside that I should mention that -Os sometimes still has problems with
> huge packages like glibc.

Interestingly enough, while Gentoo's glibc ebuilds strip the flags down
to -O2, I did try it with all that stripflags logic disabled. For glibc,
it /does/ seem to slow things down, or did back with gcc-3.3 (IIRC)
anyway. I tried the same glibc both ways. I would have tried tinkering
further, but decided it wasn't worth complicating debugging and the like,
since glibc is loaded by virtually everything, and I'd never be able to
tell if it was my funny tweaks to glibc, or some actual issue with
whatever package. Besides, that's an awfully costly package, in terms of
recompile time, not to mention system stability, to be experimenting
with. I /can/ say, however, that it didn't crash or cause any other
issues I could see or attribute to it.

OTOH, I haven't tried it with xorg-modular yet, but the monolithic xorg
builds seemed to perform better with -Os. I tried one of them (6.8??)
both ways too. I ended up routinely killing the stripflags logic, but I
was modifying other portions of the ebuild as well (so it compiled only
the ATI video driver, and only installed the 100-dpi fonts, not 75-dpi,
among other things), so that was just one of several modifications I was
making, tho the only real performance-affecting one. Performance in X
was better, but it DID take longer to switch to a VT, when I tried that.
In fact, at one point, the switch-to-VT functionality broke, but someone
mentioned it was broken in general at that point for certain drivers
anyway, so I'm not sure my optimizations had anything to do with it.

>> Of course, this is theory, and the practical case can and will differ
>> depending on the instructions actually being compiled. In particular,
>> streaming media apps and media encoding/decoding are likely to still
>> benefit from the traditional loop elimination style optimizations,
>> because they run thru so much data already, that cache is routinely
>> trashed anyway, regardless of the size of your instructions. As well,
>> that type of application tends to have a LOT of looping instructions to
>> optimize!
>>
>> By contrast, something like the kernel will benefit more than usual
>> from size optimization. First, it's always memory locked and as such
>> can't be swapped, and even "slow" main memory is still **MANY**
>> **MANY** times faster than swap, so a smaller kernel means more other
>> stuff fits into main memory with it, and isn't swapped as much. Second,
>> parts of the
>
> Funny to hear this from somebody with 4 GB RAM in his system. I don't
> know how bloated your kernel is, but even if -Os would reduce the size
> of my kernel to **the half**, which is totally impossible, it wouldn't
> be enough to load the mail I am just answering into RAM. So, basically,
> this reasoning is just ridiculous.

I won't argue with that. BTW, still at a gig, much to my frustration! I
put off upgrading memory when I decided my disk was in danger of going bad
and I ended up deciding to go 4-disk SATA based RAID. Then I upgraded my
stereo near Christmas... Now the CC is almost paid off again, so I'm
looking at that memory upgrade again.

Much to my frustration, memory prices don't seem to be dropping much
lately!

> You are referring a lot to the gcc manpage, but obviously you missed
> this part:
>
> -fomit-frame-pointer
> Don't keep the frame pointer in a register for functions that
> don't need one. This avoids the instructions to save, set up
> and restore frame pointers; it also makes an extra register
> available in many functions. It also makes debugging
> impossible on some machines.
>
> On some machines, such as the VAX, this flag has no effect,
> because the standard calling sequence automatically handles
> the frame pointer and nothing is saved by pretending it
> doesn't exist. The machine-description macro
> "FRAME_POINTER_REQUIRED" controls whether a target machine
> supports this flag.
>
> Enabled at levels -O, -O2, -O3, -Os.
>
> I have to say that I am a bit disappointed now. You seemed to be one of
> those people who actually inform themselves before sticking new flags
> into their CFLAGS.

??

I'm not sure which way you mean this. It was in my CFLAGS list, but I
didn't discuss it as it's fairly common (from my observation, nearly as
common as -pipe) and seems fairly non-controversial on Gentoo. Did you
miss it in my CFLAGS and are saying I should be using it, or did you see
it and are saying it's unnecessary and redundant because it's enabled by
-Os?

If the latter, yes, but as mentioned above in the context of glibc, -Os is
sometimes stripped. In that case, the redundancy of having the basic
-fomit-frame-pointer is useful, unless it's also stripped, but as I said,
it seems much less controversial than some flags and is often
specifically allowed where most are stripped.
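
In other words, the make.conf line looks something like this (the -march
value is just illustrative for this box; the point is that
-fomit-frame-pointer stays listed even tho -Os already implies it, as
insurance for ebuilds that strip -Os back out):

    # /etc/make.conf (illustrative, not my exact flags)
    CFLAGS="-march=k8 -Os -pipe -fomit-frame-pointer"
    CXXFLAGS="${CFLAGS}"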

Or, are you saying I should avoid it due to the debugging implications? I
don't quite get it.

>> !!! Relying on the shell to locate gcc, this may break
>> !!! DISTCC, installing gcc-config and setting your current gcc
>> !!! profile will fix this
>>
>> Another warning, likewise to stderr and thus not in the eis output.
>> This one is due to the fact that eselect, the eventual systemwide
>> replacement for gcc-config and a number of other commands, uses a
>> different method to set the compiler than gcc-config did, and portage
>> hasn't been adjusted to full compatibility just yet. Portage finds the
>> proper gcc just fine for itself, but there'd be problems if distcc was
>> involved, thus the warning.
>
> Didn't know about this. Have you filed a bug yet on the topic? Or is
> there already one?

There is one. I don't recall if I filed it or if it was already there,
but both JH and the portage folks know about the issue. IIRC, the portage
folks decided it was their side that needed to be changed, but that
required changes to the distcc package, and I don't know how that has
gone since I don't use distcc, except that I was slightly surprised to
see the warning in portage 2.1 still.
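
For anyone actually hitting that warning, the gcc-config fix it suggests
amounts to the following (the profile name here is only an example;
gcc-config -l shows what's really installed):

    gcc-config -l                           # list installed profiles
    gcc-config x86_64-pc-linux-gnu-3.4.5    # select one (example name)
    source /etc/profile                     # pick up the new environment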

>> MAKEOPTS="-j4"
>>
>> The four jobs is nice for a dual-CPU system -- when it works.
>> Unfortunately, the unpack and configure steps are serialized, so the
>> jobs option does little good, there. To make most efficient use of the
>> available cycles when I have a lot to merge, therefore, I'll run as
>> many as five merges in parallel. I do this quite regularly with KDE
>> upgrades like the one to 3.5.1, where I use the split KDE ebuilds and
>> have something north of 100 packages to merge before KDE is fully
>> upgraded.
>
> I really wonder how you would parallelize unpacking and configuring a
> package.

That's what was nice about configcache, which was supposed to be in the
next portage, but I haven't seen or heard anything about it for a while,
and the next portage, 2.1, is what I'm using. configcache seriously
shortened that stage of the build, leaving more of it parallelized, but...

I was using it for a while, patching successive versions of portage, but
it broke about the time sandbox was split out. The dev said he wasn't
maintaining the old version since it was going into the new portage, and
I tried updating the patch myself, but eventually ran into what I think
were unrelated issues, dropped it as one of my troubleshooting steps, and
never picked it up again.

I'd certainly like to have it back again, tho. If it's working in 2.1,
I've not seen it documented or seen any hints in the emerge output, as
there were before. You seen or heard anything?
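
As for the parallel merges themselves, there's nothing fancy to them:
pre-fetch everything, then give each of several terminals its own slice
of the pretend output. A rough illustration only, not my actual
procedure, with placeholder package names:

    # fetch everything up front so the parallel jobs don't fight over
    # the network
    emerge --fetchonly --newuse --update --deep world

    # terminal 1
    emerge --oneshot kde-base/konsole kde-base/kate
    # terminal 2, and so on
    emerge --oneshot kde-base/kwin kde-base/kicker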

BTW, what is your opinion on -ftracer? I've noticed several devs use it,
but the manpage says it's not that useful without active profiling, which
means compiling, profiling, and recompiling, AFAIK. It's possible the
devs running it do that, but I doubt it, and otherwise, I don't see that
it should be that useful. I don't know if you run it, but since I've got
your attention, I thought I'd ask what you think about it. Is there
something of significance I'm missing, or are they actually doing that
compile/profile/recompile thing? It just doesn't make sense to me. I've
seen it in several user-posted CFLAGS as well, but I'll bet a good
portion of them are simply there because they saw it in a dev's CFLAGS
and decided it looked useful, not because they understood any
implications stated in the manpage. (Not that I always do either,
but... <g>)
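
For reference, the compile/profile/recompile cycle the manpage alludes to
looks roughly like this with the GCC 4.x option names (the 3.x series
used -fprofile-arcs / -fbranch-probabilities instead); the source file
and workload are obviously just placeholders:

    gcc -O2 -fprofile-generate -o myapp myapp.c      # instrumented build
    ./myapp < typical-input.dat                      # representative run
    gcc -O2 -fprofile-use -ftracer -o myapp myapp.c  # rebuild with profile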

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html


--
gentoo-amd64@g.o mailing list
