[gentoo-desktop] Re: System problems - some progress - gentoo-desktop

From:	Duncan <1i5t5.duncan@×××.net>
To:	gentoo-desktop@l.g.o
Subject:	[gentoo-desktop] Re: System problems - some progress
Date:	Fri, 01 Apr 2011 03:24:21
Message-Id:	`pan.2011.04.01.03.22.06@cox.net`
In Reply to:	Re: [gentoo-desktop] Re: System problems - some progress by Lindsay Haisley

Lindsay Haisley posted on Sat, 26 Mar 2011 10:57:33 -0500 as excerpted:

> Yep, I know where you're coming from there.  Iptables isn't all that
> hard to understand, and I've become pretty conversant with it in the
> process of using it for my own and others' systems.  I'd always rather
> deal with the "under the hood" CLI tools than with some GUI tool that
> does little more than obfuscate the real issue.  That way lies Windows!

Indeed, the MSWindows way is the GUI way.  But I wasn't even thinking
about that.  I was thinking about the so-called "easier" firewalling
CLI/text-editing tools that have you initially answer a number of
questions to set up the basics, then have you edit files to do any
"advanced" tweaking the questions didn't have the foresight to cover.

But my (first) problem was that while I could answer the questions easily
enough, I lacked sufficient understanding of the real implementation to
properly do the advanced editing.  And if I were to properly dig into
that, I might as well have mastered the IPTables/Netfilter stuff on which
it was ultimately based in the first place.

The other problem, when building your own kernel, was that the so-called
simpler tools apparently expect all the necessary Netfilter/IPTables kernel
options to be available as pre-built modules (or built-in) -- IOW, they're
designed for the binary distributions where that's the case.  Neither the
questions nor the underlying config file comments mentioned their kernel
module dependencies.  One had to either pre-build them all and hope they
got auto-loaded as needed, or delve into the scripts to figure out the
dependencies and build/load the required modules.
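
A minimal sketch of the "figure out the dependencies" step, assuming you
have the text of a kernel .config to hand.  The option shortlist in WANTED
is only an illustrative assumption: which options a given firewall script
really needs depends on the rules it generates, and the function names
here are made up for the example.

```python
import re

# Hypothetical shortlist, for illustration only.  The option names are real
# kernel config symbols, but the set a given tool needs is an assumption.
WANTED = (
    "CONFIG_NETFILTER",
    "CONFIG_NF_CONNTRACK",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_IP_NF_FILTER",
)

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text):
    """Map option name -> its value ('y', 'm', ...) or 'n' if explicitly unset."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            state[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            state[match.group(1)] = "n"
    return state


def report(text, wanted=WANTED):
    """Return {option: 'built-in' | 'module' | 'missing'} for each wanted option."""
    state = parse_kernel_config(text)
    labels = {"y": "built-in", "m": "module"}
    # Options absent from the file are treated the same as unset ones.
    return {opt: labels.get(state.get(opt, "n"), "missing") for opt in wanted}
```

Feed it the contents of the .config in your kernel source tree.  Anything
reported as "module" still has to be loaded (or auto-loaded) before the
rules that depend on it will work, which is exactly the part the
question-and-answer tools never mentioned.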

Now keep in mind that I first tried this on Mandrake, where I was building
my own kernel within 90 days of first undertaking the switch, while I was
still booting to MS to do mail and news in MSOE, because I hadn't yet had
time to look at user level apps well enough to make my choices and set
them up.  So it's certainly NOT just a Gentoo thing.  It's a
build-your-own-kernel thing, regardless of the distro.

The problem ultimately boiled down to having to understand IPTables itself
well enough to know what kernel options to enable, either built-in or as
modules which would then need to be loaded.  But if I were to do that, why
would I need the so-called "easier" tool, which only complicated things?
Honestly, the tools made me feel like I was trying to remote-operate some
NASA probe from half-way-across-the-solar-system, latency and all, instead
of using the direct-drive, since what I was operating on was actually
right there next to me!

At that time I simply punted.  I had (or could have and did have, by
(wise) choice on MS) a NAPT based router between me and the net anyway,
and already knew how to configure /it/.  So I just kept it and ran the
computer itself without a firewall for a number of years.  Several years
later, after switching to Gentoo, when I was quite comfortable on Linux in
general, I /did/ actually learn netfilter/iptables, configure my computer
firewall accordingly, and direct-connect for a year or two -- until my
local config changed and I actually had the need for a NAPT device as I
had multiple local devices to connect to the net.

Which brings up a nice point about Gentoo.  With Mandrake (and most other
distributions of the era, from what I read), there were enough ports open
by default that having a firewall of /some/ sort, either an on-lan NAPT
device or a well configured on-computer IPChains/IPTables one, was wise.
IOW, keeping that NAPT device was a good choice, even if it /was/ an
MS-based view of things, because the Linux distros of the time still ran
with various open ports.  (Whether they still do or not I don't know.  I
suspect most do, tho they probably do it with an IPTables firewall up now
too.)

Gentoo's policy by contrast has always (well, since before early 2004,
when I switched to it) been:

1) Just because it's installed does NOT mean it should have its initscript
activated so it runs automatically in the default runlevel -- Gentoo ships
by default with the initscripts for net-active services in /etc/init.d,
but does NOT automatically add them to the default runlevel.

2) Even when a net-active service IS activated, Gentoo's default
configuration normally has it active on the loopback localhost address
only.

3) Gentoo ships X itself with TCP listening disabled, only the local Unix
domain socket active.

As such, by the time I actually got around to learning IPTables/netfilter
and setting it up on my Gentoo box, it really wasn't as necessary as it
would be on other distributions, anyway, because firewall or no firewall,
the only open ports were ports I had deliberately opened myself and thus
already knew about.
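
One hedged way to check the "only ports I opened myself" claim on a Linux
box is to read /proc/net/tcp and list the listening sockets that are NOT
bound to loopback.  The sketch below covers IPv4 TCP only, assumes a
little-endian host (the address field is stored in host byte order), and
the function names are made up for the example.

```python
import socket
import struct

LISTEN = "0A"  # TCP state code for LISTEN in /proc/net/tcp


def decode_addr(field):
    """Turn '0100007F:0277' into ('127.0.0.1', 631).

    Assumes a little-endian host, where the hex address is byte-swapped.
    """
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def exposed_listeners(proc_net_tcp_text):
    """Return (ip, port) for listening IPv4 TCP sockets not bound to loopback."""
    found = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != LISTEN:
            continue
        ip, port = decode_addr(fields[1])
        if not ip.startswith("127."):
            found.append((ip, port))
    return found
```

Call it as exposed_listeners(open("/proc/net/tcp").read()).  An empty list
means nothing is listening on a non-loopback IPv4 address; anything bound
to 0.0.0.0 shows up as reachable from the net, firewall permitting.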

But of course defense in depth is a VERY important security principle,
correlating as it does with the parallel "never trust yourself not to
fat-finger SOMETHING!"  (Now, if the so-called security services, HBGary
et al., only practiced it! ...  I think that's what galled most of the
world most: not that they screwed up a couple things so badly, but that
they so blatantly violated basic defense-in-depth.  Had the proper layers
of defense been there, the screw-ups wouldn't have amounted to anything
and we'd have never read about them in the first place... and for a
SECURITY firm, no less, to so utterly and completely miss it!)  So
regardless of the fact that in theory I didn't actually need the firewall
by then, since the only open ports were the ones I intended to be open, I
wasn't going to run direct-connected without /some/ sort of firewall, and
I learned and activated IPTables/netfilter before I did direct-connect.
And now that I have NAPT again, I still keep the on-computer firewall
running, as that's simply another layer of that defense in depth.  I use
the NAPT router for multiplexing several devices on a single IP, not for
its originally accidental side-effect of inbound firewalling, tho again, I
keep that too as yet another layer.  I just don't /count/ on it.

>> Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
>> really do /not/ care about your data integrity or are going to the
>> extreme and already have data=journal, DEFINITELY specify data=ordered,
>> both in your mount options, and by setting the defaults via tune2fs.
>
> So does this turn off journaling?  What's a good reference on the
> advantages of ext4 over ext3, or can you just summarize them for me?

No, this doesn't turn off journaling.

Briefly...

There's the actual data, the stuff in the files we care about, and
metadata, the stuff the filesystem tracks behind the scenes so we don't
have to worry about it.  Metadata includes stuff like the filename, the
dates (create/modify/access, the latter of which isn't used that much any
more and is often disabled), permissions (both traditional *ix
set*/user/group/world and if active SELinux perms, etc), INODE AND
DIRECTORY TABLES (most important in this context, thus the CAPS, as
without them, your data is effectively reduced to semi-random binary
sequences), etc.

It's the metadata, in particular the inode and directory tables, that fsck
concerns itself with.  That's what is potentially damaged in the event of
a chaotic shutdown, and what fsck checks and tries to restore on remount
after such a shutdown.

Because the original purpose of journaling was to shortcut the long fscks
after a chaotic shutdown, traditionally it concerns itself only with
metadata.  In practice, however, due to reordered disk operations at both
the OS and disk hardware/firmware level, the result of a recovery with
strict metadata-only journaling can be perfectly restored filesystem
metadata, but with incorrect real DATA in those files, because the
metadata was already written to disk but the data itself hadn't been, at
the time of the chaotic shutdown.

Due to important security implications (it's possible that the blocks now
assigned to that inode previously held an unlinked but not secure-erased
file belonging to another user, UNAUTHORIZED DATA LEAK!!!), such restored
metadata-only files, where the data itself is questionable, are normally
truncated to zero-length.  Thus the post-restore zero-length "empty" file
phenomenon, common with early journaled filesystems and still occasionally
seen today.

The data= journaling option controls data/metadata handling.

data=writeback is "bare" metadata journaling.  It's the fastest but
riskiest in terms of real data integrity for the reasons explained above.
As such, it's often used where performance matters more than strict data
integrity in the event of chaotic shutdown -- where data is backed up and
changes since the backup tend to be trivial and/or easy to recover, where
the data's easily redownloaded from the net (think the gentoo packages
tree, source tarballs, etc), and/or where the filesystem is wiped at boot
anyway (as /tmp is in many installations).  Zeroed out files on recovery
can and do happen in writeback mode.

data=ordered is the middle ground, "good enough" for most people, both in
performance and in data integrity.  The system ensures that the commit of
the real data itself is "ordered" before the metadata that indexes it,
telling the filesystem where it's located.  This comes at a slight
performance cost as some write-order-optimization must be skipped, but it
GREATLY enhances the integrity of the data in the event of a chaotic
shutdown and subsequent recovery.  There are corner-cases where it's still
possible at least in theory to get the wrong behavior, but in practice,
these don't happen very often, and when they do, the loss tends to be that
of reverting to the pre-update version of the file, losing only the
current edit, rather than zeroing out of the file (or worse yet, data
leakage) entirely.

data=journal is the paranoid option.  With this you'll want a much larger
journal, because not only the metadata, but the data itself, is
journaled.  (And here most people thought that's what journaling did /all/
the time!)  Because ALL data is ultimately written TWICE in this mode,
first to the journal and then from there to its ultimate location, writes
are roughly a factor of two slower.  But provided the hardware is working
correctly, the worst-case in a chaotic shutdown is loss of the current
edit, reverting to the previous edition of the file.
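
To see which of the three modes is explicitly requested where, a short
sketch like the following can scan fstab or /proc/mounts style text.  It
only reports data= options that literally appear in the option string; a
result of None means nothing was specified there, so the kernel or
superblock default applies.  The function names are made up for the
example.

```python
def data_mode(options):
    """Return the explicit data= value from a comma-separated option string, or None."""
    for opt in options.split(","):
        key, sep, value = opt.strip().partition("=")
        if key == "data" and sep:
            return value
    return None


def journal_modes(mounts_text, fstypes=("ext3", "ext4")):
    """Map mount point -> explicit data= mode (or None) for ext3/ext4 entries.

    Expects fstab or /proc/mounts style lines:
    device, mount point, filesystem type, options, ...
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank lines, short lines and comments
        if fields[2] in fstypes:
            result[fields[1]] = data_mode(fields[3])
    return result
```

Run against /etc/fstab, any ext3/ext4 entry that comes back None is one
where you're trusting whatever default your kernel happens to ship.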

FWIW and rather ironically, my original understanding of all this came
from a series of IBM DeveloperWorks articles written in the early kernel
2.4 series era, explaining the main filesystem choices, many of them then
new, available in kernel 2.4.  While the performance data and some
filesystem implementation detail are now somewhat dated (and ext4 and
btrfs aren't mentioned, as this was before their time), the theory and
general filesystem descriptions remain solid, and the series remains a
reasonably good intro to Linux filesystems to this day.  Parts of it are
still available as archived copies in the Gentoo Documentation.  In
particular, two parts covering ext3 and the data= options remain available:

http://www.gentoo.org/doc/en/articles/afig-ct-ext3-intro.xml
http://www.gentoo.org/doc/en/articles/l-afig-p8.xml

The ironic bit is who the author was, one Daniel Robbins, the same DRobbins
who founded the then Enoch Linux, now Gentoo.  But I read them long before
I ever considered Gentoo, when I was first switching to Linux and using
Mandrake.  It was thus with quite some amazement a number of years later,
after I'd been on Gentoo for awhile, that I discovered that the *SAME*
DRobbins who founded Gentoo (and was still active tho on his way out in
early 2004 when I started on Gentoo), was the guy who wrote the Advanced
Filesystem Implementor's Guide in IBM DeveloperWorks, the guide I'd found
so *INCREDIBLY* helpful years before, when I hadn't a /clue/ who he was or
what distribution I'd choose years later, as I was just starting with
Mandrake and trying to figure out what filesystems to choose.

As to the ext3/ext4 differences... AFAIK the (second) biggest one is that
ext4 uses extents by default, thus fragmenting files somewhat less over
time.  Extents are a subject worth their own post, which I won't attempt
as while I understand the basics I don't understand all the implications
thereof myself.  But one effect is better efficiency in filesystem layout,
when the filesystem was created with them anyway... it won't help old
files on a filesystem upgraded to ext4 from ext2/3 that much.  Google's
available for more. =:^)

There are a lot of smaller improvements as well.  ext4 is native
large-filesystem by default.  A number of optimizations discovered since
ext3 are implemented in ext4 that can't be in ext3 for stability and/or
old-kernel backward compatibility reasons.  ext4 has a no-journal option
that's far better on flash-based thumb-drives, etc.  There are a number of
options that can make it better on SSDs and flash in general than ext3.

And the biggest advantage is that ext4 is actively supported in the kernel
and supports ext2/3 as well, while ext2/3, as separate buildable kernel
options, are definitely considered legacy, with talk, as I believe I
mentioned, of removing them as separate implementations entirely, relying
on ext4's backward compatibility for ext2/3 support.  In that regard, ext3
as a separate option is in worse shape than reiserfs, since it's clearly
legacy and targeted for removal.  As part of ext4, ext3 support will
*DEFINITELY* continue for YEARS, more likely DECADES, so it's in no danger
in that regard (even less than reiserfs support, which will itself
continue for at least years).  But the focus is definitely on ext4 now,
and as ext3 becomes more and more legacy, the chances of corner-case bugs
appearing in ext3-only code in the ext4 driver do logically increase.  In
that regard, reiserfs could actually be argued to be in better shape,
since it's not implemented as a now out-of-focus older-brother to a
current filesystem.  So while it has less focus in general, it also has
fewer chances of being accidentally affected by a change to the
current-focus code.

Which can be argued to have already happened with the default ext3
switching to data=writeback for a number of kernels, before being switched
back to the data=ordered it always had before.  A number of kernels ago
(2.6.29 IIRC), ext4 was either officially just out of or being discussed
for bringing out of experimental.  I believe it was Ubuntu that first made
it a rootfs system install option, in that same time period.  Shortly
thereafter, a whole slew of Ubuntu-on-ext4 users began experiencing the
classic "zeroed out file" problems on reboot after chaotic shutdowns.  It
turned out later that most of them were using the closed nVidia driver,
which was unstable in that version against that Ubuntu version and kernel,
thus provoking many cases of "chaotic shutdown", a classic worst-case
trial-by-fire test for the then still coming out of experimental ext4.

*Greatly* compounding the problem were some seriously ill-advised GNOME
config-file behaviors.  Apparently, they were opening config-files for
read-write simply to READ them and get the config in the process of
initializing GNOME.  Of course, the unstable nVidia driver was
initializing in parallel to all this, with the predictable-in-hindsight
results...  As GNOME was only READING the config values, it SHOULD have
opened those files READ-ONLY, if necessary later opening them read-write
to write new values to them.  As with the security defense-in-depth
mentioned in the HBGary parenthetical above, this is a pretty basic
filesystem principle, but the GNOME folks had it wrong.  They were opening
the files read/write when they only needed read, and the system was
crashing with them in that state.  As a result, these files were open for
writing in the crash, and as is standard security practice as explained
above, the ext4 journaling system, defaulting to writeback mode, restored
them as zeroed out files to prevent any possibility of data leak.
Actually, there were a few other technicalities involved as well (file
renaming on write, failure to call fsync, due in part to ext3's historic
bad behavior on fsync, which it treated as whole-filesystem-sync, etc),
but that's the gist of things.
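
For what the "right" pattern looks like in application code, here's a
minimal sketch: open read-only when you only need to read, and when you do
write, write a temp file, fsync it, and rename it over the original.  This
is a generic illustration of the rename-plus-fsync technique mentioned
above, not a description of what GNOME does now; read_config and
write_config are made-up names.

```python
import os
import tempfile


def read_config(path):
    """Read a config file.  Mode 'r' is read-only, so a crash while the
    file is open leaves no write pending against it."""
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()


def write_config(path, text):
    """Replace path with text via temp file + fsync + rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on
    # one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # push the data to disk before renaming
        os.replace(tmp, path)      # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With this pattern a crash leaves either the complete old file or the
complete new one under the real name, which is the behavior people had
come to expect from ext3's data=ordered without ever asking for it.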

So due to ext4's data=writeback and the immaturity of the filesystem such
that it didn't take additional precautions, these folks were getting
critical parts of their gnome config zeroed out every time they crashed,
and due to the unstable nVidia drivers, they were crashing frequently!!

*NOT* a good situation, and that's a classic understatement!!

The resulting investigation discovered not only the obvious gnome problem,
but several code tweaks that could be done to ext4 to reduce the
likelihood of this sort of situation in the future.

All fine and good, so far.  But they quickly realized that the same sort
of code tweak issues existed with ext3, except that because ext3 defaulted
to data=ordered, only those specifically setting data=writeback were
having problems, and because those using data=writeback were expected to
have /some/ problems anyway, the issues had been attributed to that and
thus hadn't been fully investigated and fixed, all these years.

So they fixed the problems in ext3 as well.  Again, all fine and good --
the problems NEEDED fixing.  *BUT*, and here's where the controversy comes
in, they decided that data=writeback was now dependable enough for BOTH
ext3 and ext4, thus changing the default for ext3.

To say that was hugely controversial is an understatement (multiple
threads on LKML, LWN, and elsewhere the issue was covered at the time,
often several hundred posts long each).  My feelings on data=writeback
should be transparent by now, so where I stand on the issue should be
equally transparent, but Linus nevertheless merged the commit that
switched ext3 to data=writeback by default, AFAIK in 2.6.31.  (AFAIK,
they discovered the problem in 2.6.29, 2.6.30 contained temporary
work-around-fixes, 2.6.31 contained the permanent fixes and switched ext3
to data=writeback.)

Here's the critical point.  Because reiserfs isn't so closely related to
the ext* family, it retained the data=ordered default it had gotten years
earlier, in the same kernel where Chris Mason committed the code for
reiserfs to do data=ordered at all.  ext3 got the change due to its
relationship with ext4, despite the fact that it's officially an old and
stable filesystem where arguably such major policy changes should not
occur.  If the separate kernel option for ext3 is removed in order to
remove the duplicate functionality already included in ext4 for backward
compatibility reasons, by definition, this sort of change to ext4 *WILL*
change the ext3 it also supports, unless deliberate action is taken to
avoid it.  That makes such issues far more likely to occur again in ext3
than in the relatively obscure reiserfs.

Meanwhile, as mentioned, with newer kernels (2.6.36, 37, or 38, IDR which,
tho it won't matter for those specifying the data= option either via
filesystem defaults using tune2fs, or via specific mount option), ext3
reverted again to the older and safer default, data=ordered.

And as I said, it's my firm opinion that the data= option has a stronger
effect on filesystem stability than any possibly remaining issues with
ext4, which is really quite stable by now.  Thus, ext3, ext4, or reiserfs,
I'd **STRONGLY** recommend data=ordered, regardless of whether it's the
default (as it is with old and new ext3, with a gap in between, and as it
has been for years with reiserfs) or not (I believe ext4 still defaults to
data=writeback).  If you value your data, "just do it!"

Meanwhile, I believe the default on the definitely still experimental
btrfs is data=writeback too.  While I plan on switching to it eventually,
you can be quite sure I'll be examining that default and as of this point,
have no intentions of letting it be data=writeback, when I do.

....

> The problem with Gentoo was that because EVMS was an orphaned project, I
> believe the ebuild wasn't updated.  The initrd file was specific for
> EVMS.

That's quite likely, indeed.

> Of course.  I like technology that _lasts_!  We have a clock in our
> house that's about 190 years old [...] turned me on to the Connecticut
> Clock and Watch museum, run by one George Bruno [who] also makes working
> replicas [and] was able to send me an exact replacement part!  Try
> _THAT_ with your 1990's era computer ;-)

That reminds me...  I skipped it as irrelevant to the topic at hand, but
due to kernel sensors and ACPI changes, I decided to try the last BIOS
upgrade available for this Tyan, after having run an earlier BIOS for some
years.  Along about 2.6.27, I had to start using a special boot parameter
to keep the sensors working, as apparently the sensor address regions
overlap ACPI address regions (not an uncommon issue in boards of that era,
the kernel folks say).  The comments on the kernel bug I filed suggested
that a BIOS update might straighten that out, so I decided to try it.  (It
didn't.  The BIOS is still too old and the board EOLed, even if it is
still working.)

The problem was that I had a bad memory stick.  Now the kernel has
detectors for that and I had them active, but the kernel drivers for that
were introduced long after I got the hardware.  While they were logging an
issue with the memory, they had been doing that ever since I activated
them, so I misinterpreted it as simply how they worked, and wasn't aware
of the bad memory they were trying to tell me about.

So I booted to the FreeDOS floppy I used for BIOS upgrades (I've used
FreeDOS for BIOS upgrades for years, without incident before this) and
began the process.

It crashed half-way thru the flash-burn, apparently when it hit that bad
memory!!

Bad situation, but there's supposed to be a failsafe direct-read-recover
mode built-in, that probably would have worked had I known about it.
Unfortunately I didn't, and by the time I figured it out, I'd screwed that
up as well.

But luckily I have a netbook, that I had intended to put Gentoo on but had
never gotten around to at that point (tho it's running Gentoo now, 2.6.38
kernel, kde 4.6.1, fully updated as of mid-March).  It was still running
the Linpus Linux it shipped with.  (It's the first full system I've bought
since my original 486SX25 w/ 2MB memory and 130 MB hard drive in 1993 or
so, and as I'd have sooner done without the netbook than pay the MS tax, I
DID have to order it from Canada and have it shipped to the US.)  I was
able to get online with that, grab a yahoo webmail account since my mail
logins were stuck on the main system without a BIOS, and use that to order
a new BIOS chip shipped to me, the target BIOS pre-installed.

That new BIOS chip rescued my system!

I suspect my feelings after that BIOS chip did the trick rather mirror
yours after that gear did the trick for your clock.  The computer might
not be 190 years old, but 2003 is old enough in computer years, and I
suspect I have rather more of my life wound up in that computer than you
do in that clock, 190 years old or not.

Regardless, tho, you'll surely agree,

WHAT A RELIEF TO SEE IT RUNNING AGAIN!  =:^)

--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
