Gentoo Archives: gentoo-desktop

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-desktop@l.g.o
Subject: [gentoo-desktop] Re: System problems - some progress
Date: Fri, 01 Apr 2011 03:24:21
Message-Id: pan.2011.04.01.03.22.06@cox.net
In Reply to: Re: [gentoo-desktop] Re: System problems - some progress by Lindsay Haisley
1 Lindsay Haisley posted on Sat, 26 Mar 2011 10:57:33 -0500 as excerpted:
2
3 > Yep, I know where you're coming from there. Iptables isn't all that
4 > hard to understand, and I've become pretty conversant with it in the
5 > process of using for my own and others' systems. I'd always rather deal
6 > with the "under the hood" CLI tools than with some GUI tool that does
7 > little more than obfuscate the real issue. That way lies Windows!
8
9 Indeed, the MSWindows way is the GUI way. But I wasn't even thinking
10 about that. I was thinking about the so-called "easier" firewalling CLI/
11 text-editing tools that have you initially answer a number of questions to
12 setup the basics, then have you edit files to do any "advanced" tweaking
13 the questions didn't have the foresight to cover.
14
15 But my (first) problem was that while I could answer the questions easy
16 enough, I lacked sufficient understanding of the real implementation to
17 properly do the advanced editing. And if I were to properly dig into
18 that, I might as well have mastered the IPTables/Netfilter stuff on which
19 it was ultimately based in the first place.
20
21 The other problem, when building your own kernel, was that the so-called
22 simpler tools apparently expect all the necessary Netfilter/IPTable kernel
23 options to be available as pre-built modules (or built-in) -- IOW, they're
24 designed for the binary distributions where that's the case. Neither the
25 questions nor the underlying config file comments mentioned their kernel
26 module dependencies. One either had to pre-build them all and hope they
27 either got auto-loaded as needed, or delve into the scripts to figure out
28 the dependencies and build/load the required modules.
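
A rough sketch of the kind of check the "easier" tools never offered: given the text of a kernel config, report whether a few Netfilter options are built in, modular, or missing. The option list is an illustrative assumption, not the full set any particular firewall script needs, and the helper name `netfilter_status` is made up for this example.

```python
import re

# Illustrative subset only; the options a given firewall script needs
# depend on which iptables matches and targets it actually uses.
WANTED = [
    "CONFIG_NETFILTER",
    "CONFIG_NF_CONNTRACK",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_IP_NF_FILTER",
]


def netfilter_status(config_text, wanted=WANTED):
    """Map each option to 'y' (built-in), 'm' (module) or None (not set)."""
    found = dict(re.findall(r"^(CONFIG_\w+)=([ym])$", config_text, re.MULTILINE))
    return {opt: found.get(opt) for opt in wanted}


if __name__ == "__main__":
    sample = (
        "CONFIG_NETFILTER=y\n"
        "CONFIG_NF_CONNTRACK=m\n"
        "# CONFIG_IP_NF_IPTABLES is not set\n"
    )
    for opt, state in netfilter_status(sample).items():
        print(opt, state or "missing")
```

On a real system the text would typically come from the kernel source tree's .config, or from /proc/config.gz if the running kernel was built to expose it.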
29
30 Now keep in mind that I first tried this on Mandrake, where I was building
31 my own kernel within 90 days of first undertaking the switch, while I was
32 still booting to MS to do mail and news in MSOE, because I hadn't yet had
33 time to look at user level apps well enough to make my choices and set
34 them up. So it's certainly NOT just a Gentoo thing. It's a build-your-
35 own-kernel thing, regardless of the distro.
36
37 The problem ultimately boiled down to having to understand IPTables itself
38 well enough to know what kernel options to enable, either built-in or as
39 modules which would then need to be loaded. But if I were to do that, why
40 would I need the so-called "easier" tool, that only complicated things.
41 Honestly, the tools made me feel like I was trying to remote-operate some
42 NASA probe from half-way-across-the-solar-system, latency and all, instead
43 of using the direct-drive, since what I was operating on was actually
44 right there next to me!
45
46 At that time I simply punted. I had (or could have and did have, by
47 (wise) choice on MS) a NAPT based router between me and the net anyway,
48 and already knew how to configure /it/. So I just kept it and ran the
49 computer itself without a firewall for a number of years. Several years
50 later, after switching to Gentoo, when I was quite comfortable on Linux in
51 general, I /did/ actually learn netfilter/iptables, configure my computer
52 firewall accordingly, and direct-connect for a year or two -- until my
53 local config changed and I actually had the need for a NAPT device as I
54 had multiple local devices to connect to the net.
55
56 Which brings up a nice point about Gentoo. With Mandrake (and most other
57 distributions of the era, from what I read), there were enough ports open
58 by default that having a firewall of /some/ sort, either on-lan NAPT
59 device or well configured on-computer IPChains/IPTables based, was wise.
60 IOW, keeping that NAPT device was a good choice, even if it /was/ an MS-
61 based view of things, because the Linux distros of the time still ran with
62 various open ports (whether they still do or not I don't know, I suspect
63 most do, tho they probably do it with an IPTables firewall up now too).
64
65 Gentoo's policy by contrast has always (well, since before early 2004,
66 when I switched to it) been:
67
68 1) Just because it's installed does NOT mean it should have its initscript
69 activated so it runs automatically in the default runlevel -- Gentoo ships
70 by default with the initscripts for net-active services in /etc/init.d,
71 but does NOT automatically add them to the default runlevel.
72
73 2) Even when a net-active service IS activated, Gentoo's default
74 configuration normally has it active on the loopback localhost address
75 only.
76
77 3) Gentoo ships X itself with TCP listening disabled, only the local Unix
78 domain socket active.
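
To make the "loopback only" point concrete, here is a small sketch that takes a list of listening (address, port) pairs and keeps only those reachable from beyond the local host. The function name and the sample data are assumptions for illustration; gathering the real list of listeners (from netstat, ss, or similar) is left out.

```python
import ipaddress


def exposed_listeners(listeners):
    """Return the (address, port) pairs that are not bound to loopback."""
    return [
        (addr, port)
        for addr, port in listeners
        if not ipaddress.ip_address(addr).is_loopback
    ]


if __name__ == "__main__":
    # Hypothetical sample: two loopback-only services, one bound to all
    # IPv4 addresses.
    sample = [("127.0.0.1", 631), ("::1", 25), ("0.0.0.0", 22)]
    print(exposed_listeners(sample))
```

With a default configuration of the sort described above, the result would be expected to contain only services that were opened deliberately.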
79
80 As such, by the time I actually got around to learning IPTables/netfilter
81 and setting it up on my Gentoo box, it really wasn't as necessary as it
82 would be on other distributions, anyway, because firewall or no firewall,
83 the only open ports were ports I had deliberately opened myself and thus
84 already knew about.
85
86 But of course defense in depth is a VERY important security principle,
87 correlating as it does with the parallel "never trust yourself not to fat-
88 finger SOMETHING!" (Now, if the so-called security services, HBGary et
89 al., only practiced it! ... I think that's what galled most of the world
90 most, not that they screwed up a couple things so badly, but that they so
91 blatantly violated the basic defense-in-depth, or we'd have never read
92 about the screw-ups in the first place as they'd have not amounted to
93 anything if the proper layers of defense had been there... and for a
94 SECURITY firm, no less, to so utterly and completely miss it!) So
95 regardless of the fact that in theory I didn't actually need the firewall
96 by then since the only open ports were the ones I intended to be open, I
97 wasn't going to run direct-connected without /some/ sort of firewall, and
98 I learned and activated IPTables/netfilter before I did direct-connect.
99 And now that I have NAPT again, I still keep it running, as that's simply
100 another layer of that defense in depth, and I can use the NAPT router for
101 multiplexing several devices on a single IP, not its originally accidental
102 side-effect of inbound firewalling, tho again, I keep that too as it's
103 another layer of that defense in depth, I just don't /count/ on it.
104
105 >> Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
106 >> really do /not/ care about your data integrity or are going to the
107 >> extreme and already have data=journal, DEFINITELY specify data=ordered,
108 >> both in your mount options, and by setting the defaults via tune2fs.
109 >
110 > So does this turn off journaling? What's a good reference on the
111 > advantages of ext4 over ext3, or can you just summarize them for me?
112
113 No, this doesn't turn off journaling.
114
115 Briefly...
116
117 There's the actual data, the stuff in the files we care about, and
118 metadata, the stuff the filesystem tracks behind the scenes so we don't
119 have to worry about it. Metadata includes stuff like the filename, the
120 dates (create/modify/access, the latter of which isn't used that much any
121 more and is often disabled), permissions (both traditional *ix set*/user/
122 group/world and if active SELinux perms, etc), INODE AND DIRECTORY TABLES
123 (most important in this context, thus the CAPS, as without them, your data
124 is effectively reduced to semi-random binary sequences), etc.
125
126 It's the metadata, in particular, the inode and directory tables, that fsck
127 concerns itself with, that's potentially damaged in the event of a chaotic
128 shutdown, that fsck checks and tries to restore on remount after such a
129 shutdown, etc.
130
131 Because the original purpose of journaling was to shortcut the long fscks
132 after a chaotic shutdown, traditionally it concerns itself only with
133 metadata. In practice, however, due to reordered disk operations at both
134 the OS and disk hardware/firmware level, the result of a recovery with
135 strict meta-data-only journaling on a filesystem can be perfectly restored
136 filesystem metadata, but with incorrect real DATA in those files, because
137 the metadata was already written to disk but the data itself hadn't been,
138 at the time of the chaotic shutdown.
139
140 Due to important security implications (it's possible that the previous
141 contents of that inode was an unlinked but not secure-erased file
142 belonging to another user, UNAUTHORIZED DATA LEAK!!!), such restored
143 metadata-only files where the data itself is questionable, are normally
144 truncated to zero-length, thus the post-restore zero-length "empty" file
145 phenomenon common with early journaled filesystems and still occasionally
146 seen today.
147
148 The data= journaling option controls data/metadata handling.
149
150 data=writeback is "bare" metadata journaling. It's the fastest but
151 riskiest in terms of real data integrity for the reasons explained above.
152 As such, it's often used where performance matters more than strict data
153 integrity in the event of chaotic shutdown -- where data is backed up and
154 changes since the backup tend to be trivial and/or easy to recover, where
155 the data's easily redownloaded from the net (think the gentoo packages
156 tree, source tarballs, etc), and/or where the filesystem is wiped at boot
157 anyway (as /tmp is in many installations). Zeroed out files on recovery
158 can and do happen in writeback mode.
159
160 data=ordered is the middle ground, "good enough" for most people, both in
161 performance and in data integrity. The system ensures that the commit of
162 the real data itself is "ordered" before the metadata that indexes it,
163 telling the filesystem where it's located. This comes at a slight
164 performance cost as some write-order-optimization must be skipped, but it
165 GREATLY enhances the integrity of the data in the event of a chaotic
166 shutdown and subsequent recovery. There are corner-cases where it's still
167 possible at least in theory to get the wrong behavior, but in practice,
168 these don't happen very often, and when they do, the loss tends to be that
169 of reverting to the pre-update version of the file, losing only the
170 current edit, rather than zeroing out of the file (or worse yet, data
171 leakage) entirely.
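
The difference between writeback and ordered can be sketched with a deliberately tiny toy model. This is not how ext3/ext4 are implemented; the class, the single "block", and the single "inode size" are assumptions made purely to show why write order matters when a crash lands between the two steps.

```python
class ToyDisk:
    """Toy model: one data block and one inode that records the file size."""

    def __init__(self):
        self.block = b"OLD SECRET"  # stale contents left by a deleted file
        self.inode_size = 0         # the new file starts out empty

    def read_file(self):
        return self.block[: self.inode_size]


def write_file(disk, data, mode, crash_after_first_step=False):
    """Write data in 'ordered' or 'writeback' mode, optionally crashing midway."""

    def write_data():
        disk.block = data

    def commit_metadata():
        disk.inode_size = len(data)

    steps = [write_data, commit_metadata]  # ordered: data reaches disk first
    if mode == "writeback":
        steps.reverse()  # worst-case reordering: metadata reaches disk first
    steps[0]()
    if crash_after_first_step:
        return  # simulated chaotic shutdown between the two steps
    steps[1]()


if __name__ == "__main__":
    for mode in ("writeback", "ordered"):
        disk = ToyDisk()
        write_file(disk, b"hello", mode, crash_after_first_step=True)
        print(mode, disk.read_file())
```

In the writeback case the "file" comes back exposing stale bytes from the old block, which is the leak that truncating to zero length is meant to prevent. In the ordered case the file simply looks as it did before the write.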
172
173 data=journal is the paranoid option. With this you'll want a much larger
174 journal, because not only the metadata, but the data itself, is
175 journaled. (And here most people thought that's what journaling did /all/
176 the time!) Because ALL data is ultimately written TWICE in this mode,
177 first to the journal and then from there to its ultimate location, by
178 definition it's a factor of two slower, but provided the hardware is
179 working correctly, the worst-case in a chaotic shutdown is loss of the
180 current edit, reverting to the previous edition of the file.
181
182 FWIW and rather ironically, my original understanding of all this came
183 from a series of IBM DeveloperWorks articles written in the early kernel
184 2.4 series era, explaining the main filesystem choices, many of them then
185 new, available in kernel 2.4. While the performance data and some
186 filesystem implementation detail (plus lack of mention of ext4 and btrfs
187 as this was before their time) is now somewhat dated, the theory and
188 general filesystem descriptions remain solid, and as such, the series
189 remains a reasonably good intro to Linux filesystems to this day. As
190 such, parts of it are still available as linked from the Gentoo
191 Documentation archived copy of those IBM DeveloperWorks articles. In
192 particular, two parts covering ext3 and the data= options remain available:
193
194 http://www.gentoo.org/doc/en/articles/afig-ct-ext3-intro.xml
195 http://www.gentoo.org/doc/en/articles/l-afig-p8.xml
196
197 The ironic bit is who the author was, one Daniel Robbins, the same DRobbins
198 who founded the then Enoch Linux, now Gentoo. But I read them long before
199 I ever considered Gentoo, when I was first switching to Linux and using
200 Mandrake. It was thus with quite some amazement a number of years later,
201 after I'd been on Gentoo for awhile, that I discovered that the *SAME*
202 DRobbins who founded Gentoo (and was still active tho on his way out in
203 early 2004 when I started on Gentoo), was the guy who wrote the Advanced
204 Filesystem Implementor's Guide in IBM DeveloperWorks, the guide I'd found
205 so *INCREDIBLY* helpful years before, when I hadn't a /clue/ who he was or
206 what distribution I'd choose years later, as I was just starting with Mandrake
207 and trying to figure out what filesystems to choose.
208
209 As to the ext3/ext4 differences... AFAIK the (second) biggest one is that
210 ext4 uses extents by default, thus fragmenting files somewhat less over
211 time. (Extents are a subject worth their own post, which I won't attempt
212 as while I understand the basics I don't understand all the implications
213 thereof myself. But one effect is better efficiency in filesystem layout,
214 when the filesystem was created with them anyway... it won't help old
215 files on upgraded-to-ext4-from ext2/3 that much. Google's available for
216 more. =:^)
217
218 There's a lot of smaller improvements as well. ext4 is native large-
219 filesystem by default. A number of optimizations discovered since ext3
220 are implemented in ext4 that can't be in ext3 for stability and/or old-
221 kernel backward compatibility reasons. ext4 has a no-journal option
222 that's far better on flash-based thumb-drives, etc. There are a number of
223 options that can make it better on SSDs and flash in general than ext3.
224
225 And the biggest advantage is that ext4 is actively supported in the kernel
226 and supports ext2/3 as well, while ext2/3, as separate buildable kernel
227 options, are definitely considered legacy, with talk, as I believe I
228 mentioned, of removing them as separate implementations entirely, relying
229 on ext4's backward compatibility for ext2/3 support. In that regard, ext3
230 as a separate option is in worse shape than reiserfs, since it's clearly
231 legacy and targeted for removal. As part of ext4, support will
232 *DEFINITELY* continue for YEARS, more likely DECADES, so is in no danger
233 in that regard (more so than reiserfs support, which will continue to be
234 supported as well for at least years), but the focus is definitely on ext4
235 now, and as ext3 becomes more and more legacy, the chances of corner-case
236 bugs appearing in ext3-only code in the ext4 driver do logically
237 increase. In that regard, reiserfs could actually be argued to be in
238 better shape, since it's not implemented as a now out-of-focus older-
239 brother to a current filesystem, so while it has less focus in general, it
240 also has less chances of being accidentally affected by a change to the
241 current-focus code.
242
243 Which can be argued to have already happened with the default ext3
244 switching to data=writeback for a number of kernels, before being switched
245 back to the data=ordered it always had before. A number of kernels ago
246 (2.6.29 IIRC), ext4 was either officially just out of or being discussed
247 for bringing out of experimental. I believe it was Ubuntu that first made
248 it a rootfs system install option, in that same time period. Shortly
249 thereafter, a whole slew of Ubuntu on ext4 users, most of whom it turned
250 out later were using the closed nVidia driver, which was unstable in that
251 version against that Ubuntu version and kernel, thus provoking many cases
252 of "chaotic shutdown", a classic worst-case trial-by-fire test for the
253 then still coming out of experimental ext4, began experiencing the classic
254 "zeroed out file" problems on reboot after their chaotic shutdowns.
255
256 *Greatly* compounding the problem were some seriously ill-advised Gnome
257 config-file behaviors. Apparently, they were opening config-files for
258 read-write simply to READ them and get the config in the process of
259 initializing GNOME. Of course, the unstable nVidia driver was
260 initializing in parallel to all this, with the predictable-in-hindsight
261 results... As gnome was only READING the config values, it SHOULD have
262 opened those files READ-ONLY, if necessary later opening them read-write
263 to write new values to them. As with the security defense-in-depth
264 mentioned in the HBGary parenthetical above, this is pretty basic
265 filesystem principles, but the gnome folks had it wrong. They were opening
266 the files read/write when they only needed read, and the system was
267 crashing with them in that state. As a result, these files were open for
268 writing in the crash, and as is standard security practice as explained
269 above, the ext4 journaling system, defaulting to write-back mode, restored
270 them as zeroed out files to prevent any possibility of data leak.
271 Actually, there were a few other technicalities involved as well (file
272 renaming on write, failure to call fsync, due in part to ext3's historic
273 bad behavior on fsync, which it treated as whole-filesystem-sync, etc),
274 but that's the gist of things.
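
For illustration, the safer pattern alluded to here (read-only opens for reading, plus write-to-temp, fsync, then rename for updates) might look roughly like the following. This is a generic sketch, not GNOME's code or its eventual fix; the function names are made up for the example.

```python
import os
import tempfile


def read_config(path):
    """Read settings without ever opening the file for writing."""
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()


def write_config(path, text):
    """Replace the file atomically: temp file, fsync, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())   # push the new data to disk before renaming
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cfg = os.path.join(d, "example.conf")
        write_config(cfg, "theme=dark\n")
        print(read_config(cfg), end="")
```

The idea is that after a crash the path holds either the complete old file or the complete new one, and a file that was merely being read is never in a half-written state.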
275
276 So due to ext4's data=writeback and the immaturity of the filesystem such
277 that it didn't take additional precautions, these folks were getting
278 critical parts of their gnome config zeroed out every time they crashed,
279 and due to the unstable nVidia drivers, they were crashing frequently!!
280
281 *NOT* a good situation, and that's a classic understatement!!
282
283 The resulting investigation discovered not only the obvious gnome problem,
284 but several code tweaks that could be done to ext4 to reduce the
285 likelihood of this sort of situation in the future.
286
287 All fine and good, so far. But they quickly realized that the same sort
288 of code tweak issues existed with ext3, except that because ext3 defaulted
289 to data=ordered, only those specifically setting data=writeback were
290 having problems, and because those using data=writeback were expected to
291 have /some/ problems anyway, the issues had been attributed to that and
292 thus hadn't been fully investigated and fixed, all these years.
293
294 So they fixed the problems in ext3 as well. Again, all fine and good --
295 the problems NEEDED fixed. *BUT*, and here's where the controversy comes
296 in, they decided that data=writeback was now dependable enough for BOTH
297 ext3 and ext4, thus changing the default for ext3.
298
299 To say that was hugely controversial is an understatement (multiple
300 threads on LKML, LWN, elsewhere where the issue was covered at the time,
301 often several hundreds of posts long each), and my feelings on
302 data=writeback should be transparent by now so where I stand on the issue
303 should be equally transparent, but Linus never-the-less merged the commit
304 that switched ext3 to data=writeback by default, AFAIK in 2.6.31. (AFAIK,
305 they discovered the problem in 2.6.29, 2.6.30 contained temporary work-
306 around-fixes, 2.6.31 contained the permanent fixes and switched ext3 to
307 data=writeback.)
308
309 Here's the critical point. Because reiserfs isn't so closely related to
310 the ext* family, it retained the data=ordered default it had gotten years
311 earlier, in the same kernel where Chris Mason committed the code for reiserfs
312 to do data=ordered at all. ext3 got the change due to its relationship with
313 ext4, despite the fact that it's officially an old and stable filesystem
314 where arguably such major policy changes should not occur. If the separate
315 kernel option for ext3 is removed in order to remove the duplicate
316 functionality already included in ext4 for backward compatibility reasons,
317 by definition, this sort of change to ext4 *WILL* change the ext3 it also
318 supports, unless deliberate action is taken to avoid it. That makes such
319 issues far more likely to occur again in ext3, than in the relatively
320 obscure reiserfs.
321
322 Meanwhile, as mentioned, with newer kernels (2.6.36, 37, or 38, IDR which,
323 tho it won't matter for those specifying the data= option either via
324 filesystem defaults using tune2fs, or via specific mount option), ext3
325 reverted again to the older and safer default, data=ordered.
326
327 And as I said, it's my firm opinion that the data= option has a stronger
328 effect on filesystem stability than any possibly remaining issues with
329 ext4, which is really quite stable by now. Thus, ext3, ext4, or reiserfs,
330 I'd **STRONGLY** recommend data=ordered, regardless of whether it's the
331 default as it is with old and new (but with a gap) ext3 and reiserfs as it
332 has been for years, or not, as I believe ext4 still defaults to
333 data=writeback. If you value your data, "just do it!"
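
As a quick way to see whether data= is actually being set explicitly, a sketch along these lines could scan /proc/mounts or /etc/fstab style text. The helper name `data_modes` is an assumption for this example. A result of None only means no data= option was listed on that line, in which case the kernel or filesystem default applies, and some kernels may not list an option that matches the default.

```python
def data_modes(mounts_text, fstypes=("ext3", "ext4")):
    """Map mount point -> explicit data= mode, or None if not listed."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blanks, comments, short lines and other filesystem types.
        if len(fields) < 4 or fields[0].startswith("#") or fields[2] not in fstypes:
            continue
        options = fields[3].split(",")
        mode = next(
            (opt.split("=", 1)[1] for opt in options if opt.startswith("data=")),
            None,
        )
        result[fields[1]] = mode
    return result


if __name__ == "__main__":
    # Hypothetical sample in /proc/mounts layout.
    sample = (
        "/dev/sda1 / ext4 rw,noatime,data=ordered 0 0\n"
        "/dev/sda2 /home ext3 rw 0 0\n"
        "tmpfs /tmp tmpfs rw 0 0\n"
    )
    for mount_point, mode in data_modes(sample).items():
        print(mount_point, mode or "not specified")
```

For the filesystem-default half of the advice, tune2fs accepts the mode through its -o option (for example journal_data_ordered), applied to the device in question.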
334
335 Meanwhile, I believe the default on the definitely still experimental
336 btrfs is data=writeback too. While I plan on switching to it eventually,
337 you can be quite sure I'll be examining that default and as of this point,
338 have no intentions of letting it be data=writeback, when I do.
339
340 ....
341
342 > The problem with Gentoo was that because EVMS was an orphaned project, I
343 > believe the ebuild wasn't updated. The initrd file was specific for
344 > EVMS.
345
346 That's quite likely, indeed.
347
348 > Of course. I like technology that _lasts_! We have a clock in our
349 > house that's about 190 years old [...] turned me on to the Connecticut
350 > Clock and Watch museum, run by one George Bruno [who] also makes working
351 > replicas [and] was able to send me an exact replacement part! Try
352 > _THAT_ with your 1990's era computer ;-)
353
354 That reminds me... I skipped it as irrelevant to the topic at hand, but
355 due to kernel sensors and ACPI changes, I decided to try the last BIOS
356 upgrade available for this Tyan, after having run an earlier BIOS for some
357 years. Along about 2.6.27, I had to start using a special boot parameter
358 to keep the sensors working, as apparently the sensor address regions
359 overlap ACPI address regions (not an uncommon issue in boards of that era,
360 the kernel folks say). The comments on the kernel bug I filed suggested
361 that a BIOS update might straighten that out (it didn't, BIOS still too
362 old and board EOLed, even if it is still working), so I decided to try it.
363
364 The problem was that I had a bad memory stick. Now the kernel has
365 detectors for that and I had them active, but the kernel drivers for that
366 were introduced long after I got the hardware, and while it was logging an
367 issue with the memory, since it had been doing that since I activated the
368 kernel drivers for it, I misinterpreted that as simply how it worked, so
369 wasn't aware of the bad memory it was trying to tell me about.
370
371 So I booted to the FreeDOS floppy I used for BIOS upgrades (I've used
372 FreeDOS for BIOS upgrades for years, without incident before this) and
373 began the process.
374
375 It crashed half-way thru the flash-burn, apparently when it hit that bad
376 memory!!
377
378 Bad situation, but there's supposed to be a failsafe direct-read-recover
379 mode built-in, that probably would have worked had I known about it.
380 Unfortunately I didn't, and by the time I figured it out, I'd screwed that
381 up as well.
382
383 But luckily I have a netbook, that I had intended to put Gentoo on but had
384 never gotten around to at that point (tho it's running Gentoo now, 2.6.38
385 kernel, kde 4.6.1, fully updated as of mid-March). It was still running
386 the Linpus Linux it shipped with (first full system I've bought since my
387 original 486SX25 w/ 2MB memory and 130 MB hard drive in 1993, or so, and
388 I'd have sooner done without the netbook than pay the MS tax, I DID have
389 to order it from Canada and have it shipped to the US). I was able to get
390 online with that, grab a yahoo webmail account since my mail logins were
391 stuck on the main system without a BIOS, and use that to order a new BIOS
392 chip shipped to me, the target BIOS pre-installed.
393
394 That new BIOS chip rescued my system!
395
396 I suspect my feelings after that BIOS chip did the trick rather mirror
397 yours after that gear did the trick for your clock. The computer might
398 not be 190 years old, but 2003 is old enough in computer years, and I
399 suspect I have rather more of my life wound up in that computer than you
400 do in that clock, 190 years old or not.
401
402 Regardless, tho, you'll surely agree,
403
404 WHAT A RELIEF TO SEE IT RUNNING AGAIN! =:^)
405
406 --
407 Duncan - List replies preferred. No HTML msgs.
408 "Every nonfree program has a lord, a master --
409 and if you use the program, he is your master." Richard Stallman