Lindsay Haisley posted on Sat, 26 Mar 2011 10:57:33 -0500 as excerpted:

> Yep, I know where you're coming from there. Iptables isn't all that
> hard to understand, and I've become pretty conversant with it in the
> process of using for my own and others' systems. I'd always rather deal
> with the "under the hood" CLI tools than with some GUI tool that does
> little more than obfuscate the real issue. That way lies Windows!

Indeed, the MSWindows way is the GUI way. But I wasn't even thinking
about that. I was thinking about the so-called "easier" firewalling CLI/
text-editing tools that have you initially answer a number of questions to
set up the basics, then have you edit files to do any "advanced" tweaking
the questions didn't have the foresight to cover.

But my (first) problem was that while I could answer the questions easily
enough, I lacked sufficient understanding of the real implementation to
properly do the advanced editing. And if I were to properly dig into
that, I might as well have mastered the IPTables/Netfilter stuff on which
it was ultimately based in the first place.

The other problem, when building your own kernel, was that the so-called
simpler tools apparently expect all the necessary Netfilter/IPTable kernel
options to be available as pre-built modules (or built-in) -- IOW, they're
designed for the binary distributions where that's the case. Neither the
questions nor the underlying config file comments mentioned their kernel
module dependencies. One either had to pre-build them all and hope they
either got auto-loaded as needed, or delve into the scripts to figure out
the dependencies and build/load the required modules.

Now keep in mind that I first tried this on Mandrake, where I was building
my own kernel within 90 days of first undertaking the switch, while I was
still booting to MS to do mail and news in MSOE, because I hadn't yet had
time to look at user level apps well enough to make my choices and set
them up. So it's certainly NOT just a Gentoo thing. It's a build-your-
own-kernel thing, regardless of the distro.

The problem ultimately boiled down to having to understand IPTables itself
well enough to know what kernel options to enable, either built-in or as
modules which would then need to be loaded. But if I were to do that, why
would I need the so-called "easier" tool, which only complicated things?
Honestly, the tools made me feel like I was trying to remote-operate some
NASA probe from half-way-across-the-solar-system, latency and all, instead
of using the direct-drive, since what I was operating on was actually
right there next to me!

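To make the dependency problem above a bit more concrete, here's a minimal
Python sketch that scans the text of a kernel .config and reports whether a
few Netfilter options are built in, modular, or missing. The option list is
only my assumption of a bare-minimum set for a simple stateful IPv4 filter
(real rulesets usually need more), and the function name netfilter_status
is made up for illustration.

```python
import re

# Assumed minimal set for a basic stateful IPv4 filter; real rulesets
# may need more (NAT, logging, extra match modules, ...).
WANTED = [
    "CONFIG_NETFILTER",
    "CONFIG_NF_CONNTRACK",
    "CONFIG_IP_NF_IPTABLES",
    "CONFIG_IP_NF_FILTER",
]

def netfilter_status(config_text, wanted=WANTED):
    """Map each wanted option to 'built-in', 'module', or 'missing'."""
    found = {}
    for line in config_text.splitlines():
        # Enabled options look like CONFIG_FOO=y or CONFIG_FOO=m;
        # disabled ones appear as "# CONFIG_FOO is not set" and are skipped.
        m = re.match(r"^(CONFIG_\w+)=([ym])$", line.strip())
        if m:
            found[m.group(1)] = "built-in" if m.group(2) == "y" else "module"
    return {opt: found.get(opt, "missing") for opt in wanted}

if __name__ == "__main__":
    sample = (
        "CONFIG_NETFILTER=y\n"
        "CONFIG_NF_CONNTRACK=m\n"
        "# CONFIG_IP_NF_IPTABLES is not set\n"
    )
    for opt, state in netfilter_status(sample).items():
        print(f"{opt}: {state}")
```

On a real system the text would typically come from the .config in the
kernel source tree. Anything reported as "module" still has to get loaded
somehow, which is exactly the part the "easier" tools left undocumented.
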
At that time I simply punted. I had (or could have and did have, by
(wise) choice on MS) a NAPT based router between me and the net anyway,
and already knew how to configure /it/. So I just kept it and ran the
computer itself without a firewall for a number of years. Several years
later, after switching to Gentoo, when I was quite comfortable on Linux in
general, I /did/ actually learn netfilter/iptables, configure my computer
firewall accordingly, and direct-connect for a year or two -- until my
local config changed and I actually had the need for a NAPT device as I
had multiple local devices to connect to the net.

Which brings up a nice point about Gentoo. With Mandrake (and most other
distributions of the era, from what I read), there were enough ports open
by default that having a firewall of /some/ sort, either on-lan NAPT
device or well configured on-computer IPChains/IPTables based, was wise.
IOW, keeping that NAPT device was a good choice, even if it /was/ an MS-
based view of things, because the Linux distros of the time still ran with
various open ports (whether they still do or not I don't know, I suspect
most do, tho they probably do it with an IPTables firewall up now too).

Gentoo's policy by contrast has always (well, since before early 2004,
when I switched to it) been:

1) Just because it's installed does NOT mean it should have its initscript
activated so it runs automatically in the default runlevel -- Gentoo ships
by default with the initscripts for net-active services in /etc/init.d,
but does NOT automatically add them to the default runlevel.

2) Even when a net-active service IS activated, Gentoo's default
configuration normally has it active on the loopback localhost address
only.

3) Gentoo ships X itself with TCP listening disabled, only the local Unix
domain socket active.

As such, by the time I actually got around to learning IPTables/netfilter
and setting it up on my Gentoo box, it really wasn't as necessary as it
would be on other distributions, anyway, because firewall or no firewall,
the only open ports were ports I had deliberately opened myself and thus
already knew about.

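For anyone wanting to check the "only ports I opened myself" claim on their
own box, here's a rough Python sketch that parses text in the format of
/proc/net/tcp and lists the IPv4 sockets in LISTEN state that are NOT bound
to loopback. The function name is hypothetical, it assumes a little-endian
host (where the kernel prints the address byte-swapped), and it ignores
IPv6, which lives in /proc/net/tcp6.

```python
import socket
import struct

LISTEN = "0A"  # TCP state code for LISTEN in /proc/net/tcp

def non_loopback_listeners(proc_net_tcp_text):
    """Return (address, port) pairs listening on non-loopback IPv4 addresses."""
    result = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4 or fields[3] != LISTEN:
            continue
        hex_ip, hex_port = fields[1].split(":")
        # Assumes a little-endian host, where the address is printed byte-swapped.
        ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
        if not ip.startswith("127."):
            result.append((ip, int(hex_port, 16)))
    return result

if __name__ == "__main__":
    with open("/proc/net/tcp", encoding="ascii") as f:
        for ip, port in non_loopback_listeners(f.read()):
            print(f"listening on {ip}:{port}")
```

An empty result means nothing is listening on IPv4 TCP beyond loopback,
which is the state items 1 and 2 above are meant to produce by default.
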
But of course defense in depth is a VERY important security principle,
correlating as it does with the parallel "never trust yourself not to fat-
finger SOMETHING!" (Now, if the so-called security services HBGary, et
al., only practiced it! ... I think that's what galled most of the world
most, not that they screwed up a couple things so badly, but that they so
blatantly violated the basic defense-in-depth, or we'd have never read
about the screw-ups in the first place as they'd have not amounted to
anything if the proper layers of defense had been there... and for a
SECURITY firm, no less, to so utterly and completely miss it!) So
regardless of the fact that in theory I didn't actually need the firewall
by then since the only open ports were the ones I intended to be open, I
wasn't going to run direct-connected without /some/ sort of firewall, and
I learned and activated IPTables/netfilter before I did direct-connect.
And now that I have NAPT again, I still keep it running, as that's simply
another layer of that defense in depth, and I can use the NAPT router for
multiplexing several devices on a single IP, not its originally accidental
side-effect of inbound firewalling, tho again, I keep that too as it's
another layer of that defense in depth, I just don't /count/ on it.

>> Bottom line, yeah I believe ext4 is safe, but ext3 or ext4, unless you
>> really do /not/ care about your data integrity or are going to the
>> extreme and already have data=journal, DEFINITELY specify data=ordered,
>> both in your mount options, and by setting the defaults via tune2fs.
>
> So does this turn off journaling? What's a good reference on the
> advantages of ext4 over ext3, or can you just summarize them for me?

No, this doesn't turn off journaling.

Briefly...

There's the actual data, the stuff in the files we care about, and
metadata, the stuff the filesystem tracks behind the scenes so we don't
have to worry about it. Metadata includes stuff like the filename, the
dates (create/modify/access, the latter of which isn't used that much any
more and is often disabled), permissions (both traditional *ix set*/user/
group/world and if active SELinux perms, etc), INODE AND DIRECTORY TABLES
(most important in this context, thus the CAPS, as without them, your data
is effectively reduced to semi-random binary sequences), etc.

It's the metadata, in particular, the inode and directory tables, that fsck
concerns itself with, that's potentially damaged in the event of a chaotic
shutdown, that fsck checks and tries to restore on remount after such a
shutdown, etc.

Because the original purpose of journaling was to shortcut the long fscks
after a chaotic shutdown, traditionally it concerns itself only with
metadata. In practice, however, due to reordered disk operations at both
the OS and disk hardware/firmware level, the result of a recovery with
strict meta-data-only journaling on a filesystem can be perfectly restored
filesystem metadata, but with incorrect real DATA in those files, because
the metadata was already written to disk but the data itself hadn't been,
at the time of the chaotic shutdown.

Due to important security implications (it's possible that the previous
contents of that inode were an unlinked but not secure-erased file
belonging to another user, UNAUTHORIZED DATA LEAK!!!), such restored
metadata-only files where the data itself is questionable, are normally
truncated to zero-length, thus the post-restore zero-length "empty" file
phenomenon common with early journaled filesystems and still occasionally
seen today.

The data= journaling option controls data/metadata handling.

data=writeback is "bare" metadata journaling. It's the fastest but
riskiest in terms of real data integrity for the reasons explained above.
As such, it's often used where performance matters more than strict data
integrity in the event of chaotic shutdown -- where data is backed up and
changes since the backup tend to be trivial and/or easy to recover, where
the data's easily redownloaded from the net (think the gentoo packages
tree, source tarballs, etc), and/or where the filesystem is wiped at boot
anyway (as /tmp is in many installations). Zeroed out files on recovery
can and do happen in writeback mode.

data=ordered is the middle ground, "good enough" for most people, both in
performance and in data integrity. The system ensures that the commit of
the real data itself is "ordered" before the metadata that indexes it,
telling the filesystem where it's located. This comes at a slight
performance cost as some write-order-optimization must be skipped, but it
GREATLY enhances the integrity of the data in the event of a chaotic
shutdown and subsequent recovery. There are corner-cases where it's still
possible at least in theory to get the wrong behavior, but in practice,
these don't happen very often, and when they do, the loss tends to be that
of reverting to the pre-update version of the file, losing only the
current edit, rather than zeroing out of the file (or worse yet, data
leakage) entirely.

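The difference between writeback and ordered can be shown with a toy model.
This Python sketch is NOT how ext3/ext4 actually work internally; it just
assumes an update consists of one data write and one metadata write,
enumerates every point at which a crash could hit, and reports which file
states could be left behind. The function name and the state labels are
invented for the illustration.

```python
from itertools import permutations

def crash_outcomes(mode):
    """Enumerate file states visible after a crash, in a toy model.

    One update consists of two writes: 'data' (file contents) and 'meta'
    (the metadata that points at them). In 'writeback' mode the two may
    reach the disk in either order; in 'ordered' mode data always goes first.
    """
    if mode == "ordered":
        orders = [("data", "meta")]
    elif mode == "writeback":
        orders = list(permutations(("data", "meta")))
    else:
        raise ValueError(f"unknown mode: {mode}")

    outcomes = set()
    for order in orders:
        for crash_after in range(len(order) + 1):  # crash after 0, 1 or 2 writes
            done = set(order[:crash_after])
            if "meta" in done and "data" not in done:
                outcomes.add("metadata points at unwritten data")
            elif "meta" in done:
                outcomes.add("new version")
            else:
                outcomes.add("old version")
    return outcomes

if __name__ == "__main__":
    for mode in ("writeback", "ordered"):
        print(mode, sorted(crash_outcomes(mode)))
```

In this model only writeback can end up with metadata pointing at data that
never hit the disk, which is the case that gets truncated to zero length
(or leaks someone else's old data). Ordered can only lose the current edit.
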
data=journal is the paranoid option. With this you'll want a much larger
journal, because not only the metadata, but the data itself, is
journaled. (And here most people thought that's what journaling did /all/
the time!) Because ALL data is ultimately written TWICE in this mode,
first to the journal and then from there to its ultimate location, by
definition it's a factor of two slower, but provided the hardware is
working correctly, the worst-case in a chaotic shutdown is loss of the
current edit, reverting to the previous edition of the file.

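To see which of the three modes a mounted ext3/ext4 filesystem was
explicitly given, something like the following Python sketch can parse text
in the format of /proc/mounts and pull out the data= option. The function
name is made up, and note the caveat in the docstring: when no data= option
is listed, the sketch can't tell which default is in effect, since that
depends on the kernel and on any defaults stored in the superblock.

```python
def journal_data_modes(mounts_text):
    """Return {mount_point: data mode or None} for ext3/ext4 entries.

    None means no data= option appears in the listed options; the
    filesystem's built-in or tune2fs-set default applies in that case.
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("ext3", "ext4"):
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes

if __name__ == "__main__":
    with open("/proc/mounts", encoding="utf-8") as f:
        for mount_point, mode in journal_data_modes(f.read()).items():
            print(f"{mount_point}: data={mode or '(not listed)'}")
```

Setting the mode explicitly, as recommended above, means a data=ordered
entry in the fstab mount options plus the stored default via tune2fs (its
-o option takes journal_data_ordered, among others; see the man page).
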
FWIW and rather ironically, my original understanding of all this came
from a series of IBM DeveloperWorks articles written in the early kernel
2.4 series era, explaining the main filesystem choices, many of them then
new, available in kernel 2.4. While the performance data and some
filesystem implementation detail (plus lack of mention of ext4 and btrfs
as this was before their time) is now somewhat dated, the theory and
general filesystem descriptions remain solid, and as such, the series
remains a reasonably good intro to Linux filesystems to this day. Parts
of it are still available in the Gentoo Documentation archive of those
IBM DeveloperWorks articles. In particular, two parts covering ext3 and
the data= options remain available:

http://www.gentoo.org/doc/en/articles/afig-ct-ext3-intro.xml
http://www.gentoo.org/doc/en/articles/l-afig-p8.xml

The ironic bit is who the author was, one Daniel Robbins, the same DRobbins
who founded the then Enoch Linux, now Gentoo. But I read them long before
I ever considered Gentoo, when I was first switching to Linux and using
Mandrake. It was thus with quite some amazement a number of years later,
after I'd been on Gentoo for a while, that I discovered that the *SAME*
DRobbins who founded Gentoo (and was still active tho on his way out in
early 2004 when I started on Gentoo), was the guy who wrote the Advanced
Filesystem Implementor's Guide in IBM DeveloperWorks, the guide I'd found
so *INCREDIBLY* helpful years before, when I hadn't a /clue/ who he was or
what distribution I'd choose years later, as I was just starting with
Mandrake and trying to figure out what filesystems to choose.

As to the ext3/ext4 differences... AFAIK the (second) biggest one is that
ext4 uses extents by default, thus fragmenting files somewhat less over
time. (Extents are a subject worth their own post, which I won't attempt
as while I understand the basics I don't understand all the implications
thereof myself. But one effect is better efficiency in filesystem layout,
when the filesystem was created with them anyway... it won't help old
files on upgraded-to-ext4-from ext2/3 that much. Google's available for
more. =:^)

There are a lot of smaller improvements as well. ext4 is native large-
filesystem by default. A number of optimizations discovered since ext3
are implemented in ext4 that can't be in ext3 for stability and/or old-
kernel backward compatibility reasons. ext4 has a no-journal option
that's far better on flash-based thumb-drives, etc. There are a number of
options that can make it better on SSDs and flash in general than ext3.

And the biggest advantage is that ext4 is actively supported in the kernel
and supports ext2/3 as well, while ext2/3, as separate buildable kernel
options, are definitely considered legacy, with talk, as I believe I
mentioned, of removing them as separate implementations entirely, relying
on ext4's backward compatibility for ext2/3 support. In that regard, ext3
as a separate option is in worse shape than reiserfs, since it's clearly
legacy and targeted for removal. As part of ext4, support will
*DEFINITELY* continue for YEARS, more likely DECADES, so is in no danger
in that regard (more so than reiserfs support, which will continue to be
supported as well for at least years), but the focus is definitely on ext4
now, and as ext3 becomes more and more legacy, the chances of corner-case
bugs appearing in ext3-only code in the ext4 driver do logically
increase. In that regard, reiserfs could actually be argued to be in
better shape, since it's not implemented as a now out-of-focus older-
brother to a current filesystem, so while it has less focus in general, it
also has less chance of being accidentally affected by a change to the
current-focus code.

Which can be argued to have already happened with the default ext3
switching to data=writeback for a number of kernels, before being switched
back to the data=ordered it always had before. A number of kernels ago
(2.6.29 IIRC), ext4 was either officially just out of or being discussed
for bringing out of experimental. I believe it was Ubuntu that first made
it a rootfs system install option, in that same time period. Shortly
thereafter, a whole slew of Ubuntu on ext4 users, most of whom it turned
out later were using the closed nVidia driver, which was unstable in that
version against that Ubuntu version and kernel, thus provoking many cases
of "chaotic shutdown", a classic worst-case trial-by-fire test for the
then still coming out of experimental ext4, began experiencing the classic
"zeroed out file" problems on reboot after their chaotic shutdowns.

*Greatly* compounding the problem were some seriously ill-advised Gnome
config-file behaviors. Apparently, they were opening config-files for
read-write simply to READ them and get the config in the process of
initializing GNOME. Of course, the unstable nVidia driver was
initializing in parallel to all this, with the predictable-in-hindsight
results... As gnome was only READING the config values, it SHOULD have
opened those files READ-ONLY, if necessary later opening them read-write
to write new values to them. As with the security defense-in-depth
mentioned in the HBGary parenthetical above, this is pretty basic
filesystem principles, but the gnome folks had it wrong. They were opening
the files read/write when they only needed read, and the system was
crashing with them in that state. As a result, these files were open for
writing in the crash, and as is standard security practice as explained
above, the ext4 journaling system, defaulting to write-back mode, restored
them as zeroed out files to prevent any possibility of data leak.
Actually, there were a few other technicalities involved as well (file
renaming on write, failure to call fsync, due in part to ext3's historic
bad behavior on fsync, which it treated as whole-filesystem-sync, etc),
but that's the gist of things.

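For illustration, the generally recommended pattern looks something like
this Python sketch: open read-only when only reading, and when writing,
write a temporary file in the same directory, fsync it, and only then
rename it over the original. This is a sketch of the general technique, not
what GNOME did or does now, and the function names are hypothetical.

```python
import os
import tempfile

def read_config(path):
    """Open read-only: reading a config never needs write access."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

def write_config(path, text):
    """Write to a temp file, fsync it, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory, so the rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # force the data to disk before the rename
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this, a crash leaves either the complete old file or the complete new
one under the real name, never a half-written or zero-length one. Note
that mkstemp creates the file with owner-only permissions, so a real
implementation would likely want to copy the original's mode as well.
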
So due to ext4's data=writeback and the immaturity of the filesystem such
that it didn't take additional precautions, these folks were getting
critical parts of their gnome config zeroed out every time they crashed,
and due to the unstable nVidia drivers, they were crashing frequently!!

*NOT* a good situation, and that's a classic understatement!!

The resulting investigation discovered not only the obvious gnome problem,
but several code tweaks that could be done to ext4 to reduce the
likelihood of this sort of situation in the future.

All fine and good, so far. But they quickly realized that the same sort
of code tweak issues existed with ext3, except that because ext3 defaulted
to data=ordered, only those specifically setting data=writeback were
having problems, and because those using data=writeback were expected to
have /some/ problems anyway, the issues had been attributed to that and
thus hadn't been fully investigated and fixed, all these years.

So they fixed the problems in ext3 as well. Again, all fine and good --
the problems NEEDED fixing. *BUT*, and here's where the controversy comes
in, they decided that data=writeback was now dependable enough for BOTH
ext3 and ext4, thus changing the default for ext3.

To say that was hugely controversial is an understatement (multiple
threads on LKML, LWN, elsewhere where the issue was covered at the time,
often several hundreds of posts long each), and my feelings on
data=writeback should be transparent by now so where I stand on the issue
should be equally transparent, but Linus never-the-less merged the commit
that switched ext3 to data=writeback by default, AFAIK in 2.6.31. (AFAIK,
they discovered the problem in 2.6.29, 2.6.30 contained temporary work-
around-fixes, 2.6.31 contained the permanent fixes and switched ext3 to
data=writeback.)

Here's the critical point. Because reiserfs isn't so closely related to
the ext* family, it retained the data=ordered default it had gotten years
earlier, in the same kernel in which Chris Mason committed the code for
reiserfs to do data=ordered at all. ext3 got the change due to its
relationship with ext4, despite the fact that it's officially an old and
stable filesystem where arguably such major policy changes should not
occur. If the separate kernel option for ext3 is removed in order to
remove the duplicate functionality already included in ext4 for backward
compatibility reasons, by definition, this sort of change to ext4 *WILL*
change the ext3 it also supports, unless deliberate action is taken to
avoid it. That makes such issues far more likely to occur again in ext3,
than in the relatively obscure reiserfs.

Meanwhile, as mentioned, with newer kernels (2.6.36, 37, or 38, IDR which,
tho it won't matter for those specifying the data= option either via
filesystem defaults using tune2fs, or via specific mount option), ext3
reverted again to the older and safer default, data=ordered.

And as I said, it's my firm opinion that the data= option has a stronger
effect on filesystem stability than any possibly remaining issues with
ext4, which is really quite stable by now. Thus, ext3, ext4, or reiserfs,
I'd **STRONGLY** recommend data=ordered, regardless of whether it's the
default as it is with old and new (but with a gap) ext3 and reiserfs as it
has been for years, or not, as I believe ext4 still defaults to
data=writeback. If you value your data, "just do it!"

Meanwhile, I believe the default on the definitely still experimental
btrfs is data=writeback too. While I plan on switching to it eventually,
you can be quite sure I'll be examining that default and as of this point,
have no intentions of letting it be data=writeback, when I do.

....

> The problem with Gentoo was that because EVMS was an orphaned project, I
> believe the ebuild wasn't updated. The initrd file was specific for
> EVMS.

That's quite likely, indeed.

> Of course. I like technology that _lasts_! We have a clock in our
> house that's about 190 years old [...] turned me on to the Connecticut
> Clock and Watch museum, run by one George Bruno [who] also makes working
> replicas [and] was able to send me an exact replacement part! Try
> _THAT_ with your 1990's era computer ;-)

That reminds me... I skipped it as irrelevant to the topic at hand, but
due to kernel sensors and ACPI changes, I decided to try the last BIOS
upgrade available for this Tyan, after having run an earlier BIOS for some
years. Along about 2.6.27, I had to start using a special boot parameter
to keep the sensors working, as apparently the sensor address regions
overlap ACPI address regions (not an uncommon issue in boards of that era,
the kernel folks say). The comments on the kernel bug I filed suggested
that a BIOS update might straighten that out (it didn't, BIOS still too
old and board EOLed, even if it is still working), so I decided to try it.

The problem was that I had a bad memory stick. Now the kernel has
detectors for that and I had them active, but the kernel drivers for that
were introduced long after I got the hardware, and while it was logging an
issue with the memory, since it had been doing that since I activated the
kernel drivers for it, I misinterpreted that as simply how it worked, so
wasn't aware of the bad memory it was trying to tell me about.

So I booted to the FreeDOS floppy I used for BIOS upgrades (I've used
FreeDOS for BIOS upgrades for years, without incident before this) and
began the process.

It crashed half-way thru the flash-burn, apparently when it hit that bad
memory!!

Bad situation, but there's supposed to be a failsafe direct-read-recover
mode built-in, that probably would have worked had I known about it.
Unfortunately I didn't, and by the time I figured it out, I'd screwed that
up as well.

But luckily I have a netbook, that I had intended to put Gentoo on but had
never gotten around to at that point (tho it's running Gentoo now, 2.6.38
kernel, kde 4.6.1, fully updated as of mid-March). It was still running
the Linpus Linux it shipped with (first full system I've bought since my
original 486SX25 w/ 2MB memory and 130 MB hard drive in 1993, or so, and
I'd have sooner done without the netbook than pay the MS tax, I DID have
to order it from Canada and have it shipped to the US). I was able to get
online with that, grab a yahoo webmail account since my mail logins were
stuck on the main system without a BIOS, and use that to order a new BIOS
chip shipped to me, the target BIOS pre-installed.

That new BIOS chip rescued my system!

I suspect my feelings after that BIOS chip did the trick rather mirror
yours after that gear did the trick for your clock. The computer might
not be 190 years old, but 2003 is old enough in computer years, and I
suspect I have rather more of my life wound up in that computer than you
do in that clock, 190 years old or not.

Regardless, tho, you'll surely agree,

WHAT A RELIEF TO SEE IT RUNNING AGAIN! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman