Gentoo Archives: gentoo-amd64

From: Richard Freeman <rich0@g.o>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Re: Disable fsck on boot, was: How to make watchdog start earlier during bootup?
Date: Fri, 23 Jan 2009 16:53:27
Message-Id: 4979F593.4010108@gentoo.org
In Reply to: [gentoo-amd64] Re: Disable fsck on boot, was: How to make watchdog start earlier during bootup? by Duncan <1i5t5.duncan@cox.net>
Duncan wrote:
>
> I'd blame that on your choice of RAID (and ultimately on the defective
> hardware, but it wouldn't have been as bad on RAID-1 or RAID-6), more
> than on what was running on top of it.

Agree - RAID-6 would have helped in this particular circumstance
(assuming I didn't lose more than one drive). The non-server hardware
was still a big issue. I'm not sure I'd ever go with RAID-6 for
personal use - that's a lot of money in drives that add no usable
capacity.

> What I'd guess happened is that the dirty/degraded crash happened while
> the set of stripes that also had the LVM2 record was being written, altho
> it wasn't necessarily the LVM data itself being written, but just
> something that happened to be in the same stripe set so the checksum
> covering it had to be rewritten as well. It's also possible the hardware
> error you mentioned was affecting the reliability of what the spindle
> returned even when it didn't cause resets. In that case, even if the
> data was on a different stripe, the resulting checksum written could end
> up invalid, thus playing havoc with a recovery.

Sounds likely. I think the LVM2 metadata got corrupted. I'm a big fan
of ZFS and btrfs (once they're production-ready) precisely because
they try to address the RAID stripe problem with copy-on-write right
down to the physical level.

> data=ordered is the middle ground and I believe what ext3 has always
> defaulted to, and what reiserfs has defaulted to for years.

Yup - using ordered data. From a metadata-integrity standpoint, I
believe this has been shown to be equivalent to data=journal. As you
point out, once LVM was hosed that didn't help much.
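
For reference, a minimal sketch of how that mode is chosen in
/etc/fstab (the device and mount point here are placeholders, not my
actual layout):

  # data= selects the ext3 journaling mode; ordered is the default
  /dev/sda5   /home   ext3   noatime,data=ordered   0 2
  # full data journaling - journals file contents too, at some cost:
  # /dev/sda5  /home   ext3   noatime,data=journal  0 2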

> Lucky, or more appropriately wise you! There aren't so many folks that
> backup to normally offline external device that regularly. Honestly, I
> don't.

Yeah - I've learned that lesson over time the hard way. I can't back
up everything (at least not without a big investment), but I do use
dar and par2 to back up everything important. I just create a dar
backup weekly, and then run a script on a laptop to copy the data
offline. I don't back up anything that requires snapshots (I use a
cron job to do a mysql export separately and back that up), so that
works fine for me. This is really just my high-value data - when my
system was hosed I had to reinstall from stage3, but I had all my /etc
config files, so getting up and running didn't take a huge amount of
effort. However, I did learn the hard way that some programs store
their actual config files in /var and symlink them into /etc - be sure
to catch those in your backups! I ended up having my Samba domain
controller SID change, which was a headache since none of my usernames
kept their old permissions on my XP workstations. Granted, this is a
house with all of four users, which helped with the cleanup.
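
To make the shape of that job concrete, here's a minimal sketch in
Python (the paths, the target list, and the 10% par2 redundancy are
invented examples, not my actual script):

  #!/usr/bin/env python3
  # Weekly backup sketch: mysqldump first, then a dar archive, then
  # par2 recovery data so the offline copy can survive bit rot.
  import datetime
  import subprocess

  stamp = datetime.date.today().strftime("%Y%m%d")
  base = "/backups/weekly-" + stamp  # hypothetical destination

  # Dump MySQL separately - a live database can't be archived
  # consistently by a plain file-level backup.
  with open("/backups/mysql-" + stamp + ".sql", "wb") as out:
      subprocess.run(["mysqldump", "--all-databases"],
                     stdout=out, check=True)

  # Archive the high-value trees; dar names the first slice
  # base.1.dar.
  subprocess.run(["dar", "-c", base, "-R", "/",
                  "-g", "etc", "-g", "home", "-z"], check=True)

  # Add ~10% recovery data so damage to the archive is repairable.
  subprocess.run(["par2", "create", "-r10",
                  base + ".par2", base + ".1.dar"], check=True)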

> So... I guess that's something else I can add to my list now, for the
> next time I setup a new disk set or whatever. To the everything-portage-
> touches-on-root that I explained in the other replies, and the RAID-6
> that I had already chosen over RAID-5, I can now add to the list killing
> the LVM2 used in my current setup.
>

If you have RAID-6, I'm not sure it is worth worrying about getting
rid of LVM2. At least, assuming you don't start having multiple-drive
failures (a real possibility with desktop hardware, with all the
drives sharing the same power cables, interfaces, etc.).

If you want to think really long-term, take a look at btrfs. It looks
like it aims to be everything that ZFS is (minus the GPL-incompatible
license). It is definitely not ready for prime time, but the proposed
feature set looks better than ZFS's. I don't like the inability to
reshape ZFS - you can add more arrays to your system, but you can't
add one drive to an existing array (online or offline). Btrfs aims to
be able to do this. Again, it is completely experimental at this
point - don't use it except to try it out. It will be possible to
migrate ext3/4 directly in-place to btrfs, and even to reverse the
migration (minus any changes - it essentially snapshots the existing
data). The only limitation is that if you delete files, you won't get
the space back until you give up the ability to migrate back to ext3
(since the old data is held by that snapshot).
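
For the curious, the conversion is driven by a tool shipped with
btrfs-progs; roughly like this (the device name is a placeholder, and
given how experimental btrfs is, only try it on a scratch disk):

  # Convert an unmounted ext3 filesystem in place; the original
  # filesystem is kept as a snapshot, which is what makes the step
  # reversible.
  btrfs-convert /dev/sdXN

  # Roll back to the original ext3, discarding anything written
  # while the volume was btrfs.
  btrfs-convert -r /dev/sdXN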
