Gentoo Archives: gentoo-amd64

From: Richard Freeman <rich@××××××××××××××.net>
To: gentoo-amd64@l.g.o
Subject: Re: [gentoo-amd64] Re: Disable fsck on boot, was: How to make watchdog start earlier during bootup?
Date: Thu, 22 Jan 2009 20:56:08
Message-Id: 4978DD5D.4040609@thefreemanclan.net
In Reply to: Re: [gentoo-amd64] Re: Disable fsck on boot, was: How to make watchdog start earlier during bootup? by Beso
1 Beso wrote:
2 > well, i think that the lvm2 layer is still good even when used on a
3 > single disk. especially when
4 > you don't know how the partitions would look like. i've had big time
5 > saves by resizing lvm2
6 > array than copying, removing partitions, recreating them and then
7 > recopying files into
8 > the newer ones.
9
10 I tend to agree, but once bitten twice shy. :(
11
12 Some details for the curious:
13
14 I was running lvm2 on top of several raid-5 devices (that is, the raid-5
15 devices were the lvm2 physical volumes). I created the logical volumes
16 on particular pvs to try to optimize disk seeking, so generally speaking
17 particular partitions resided on only one set of disks. However, some
18 partitions did cross both arrays. (When creating lvs you can tell lvm2
19 to try to put them on a particular pv, or you can use pvmove to move
20 particular lvs I believe).
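
As a rough sketch of the two placement options mentioned above, the snippet below only builds the command lines and prints them; it does not run anything. The volume group name `vg0`, the LV name `mythtv`, the size, and the devices `/dev/md0` and `/dev/md1` are hypothetical placeholders, not details from the setup described here. It assumes that `lvcreate` accepts trailing physical-volume arguments to restrict allocation and that `pvmove -n` limits a move to one LV's extents; check the man pages for your lvm2 version before relying on either.

```python
import shlex


def lvcreate_on_pv(vg, lv, size, pv):
    """Build an lvcreate command that allocates a new LV from one PV."""
    # PV paths listed after the VG name restrict where extents may be allocated.
    return ["lvcreate", "-L", size, "-n", lv, vg, pv]


def pvmove_lv(lv, src_pv, dst_pv):
    """Build a pvmove command that relocates only the named LV's extents."""
    # -n restricts the move to extents that belong to the given LV.
    return ["pvmove", "-n", lv, src_pv, dst_pv]


if __name__ == "__main__":
    # Hypothetical names throughout; running the printed commands needs root.
    print(shlex.join(lvcreate_on_pv("vg0", "mythtv", "200G", "/dev/md0")))
    print(shlex.join(pvmove_lv("mythtv", "/dev/md0", "/dev/md1")))
```

Printing rather than executing keeps the sketch safe to try on a machine that has no LVM setup at all.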
21
22 I was running ext3 on my lvs (and swap).
23
24 The problem was that I was having some kind of glitch that was causing
25 my computer to reset (I traced it to one of my drives), and when it
26 happened the array would sometimes come up with one of the drives
27 missing. If the glitch happened again while the array was degraded it
28 could cause data loss (no worse than not having RAID at all).
29
30 When I finally got the bad drive replaced (which generally fixed the
31 resets), I rebuilt my arrays. At that point mdadm was happy with the
32 state of affairs, but fsck was showing loads of errors on some of my
33 filesystems. When I went ahead and let fsck do its job, I immediately
34 started noticing corrupt files all over the place. The majority of the
35 data volume was mpg files from mythtv and I'd find hour-long TV episodes
36 where one minute of some other show would get spliced in. It seemed
37 obvious that files were somehow getting cross-linked (I'm not intimately
38 familiar with ext3, but I could see how this could happen in FAT). Oh -
39 these errors were on a partition that WASN'T fsck'ed (in the
40 command-line-utility sense of the word only I suppose).
41
42 I also started getting lots of errors on dmesg about attempts to seek
43 past the end of the md devices. I did some googling and found that this
44 had been seen by others - but it was obviously very rare.
45
46 Fortunately all my most critical data is backed up weekly (the last
47 backup was only a day or two before the final crash), and I didn't care about the TV too much (I
48 saved what I could and re-recorded anything that got truncated or wasn't
49 watchable). I did find that some of my DVD backups of digital photos
50 were unreadable which has taught me a valuable lesson. Fortunately only
51 some of the photos actually had errors in them, and most were
52 successfully backed up.
53
54 I'm no longer using lvm2. If I need to expand my RAID I can
55 potentially reshape it (after backups where possible). I miss some of
56 the flexibility, but when I need a few GB of scratch space to test out a
57 filesystem upgrade or something I just use losetup - but I don't care
58 about performance in these cases.
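
The losetup approach to scratch space can be sketched roughly as follows. Again the snippet only builds and prints command lines; the image path `/tmp/scratch.img`, the 4G size, and the loop device name are hypothetical, and the use of a sparse file created with `truncate` is an assumption rather than something described in this message. Creating a filesystem on the loop device and mounting it are left out.

```python
import shlex


def scratch_loop_setup(image, size):
    """Commands to back a loop device with a sparse file of the given size."""
    return [
        # Sparse file: blocks are only allocated as they are written.
        ["truncate", "-s", size, image],
        # Attach the file to the first free loop device and print its name.
        ["losetup", "--find", "--show", image],
    ]


def scratch_loop_teardown(loopdev, image):
    """Commands to detach the loop device and delete the backing file."""
    return [
        ["losetup", "-d", loopdev],
        ["rm", "--", image],
    ]


if __name__ == "__main__":
    # Hypothetical path, size and device; the losetup steps need root.
    for argv in scratch_loop_setup("/tmp/scratch.img", "4G"):
        print(shlex.join(argv))
    for argv in scratch_loop_teardown("/dev/loop0", "/tmp/scratch.img"):
        print(shlex.join(argv))
```

A file-backed loop device goes through the host filesystem, which fits the remark that performance does not matter in these cases.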
59
60 I would say that lvm2 is probably safe if you have more reliable
61 hardware. My problem was that a failing drive not only made the drive
62 inaccessible, but it took down the whole system (since hardware on a
63 typical desktop isn't well-isolated). On a decent server a drive
64 failure shouldn't cause errors that bring down the whole system. So, I
65 didn't get the full benefit from RAID.

Attachments

File name MIME type
smime.p7s application/x-pkcs7-signature