On Monday, 7 January 2019 16:30:41 GMT Jack wrote:
> On 2019.01.07 05:46, Dale wrote:
> > Peter Humphrey wrote:
> > > On Sunday, 6 January 2019 22:13:31 GMT Dale wrote:
> > >> Even from my simple setup, LVM adds more benefits to managing data
> > >> and drives than it does risk. The biggest thing, placing blame where
> > >> it lies. Blaming LVM for a drive dying is placing the blame on
> > >> something that wasn't the root of the problem. The dying drive was
> > >> the problem, using LVM or not.
> > >
> > > He isn't doing that, though. As I read it, he recounted the tale of
> > > recovering data from a failed drive, then imagined how much worse it
> > > would be if it were in an LVM setup. [Reported speech and mixed-up
> > > tenses causing me a problem here...]
> > >
> > > Thanks Gevisz, that was interesting. What we used to call a
> > > cautionary tale.
> >
> > From what I've read, that can be overcome. If you get, say, a SMART
> > message that a drive is failing, just remove that drive or remove the
> > whole LVM setup and use something else until a working drive setup can
> > be made. Once ready, then move the data, if the drive still works, to
> > the new drive. That is basically what I did when I swapped a smaller
> > drive for a larger one. I moved the data from one drive to another.
> > It did it fairly quickly. Someone posted that it may even be faster to
> > do it with LVM's pvmove than it is with cp or rsync. I don't know how
> > true that is, but from what I've read, it moves the data really
> > efficiently. If the drive has a very limited time before failure,
> > speed is important. If the drive is completely dead, replace the drive
> > and hope the backups are good. Either way, LVM or not, a failing drive
> > is a failing drive. The data has to be moved if the drive still works,
> > or the data is gone if it just up and dies. The biggest thing, watching
> > the SMART messages about the health of the drive. In the past when I've
> > had a drive fail, I got error messages well ahead of time. On one
> > drive, I removed the drive, set it aside, ordered a replacement drive,
> > installed both drives and copied the data over. After I did all that,
> > I played with the drive until it failed a day or so later. Lucky?
> > Most likely. Still, it gave me time to transfer things over.
> >
> > While I get that LVM adds a layer to things, it also adds some options
> > as well. Those options can prove helpful if one uses them.
> >
> > Just my thinking.
> >
> > Dale
>
> The only problem with all that is that SMART is far from completely
> reliable. I recently had a drive fail, and the resulting fsck on the
> next reboot messed up many files. (Not a Gentoo system, although I
> don't think that made any difference.) After getting running again, I
> did several SMART tests, including the full self-test, and it reported
> ZERO errors. A few weeks later, it did the same thing, and shortly
> after that, it failed totally. I had done a few more full self-tests
> before final failure, and all came back clean. I'd really love to find
> out there was something I did wrong in the testing, but I don't think
> so. I have not yet completely given up on trying to recover stuff from
> that drive, but as time goes on, there is less and less that I haven't
> rebuilt or replaced by re-downloading or changing lost passwords, so
> it's less and less important. (That was a different drive from the one
> I messed up myself, as discussed in another recent thread here.)
>
> Jack

Depending on the type of errors reported by SMART, by the time you notice
errors in tests the risk of losing data is already quite high. Checking for
deteriorating trends with smartctl won't hurt, though.
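For what it's worth, a small sketch of what I mean by watching trends: pipe
the attribute table from `smartctl -A` through a filter and log the raw values
over time. The attribute names are the standard SMART ones; the device path in
the comment is just a placeholder.

```shell
# trend_attrs: reduce the table printed by `smartctl -A` to the three
# counters that most often creep upward before a drive dies, printing
# each attribute's name and its raw value.
trend_attrs() {
    grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable' |
        awk '{ print $2, $10 }'
}

# Typical use (as root), e.g. logged daily from cron:
#   smartctl -A /dev/sda | trend_attrs
```

Steady growth in any of those raw values is a better early warning than a
clean self-test result.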

The filesystem problems you were getting may have merely coincided with the
impending hardware failure, rather than being caused by it.
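On Dale's pvmove point, the usual sequence for vacating a failing disk looks
roughly like the function below. The device names /dev/sdb1 (failing) and
/dev/sdc1 (replacement), and the volume group name vg0, are made up for
illustration; substitute your own.

```shell
# Sketch only: migrate all LVM data off a failing physical volume.
# Wrapped in a function so nothing runs until you call it deliberately.
migrate_off_failing_pv() {
    pvcreate /dev/sdc1        # label the new disk as a physical volume
    vgextend vg0 /dev/sdc1    # add it to the volume group
    pvmove /dev/sdb1          # move every allocated extent off the old PV
    vgreduce vg0 /dev/sdb1    # drop the emptied PV from the group
    pvremove /dev/sdb1        # wipe its LVM label
}
```

One nice property is that pvmove works while the logical volumes stay mounted,
which may be part of why it was said to compare well with cp or rsync here.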

--
Regards,
Mick