On 08/06/2013 23:37, Tanstaafl wrote:
> Hi everyone,
>
> What is best practice for doing this?
>
> If I reboot in single user mode, will my lvm volumes (ie, /var) be
> available for fsck'ing, or do I have to mount them first?
>
> The current problem started after a different problem required me to do
> a hard reset on the server - had to do with a mounted QNAP device being
> unavailable when I initiated a reboot, and everything just hung.
>
> Ever since I did this hard reset, the server hangs at unmounting /var.
> I've let it sit there for at least an hour, and it never goes past that.
>
> Then after I hard reset it, it fsck's the /var partition again, maybe
> fixes minor problems very quickly, and everything works fine until I
> have to reboot or shutdown again.
>
> This became a major problem this weekend when we had one extended power
> outage (about 8 hours) yesterday evening, then another one (about 4
> hours) this morning right after I got everything back up and running
> from last night's outage.
>
> Anyway, I need to do this this weekend if at all possible, so...
>
> Anyone have any pointers to detailed docs and/or willing to hold my hand
> through this a little?

fsck'ing that filesystem should be no different from any other fsck - it
should find what it finds and fix what it can. The fs must be unmounted,
of course, which means you have to do it in single-user mode or from a
booted rescue system (I prefer the second; I find it easier, as none of
the production filesystems need to be mounted).
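To make that concrete, here's a minimal sketch of getting at the LVM
volumes from a rescue boot, where they are usually not activated
automatically. The VG/LV names (vg0, var) are assumptions - substitute
your own - and the run helper only echoes each command so you can review
the plan before swapping echo for real execution:

```shell
#!/bin/sh
# Sketch: activate LVM volumes from a rescue environment so they can be
# fsck'd while unmounted. vg0/var are placeholder names, not yours.
run() { echo "+ $*"; }   # dry-run helper; change echo to "$@" to execute

run vgscan               # find volume groups on the attached disks
run vgchange -ay vg0     # activate the VG; /dev/vg0/* device nodes appear
run lvs vg0              # confirm the LVs are visible
run fsck /dev/vg0/var    # check the LV directly, without mounting it
```

So yes, the LVs are there to fsck without mounting them - they just need
to be activated first.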

fsck.reiserfs has several modes; IIRC there's --rebuild-tree or similar
that does an extensive check but takes ages. I needed to do this 2 or 3
times when I was still using reiser. There's also a no-write option if
you want a read-only sanity check first.
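The two passes would look something like this - again a dry-run sketch,
with the device path as an assumption:

```shell
#!/bin/sh
# Sketch of the two reiserfsck passes: read-only check first, then the
# slow tree rebuild only if the check says it's needed.
DEV=/dev/vg0/var         # assumed path to the /var LV; adjust to yours
run() { echo "+ $*"; }   # dry-run helper; change echo to "$@" to execute

# Read-only sanity pass: reports problems, writes nothing.
run reiserfsck --check "$DEV"
# Only if --check reports tree corruption: the thorough, very slow
# rebuild. This one writes to the device, so take a copy first.
run reiserfsck --rebuild-tree "$DEV"
```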

I'm not convinced a power outage broke the fs so that you now can't
umount it; I'm having a hard time imagining how that would happen. More
likely some script or file elsewhere is damaged and leaves files open
when the system wants to umount /var.
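One way to hunt for those open files is to check, from a root shell just
before rebooting, which processes still hold anything under /var. A
sketch, assuming fuser and lsof are installed (the helper echoes instead
of executing):

```shell
#!/bin/sh
# Sketch: find which processes keep /var busy before the shutdown
# scripts try to umount it.
run() { echo "+ $*"; }   # dry-run helper; change echo to "$@" to execute

run fuser -vm /var       # list PIDs with open files under /var
run lsof +f -- /var      # more detail: which files each process holds
# If something shows up, stop that service by hand, then try:
run umount /var
```

Whatever still shows up there is your prime suspect for the hang.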

You have some options:

A full check requires considerable downtime, easily an hour or more. You
can dd /var somewhere to get a copy you can experiment on with another
host. At least you will then know how much downtime to schedule.
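The copy step might look like this - a sketch with assumed paths, and
/var must be unmounted (or the box in a rescue boot) for the copy to be
clean:

```shell
#!/bin/sh
# Sketch: image the raw /var LV to other storage so the repair can be
# rehearsed on another host first. Both paths are placeholders.
run() { echo "+ $*"; }   # dry-run helper; change echo to "$@" to execute

run dd if=/dev/vg0/var of=/mnt/backup/var.img bs=4M conv=noerror,sync
# Then rehearse the repair on the copy (reiserfsck accepts a plain file)
# and time it to size the real downtime window:
run reiserfsck --check /mnt/backup/var.img
```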

You should do a full check and repair on all filesystems to be 100% certain.

For the umount issue, that is trickier, as you won't have log files in
/var after the fact. Any clues on the Alt-F12 console while shutting
down? Try configuring your syslogger to send logs to another host; you
might be lucky enough to get some logs that way that describe what is
going on.
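For a classic syslog.conf-style logger, forwarding is one line; a
fragment as a sketch, where "loghost" is a placeholder for the machine
that should receive the logs (it must accept UDP syslog on port 514):

```
# /etc/syslog.conf fragment - forward everything to a remote logger so
# shutdown-time messages survive even when /var is already unmounted.
# "loghost" is a placeholder; use the receiving box's name or IP.
*.*    @loghost
```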
--
Alan McKinnon
alan.mckinnon@×××××.com