Re: [gentoo-user] How broken is my raid device /dev/md6? - gentoo-user

From:	Robert David <robert.david.public@×××××.com>
To:	gentoo-user@l.g.o
Cc:	acm@×××.de
Subject:	Re: [gentoo-user] How broken is my raid device /dev/md6?
Date:	Fri, 28 Dec 2012 10:27:39
Message-Id:	`20121228112610.544651cd@gmail.com`
In Reply to:	Re: [gentoo-user] How broken is my raid device /dev/md6? by Alan Mackenzie

Hi,

what does this show:

cat /proc/mdstat


Did this happen on a running system? I suppose the root is still running
fine. Try running a smartctl self-test on both drives.
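
In case it is useful, here is a rough sketch of those two checks as shell
functions. The names degraded_arrays and smart_check are made up for
illustration, and the awk pattern assumes the usual /proc/mdstat layout, in
which a mirror with a missing member shows a status such as [2/1] [_U].
Treat it as a starting point, not a finished tool.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical.

# Print the name of every md array whose status map contains "_",
# i.e. at least one member is missing. Reads the file given as $1,
# defaulting to /proc/mdstat.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /\[[0-9]+\/[0-9]+\] \[[U_]*_[U_]*\]/ {   # e.g. "[2/1] [_U]"
            if (name != "") print name
        }
    ' "${1:-/proc/mdstat}"
}

# Queue a short SMART self-test on each drive given as an argument and
# print its overall health verdict. Needs root and smartmontools.
smart_check() {
    for disk in "$@"; do
        smartctl -t short "$disk"   # start a short self-test
        smartctl -H "$disk"         # overall health assessment
    done
}

# Typical use (as root):
#   degraded_arrays
#   smart_check /dev/sda /dev/sdb
#   smartctl -l selftest /dev/sda   # read the result once the test is done
```

Both functions only read state; neither touches the array itself.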

And do not rebuild or re-create the md array until you have all the
information; you could badly break your root.

Robert.


On Sun, 23 Dec 2012 12:20:48 +0000
Alan Mackenzie <acm@×××.de> wrote:

> On Sat, Dec 22, 2012 at 03:24:53PM +0100, Volker Armin Hemmann wrote:
> > On Saturday, 22 December 2012, 13:53:42, Alan Mackenzie wrote:
> > > Hi, all.
>
> > > Just built kernel 3.6.11 and when I tried to install it with
> > > lilo, I got this difficult error message:
>
> > >     Fatal: Trying to map files from unnamed device 0x0000 (NFS/RAID mirror down ?)
>
> > > .  So I eventually had a look at dmesg for my raid setup, and
> > > found this
> > > - note lines 15 - 19:
>
> > >     [    2.148410] md: Waiting for all devices to be available before autodetect
> > >     [    2.149891] md: If you don't use raid, use raid=noautodetect
> > >     [    2.151546] md: Autodetecting RAID arrays.
> > >     [    2.180356] md: Scanned 4 and added 4 devices.
> > >     [    2.181819] md: autorun ...
> > >     [    2.183244] md: considering sdb6 ...
> > >     [    2.184666] md:  adding sdb6 ...
> > >     [    2.186079] md: sdb3 has different UUID to sdb6
> > >     [    2.187492] md:  adding sda6 ...
> > >     [    2.188884] md: sda3 has different UUID to sdb6
> > >     [    2.190484] md: created md6
> > >     [    2.191883] md: bind<sda6>
> > >     [    2.193224] md: bind<sdb6>
> > >     [    2.194538] md: running: <sdb6><sda6>
> > > 15  [    2.195855] md: kicking non-fresh sda6 from array!
> > > 16  [    2.197154] md: unbind<sda6>
> > > 17  [    2.205840] md: export_rdev(sda6)
> > >     [    2.207176] bio: create slab <bio-1> at 1
> > > 19  [    2.208520] md/raid1:md6: active with 1 out of 2 mirrors
> > >     [    2.209835] md6: detected capacity change from 0 to 34359672832
> > >     [    2.211187] md: considering sdb3 ...
> > >     [    2.212444] md:  adding sdb3 ...
> > >     [    2.213691] md:  adding sda3 ...
> > >     [    2.215117] md: created md3
> > >     [    2.216349] md: bind<sda3>
> > >     [    2.217569] md: bind<sdb3>
> > >     [    2.218765] md: running: <sdb3><sda3>
> > >     [    2.220025] md/raid1:md3: active with 2 out of 2 mirrors
> > >     [    2.221231] md3: detected capacity change from 0 to 429507543040
> > >     [    2.222508] md: ... autorun DONE.
> > >     [    2.230821]  md6: unknown partition table
>
> > > .  Further perusal of a log file showed this error first happened
> > > on 2012-11-29.  It would appear /dev/md6 has been firing on one
> > > cylinder ever since, and I've been unaware of this.  :-(
>
> > > What does it mean for sda6 to be "non-fresh"?
>
> > > /dev/md6 is my root partition (including /usr :-(), so I can't
> > > unmount it for investigation.
>
> > > Could somebody please suggest how I might go about repairing this
> > > problem.
>
> > boot from systemrescuecd
> > mdadm -S /dev/md6
> > mdadm -A /dev/md6
>
> This didn't quite work, since mdadm -A merely restarted the array
> without the non-fresh partition.  Still it got me searching, and what
> eventually worked was  mdadm /dev/md6 -a /dev/sda6.  (Where -a stands
> for "add".) The mdadm man page is very vague for this use case.
>
>
> > get some coffee. Make some popcorn. The resync will take some while.
>
> Indeed it did.  The coffee settled me down somewhat.  Thanks again!
>
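
For anyone following the same route as Alan, the re-add and the wait for the
resync could be watched roughly as below. This is only a sketch: the function
name resync_progress is made up for illustration, and the awk pattern assumes
the usual "recovery = NN.N%" (or "resync = NN.N%") progress line that
/proc/mdstat prints while a rebuild is running.

```shell
#!/bin/sh
# Sketch only; the function name is hypothetical.

# The re-add quoted above, to be run as root once the array is known
# to be otherwise healthy:
#   mdadm /dev/md6 -a /dev/sda6

# Print "<array> <percent>" for every array that is currently rebuilding.
# Reads the file given as $1, defaulting to /proc/mdstat.
resync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }              # remember the current array
        /(recovery|resync) *= *[0-9.]+%/ {       # progress line of a rebuild
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9.]+%$/) print name, $i
        }
    ' "${1:-/proc/mdstat}"
}

# Typical use while the coffee is brewing:
#   resync_progress
#   watch -n 5 cat /proc/mdstat
```

The function prints nothing once the rebuild has finished, so an empty result
together with a [UU] status in /proc/mdstat suggests the mirror is whole again.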


Gentoo Archives: gentoo-user