On Sat, Apr 17, 2010 at 3:01 PM, Neil Bothwick <neil@××××××××××.uk> wrote:
> On Sat, 17 Apr 2010 14:36:39 -0700, Mark Knecht wrote:
>
>> Empirically any way there doesn't seem to be a problem. I built the
>> new kernel and it booted normally so I think I'm misinterpreting what
>> was written in the Wiki or the Wiki is wrong.
>
> As long as /boot is not on RAID, or is on RAID1, you don't need an
> initrd. I've been booting this system for years with / on RAID1 and
> everything else on RAID5.
>
>
> --
> Neil Bothwick
|
Neil,
Completely agreed, and in fact it's the way I built my new system.
/boot is just a plain partition; / is RAID1 across three partitions
marked with the 0xfd partition type, using metadata=0.90 and assembled
by the kernel. I'm using WD RAID Edition drives and an Asus Rampage II
Extreme motherboard.
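
For reference, the root array would have been created with something
along these lines - this is a reconstruction from the description
above, not a transcript of what I actually typed:

c2stable ~ # mdadm --create /dev/md3 --level=1 --raid-devices=3 \
        --metadata=0.90 /dev/sda3 /dev/sdb3 /dev/sdc3

The 0.90 metadata is what lets the kernel's in-kernel autodetect
assemble it at boot without an initrd.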
22 |
|
23 |
It works, however I'm running into the sort of thing I ran into |
24 |
this morning when booting - both md5 and md6 have problems this |
25 |
morning. Random partitions get dropped out. It's never the same ones, |
26 |
and it's sometimes only 1 partition out of three on the same drive - |
27 |
sdc5 and sdc6 aren't found until I reboot, but sda3, sdb3 & sdc3 were. |
28 |
Flakey hardware? What? The motherboard? The drives? |

I've noticed that entering the BIOS setup screens before allowing grub
to take over seems to eliminate the problem. Timing?
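
If it really is a timing race, one knob I could try - just a guess on
my part, not something from the wiki - is telling the kernel to wait a
few seconds before touching the root device, via the kernel line in
grub.conf (the image name below is a placeholder):

kernel /boot/<kernel-image> root=/dev/md3 rootdelay=10

I haven't verified whether rootdelay also holds off the md autodetect
pass, so treat it as an untested idea.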
32 |
|
33 |
mark@c2stable ~ $ cat /proc/mdstat |
34 |
Personalities : [raid0] [raid1] |
35 |
md6 : active raid1 sda6[0] sdb6[1] |
36 |
247416933 blocks super 1.1 [3/2] [UU_] |
37 |
|
38 |
md11 : active raid0 sdd1[0] sde1[1] |
39 |
104871936 blocks super 1.1 512k chunks |
40 |
|
41 |
md3 : active raid1 sdc3[2] sdb3[1] sda3[0] |
42 |
52436096 blocks [3/3] [UUU] |
43 |
|
44 |
md5 : active raid1 sdb5[1] sda5[0] |
45 |
52436032 blocks [3/2] [UU_] |
46 |
|
47 |
unused devices: <none> |
48 |
mark@c2stable ~ $ |

For clarity, md3 is the only array needed to boot the system. The
other three RAIDs aren't required until I start running apps. However,
they are all assembled by the kernel at boot time, and I would prefer
not to do that, or at least learn how not to.
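
What I think I want instead is roughly this - a sketch, not something
I've run yet: record the non-root arrays in /etc/mdadm.conf and let
userspace mdadm assemble them once the root filesystem is up:

c2stable ~ # mdadm --detail --scan >> /etc/mdadm.conf
c2stable ~ # rc-update add mdraid boot

assuming the mdraid init script installed by the mdadm ebuild just
runs "mdadm --assemble --scan" against that file.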
54 |
|
55 |
Now, as to why they are being assembled I suspect it's because I |
56 |
marked them all with partition type 0xfd when possibly it's not the |
57 |
best thing to have done. The kernel won't bother with non-0xfd |
58 |
partitions and then mdadm could have done it later: |
59 |
|
60 |
c2stable ~ # fdisk -l /dev/sda |
61 |
|
62 |
Disk /dev/sda: 500.1 GB, 500107862016 bytes |
63 |
255 heads, 63 sectors/track, 60801 cylinders |
64 |
Units = cylinders of 16065 * 512 = 8225280 bytes |
65 |
Disk identifier: 0x8b45be24 |
66 |
|
67 |
Device Boot Start End Blocks Id System |
68 |
/dev/sda1 * 1 7 56196 83 Linux |
69 |
/dev/sda2 8 530 4200997+ 82 Linux swap / Solaris |
70 |
/dev/sda3 536 7063 52436160 fd Linux raid autodetect |
71 |
/dev/sda4 7064 60801 431650485 5 Extended |
72 |
/dev/sda5 7064 13591 52436128+ fd Linux raid autodetect |
73 |
/dev/sda6 30000 60801 247417065 fd Linux raid autodetect |
74 |
c2stable ~ # |
75 |
|
76 |
However the Gentoo Wiki says we are supposed to mark everything 0xfd: |
77 |
|
78 |
http://en.gentoo-wiki.com/wiki/RAID/Software#Setup_Partitions |
79 |
|
80 |
I'm not sure that we good advice or not for RAIDs that could be |
81 |
assembled later but that's what I did and it leads to the kernel |
82 |
trying to do everything before the system is totally up and mdadm is |
83 |
really running. |
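
If it's safe to do, I'm guessing the cleanest fix is to flip the
non-root members back to plain Linux partitions so the kernel's
autodetect pass ignores them, and let mdadm.conf handle those arrays
as sketched above. Untested, but something like this per drive:

c2stable ~ # sfdisk --change-id /dev/sda 5 83
c2stable ~ # sfdisk --change-id /dev/sda 6 83

and the same for sdb and sdc, leaving the root members (sda3, sdb3,
sdc3) as 0xfd so / still assembles in the kernel.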

Anyway, when the failures happen I can step through and fail, remove,
and re-add the partition to the array. (In this case the fail and
remove steps aren't necessary.)

c2stable ~ # mdadm /dev/md5 -f /dev/sdc5
mdadm: set device faulty failed for /dev/sdc5: No such device
c2stable ~ # mdadm /dev/md5 -r /dev/sdc5
mdadm: hot remove failed for /dev/sdc5: No such device or address
c2stable ~ # mdadm /dev/md5 -a /dev/sdc5
mdadm: re-added /dev/sdc5
c2stable ~ # mdadm /dev/md6 -a /dev/sdc6
mdadm: re-added /dev/sdc6
c2stable ~ #
|
At this point md5 is repaired and I'm waiting for md6 to finish
resyncing:

c2stable ~ # cat /proc/mdstat
Personalities : [raid0] [raid1]
md6 : active raid1 sdc6[2] sda6[0] sdb6[1]
      247416933 blocks super 1.1 [3/2] [UU_]
      [====>................]  recovery = 22.0% (54525440/247416933)
      finish=38.1min speed=84230K/sec

md11 : active raid0 sdd1[0] sde1[1]
      104871936 blocks super 1.1 512k chunks

md3 : active raid1 sdc3[2] sdb3[1] sda3[0]
      52436096 blocks [3/3] [UUU]

md5 : active raid1 sdc5[2] sdb5[1] sda5[0]
      52436032 blocks [3/3] [UUU]

unused devices: <none>
c2stable ~ #

How do I get past this? It's happening 2-3 times a week! I figure that
if the kernel doesn't auto-assemble the RAIDs I don't need at boot,
then I can somehow check that all the partitions are ready to go
before I start them up. This morning's exercise will have cost me an
hour before I can start using the machine.
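
The sort of check I have in mind would be a small script run before
the non-root arrays are started - purely a sketch, the device names
are just my layout from above and I haven't tried it:

#!/bin/sh
# Wait up to 30 seconds for each md5/md6 member to appear,
# then assemble everything listed in /etc/mdadm.conf.
for dev in /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sda6 /dev/sdb6 /dev/sdc6; do
    i=0
    while [ ! -b "$dev" ] && [ $i -lt 30 ]; do
        sleep 1
        i=$((i + 1))
    done
done
mdadm --assemble --scan

That way a slow drive would just delay the boot a little instead of
leaving an array degraded.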

- Mark