Mark Knecht posted on Sun, 28 Mar 2010 10:14:03 -0700 as excerpted:

> I brought up new hardware yesterday for my first RAID install. I
> followed this Gentoo page describing a software RAID1/LVM install:
>
> http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml
>
> Note that I followed this page verbatim, even if it wasn't what I
> wanted, with exceptions:
>
> a) My RAID1 is 3 drives instead of 2
> b) I'm AMD64 Gentoo based.
> c) I used grub-static

Had you gotten anything off the other list? I see no other replies
here. Do you still have that install, or are you starting over, as you
mentioned you might?

That post was a bit long to quote in full, and somewhat disordered to
try to reply to element by element, so I just quoted the above and will
cover a few things as I go.

1) I'm running kernel/md RAID here, too (and was formerly running LVM2,
which is what I expect you mean by LVM, and I'll continue simply calling
it LVM), so I know some about it.

2) The Gentoo instructions don't say to, but just in case... you didn't
put /boot and / on LVM, only on the RAID-1, correct? LVM is only for
non-root, non-boot. (Actually, you can put / on LVM, if and only if you
run an initrd/initramfs, but it significantly complicates things.
Keeping / off of LVM simplifies things considerably, so I'd recommend
it.) This is because the kernel can auto-detect and configure RAID
itself, or the RAID config can be fed to it on the command line, but the
kernel cannot by itself figure out how to configure LVM -- only the LVM
userspace knows how to read and configure LVM, so an LVM userspace and
config must be available before it can be loaded. This can be
accomplished by using an initrd/initramfs with the LVM tools on it, but
things are MUCH less complex if / isn't LVM, so LVM can be loaded from
the normal /.

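For what it's worth, with / on md/RAID but no LVM, the grub.conf entry
can stay simple. A sketch (the md numbering follows the guide's layout,
and the kernel filename is only an example, not from the original post):

```
# /boot/grub/grub.conf -- example entry; / on kernel md RAID-1, no LVM.
# Paths are relative to the /boot partition, (hd0,0) here.
title Gentoo Linux
root (hd0,0)
# With partitions of type fd (Linux raid autodetect), the kernel
# assembles the array itself and the md= parameter is unnecessary;
# it's shown here only as the explicit command-line alternative.
kernel /kernel-2.6.33 root=/dev/md3 md=3,/dev/sda3,/dev/sdb3,/dev/sdc3
```

Since only the kernel needs to assemble the array, nothing LVM-related
has to run this early, which is exactly why keeping / off LVM avoids
the initramfs.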
3) You mention not quite understanding how /boot works on md/RAID -- how
does grub know where to look? Well, it only works on md/kernel RAID-1,
and that only because RAID-1 is basically the same as a non-RAID setup,
only instead of one disk there's several, each a mirror duplicate of the
others (but for a bit of RAID metadata). Thus, grub basically treats
each disk as if it wasn't in RAID, and it works, because it's organized
almost the same as if it wasn't in RAID. That's why you have to install
grub separately to each disk: it's treating them as separate disks, not
RAID mirrors. But it doesn't work with other RAID levels, because they
mix up data stripes, and grub doesn't know anything about that.

4) Due to personal experience recovering from a bad disk (pre-RAID;
that's why I switched to RAID), I'd actually recommend putting
everything portage touches or installs to on / as well. That way,
everything is kept in sync, and you don't get into a situation where /
(including /bin, /sbin and /etc) is a snapshot from one point in time,
while portage's database in /var/db is from a different one, and stuff
installed to /usr may be from an entirely different one. Not to mention
/opt, if you have anything installed there... If all that's on /, then
it should all remain in sync. Plus, then you don't have to worry about
something boot-critical being installed to /usr, which isn't mounted
until about midway thru the boot cycle.

4 cont) What then goes on other partitions is subdirs of the above:
/usr/local, very likely, as you'll probably want to keep it if you
reinstall; /home, for the same reason; /var/log, so a runaway log can't
eat up all the space on / -- it's limited to eating up everything on the
log partition; likely /tmp, which I have as tmpfs here but which you may
otherwise want on RAID-0 for speed; /var/tmp, which here is a symlink to
my /tmp so it's on tmpfs too; very possibly /usr/src and the linux
kernel tree it contains, as RAID-0 is fine for that since it can simply
be redownloaded off the net if need be; same with your portage tree,
/usr/portage by default tho you can point that elsewhere (maybe to the
same partition holding /usr/src, but if you use FEATURES=buildpkg, you
probably want your packagedir on something with some redundancy, so not
on the same RAID-0); etc. If you have a system-wide mail setup with
multiple users, you may want a separate mail partition as well (if not,
part of /home is fine). Desktop users may well find a separate, likely
BIG, partition for their media storage useful, etc. FWIW, the /
partition on my ~amd64 workstation with kde4 is 5 gigs (according to
df). On my slightly more space-constrained 32-bit netbook, it's 4 gigs.
Used space on both is ~2.4 gigs, with the various other partitions as
mentioned kept separate, but with everything portage touches on /.
(That compares to what appears to be a 1-gig / on md3 in the guide, with
/var and /usr on their own partitions/volumes; but they have an 8 gig
/usr, a 4 gig /var, and a 4 gig /opt, totaling 17 gigs, most of which
sits on that 4-5 gig / here.)

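To make that layout concrete, a hypothetical /etc/fstab for such a split
might look like the following. Every device name, filesystem, and mount
here is illustrative only (the md numbers don't come from the post or
the guide); the point is which data lands on redundant vs. striped
arrays:

```
# / carries everything portage touches: /bin /sbin /etc /usr /var/db /opt
/dev/md3    /           ext3    noatime     0 1
# RAID-1 (redundant): data that must survive a dead disk
/dev/md5    /home       ext3    noatime     0 2
/dev/md6    /var/log    ext3    noatime     0 2
# RAID-0 (striped, no redundancy): rebuildable data only
/dev/md7    /usr/src    ext3    noatime     0 2
# /tmp on tmpfs, with /var/tmp symlinked to /tmp
tmpfs       /tmp        tmpfs   size=1g     0 0
```

The rule of thumb above: anything you can re-download (kernel sources,
the portage tree) may go on RAID-0; anything you'd mourn goes on RAID-1.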
5) The hexadecimal digits you mentioned during the BIOS POST process
indicate, as you guessed, BIOS POST and config process progress. I
wasn't aware that they're documented, but as your board is an Intel and
the link you mentioned appears to be Intel documentation for them, it
seems in your case they are, which is nice. =:^)

6) Your BIOS has slightly different SATA choices than mine. Here, I
have RAID or JBOD (plain SATA, "just a bunch of disks") as my two
choices. JBOD mode would compare to your AHCI, which is what I'd
recommend. (Seems Intel wants AHCI to be a standard, thus killing the
need for individual SATA controller drivers like the SATA_SIL drivers I
run here. That'd be nice, but I don't know how well it's being accepted
by others.) Compatibility mode will likely slow things down, and RAID
mode would be firmware-based RAID, which on Linux would be supported by
the device-mapper (as is LVM2). JBOD/SATA/AHCI mode, with md/kernel
RAID, is generally considered a better choice than firmware RAID with
device-mapper support -- well, unless you need MSWormOS RAID
compatibility, in which case the firmware/device-mapper mode is probably
better, as it's more compatible.

6 cont) So I'd recommend AHCI. However, the on-disk layout may be
different between compatibility and AHCI mode, so it's possible the disk
won't be readable after switching and you'd need to repartition and
reinstall -- which you were planning on doing anyway, so no big deal.

OK, now that those are covered... what's wrong with your boot?

Well, there are two possibilities. Either the BIOS isn't finding grub
stage 1, or grub stage 1 is found and loaded but can't find stage 1.5
or stage 2, depending on what it needs for your setup. Either way,
that's a grub problem. As long as you didn't make the mistake of
putting /boot on your LVM, which grub doesn't grok, and since grub can
pretend md/kernel RAID-1 is an ordinary disk, we really don't need to
worry about the md/RAID or LVM until you can at LEAST get to the grub
menu/prompt.

So we have a grub problem. That's what we have to solve first, before
we deal with anything else.

Based on that, here's the relevant excerpt from your post (well, after a
bit of a detour I forgot to include in the above, so we'll call this
point 7):

> NOTE: THIS INSTALL PUTS EVERYTHING ON RAID1. (/, /boot, everything)
> I didn't start out thinking I wanted to do that.

7) Well, not quite. /boot and / are on RAID-1, yes. But the guide puts
the LVM2 physical volume on md4, which is created as RAID-0/striped. I
don't really agree with that, as striped is fast but has no redundancy.
Why you'd put stuff like /home, /usr (including stuff you may well want
to keep, in /usr/local), /var (including portage's package database in
/var/db), and presumably additional partitions as you may create them
(media and mail partitions were the examples I mentioned above) on a
non-redundant RAID-0, I don't know. That'd be exactly the stuff I'd
want on RAID-1, here, to make sure I still had copies of it if any of
the disks died.

7 cont) Actually, given that md/RAID is now partitionable (years ago it
wasn't, with LVM traditionally layered on top to overcome that), and
after some experience of my own with LVM, I decided the extra LVM layer
wasn't worth the hassle here, and when I redid my system last year, I
killed the LVM and just use partitioned md/kernel RAID now. If you want
the flexibility of LVM, great, but here, I decided it simply wasn't
worth the extra hassle of maintaining it. So I'd recommend NOT using
LVM, and thus not having to worry about it. But it's your choice.

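A sketch of that no-LVM approach with mdadm of that era (the device
names are examples, not from the post; these commands are destructive,
so verify them against your own drives before running anything):

```
# Create a 3-way mirrored, *partitionable* array; --auto=mdp requests
# the partitionable md_d device nodes (/dev/md_d0, /dev/md_d0p1, ...).
mdadm --create /dev/md_d0 --auto=mdp --level=1 --raid-devices=3 \
      /dev/sda3 /dev/sdb3 /dev/sdc3

# Partition the array directly, just as you would a plain disk...
fdisk /dev/md_d0

# ...then mkfs on /dev/md_d0p1, /dev/md_d0p2, etc., and record the
# array so it assembles reliably at boot:
mdadm --detail --scan >> /etc/mdadm.conf
```

The partitions inside the array then replace what the guide's LVM
logical volumes would have provided, with one less layer to maintain.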
OK, now on to the grub issue...

> So, the first problem is that on the reboot to see if the install
> worked, the Intel BIOS reports 'no bootable media found'. I am very
> unclear how any system boots software RAID1 before software is loaded,
> assuming I understand the Gentoo instructions. The instructions I used
> to install grub were
>
> root (hd0,0)
> setup (hd0)
> root (hd1,0)
> setup (hd1)
> root (hd2,0)
> setup (hd2)

That /looks/ correct. But particularly with RAID, grub's mapping
between BIOS drives, kernel drives and grub drives sometimes gets mixed
up. That's one of the things I always hate touching, since I'm never
quite sure whether it's going to work, or whether I'm actually telling
it to setup where I think I'm telling it to setup, until I actually
test it.

Do you happen to have a floppy on that machine? If so, probably the
most error-resistant way to handle it is to install grub to a floppy
disk, which, unlike thumb drives and possibly CD/DVD drives, has no
potential to interfere with the hard drive order as seen by the BIOS.
Then boot the floppy disk to the grub menu, and run the setup from
there.

One thing I discovered here is that I could only setup one disk at a
time, regardless of where I was doing it from (in Linux, from a floppy
grub menu, or from a bootable USB stick's grub menu). Changing the
root would seem to work after the first setup, but the second setup
would throw some weird error, and testing a boot from that disk
wouldn't work, so obviously it didn't take.

But doing it a disk at a time -- root (hd0,0), setup (hd0), reboot (or
restart grub if doing it from Linux), then root (hd1,0), setup (hd1),
reboot... the same thing for each additional disk (you have three, I
have four) -- THAT worked.

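Spelled out, the one-disk-at-a-time sequence at the grub prompt would
look something like the following (grub-legacy syntax; the hdN numbering
is an example and may not match how your BIOS orders the drives, which
is why the find command is worth running first):

```
grub> find /grub/stage1     # lists every (hdN,0) carrying grub's files
grub> root (hd0,0)
grub> setup (hd0)
grub> quit                  # reboot (or restart grub), then:
grub> root (hd1,0)
grub> setup (hd1)
grub> quit                  # reboot again, then:
grub> root (hd2,0)
grub> setup (hd2)
```

If /boot is a subdirectory of / rather than its own partition, the find
argument would be /boot/grub/stage1 instead.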
However you do it, test them, both with all disks running, and with
only one running (turn off or disconnect the others). Having a RAID-1
system and installing grub to all the disks isn't going to do you a lot
of good if, when one dies, you find that it was the only one that had
grub installed correctly!

There's another alternative that I'd actually recommend instead,
however. The problem with a RAID-1 /boot is that if you somehow screw
up something while updating /boot, since it's RAID-1, you've screwed it
up for all mirrors on that RAID-1. Since RAID-1 simply mirrors the
data across the multiple disks, it can be better not to RAID that
partition at all, but to have each disk carry its own un-RAIDed /boot
partition, which effectively gives you a /boot plus one or more /boot
backups (two in your case of three disks, three in my case of four
disks, tho here I actually went with two separate RAID-1s instead).

That solves a couple of problems at once. First of all, when you first
install, you install to just one, as an ordinary disk, and test it.
When it's working and booting, you copy that install to the others and
do the boot sector grub setup on each one separately, as its own disk,
having tested that the first one is working. Then you'd test each of
the others as well.

Second, when you upgrade, especially when you upgrade grub, but also
when you upgrade the kernel, you only upgrade the one. If it works,
great, you can then upgrade the others. If it fails, no big deal:
simply set your BIOS to boot from one of the others instead, and you're
back to a known working config, since you tested it after the /last/
upgrade, and you hadn't yet applied this upgrade to it -- you were
still testing this upgrade, and it broke before you copied it to your
backups.

So basically, the only difference here as opposed to the guide is that
you don't create /dev/md1; you configure and mount /dev/sda1 as /boot,
and when you have your system up and running, /then/ you go back and
set up /dev/sdb1 as your backup boot (at say /mnt/boot/). And when you
have it set up and tested working, you do the same thing for /dev/sdc1,
except that you can use the same /mnt/boot/ backup mount-point when
mounting it, since presumably you won't need both backups mounted at
once.

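A sketch of setting up that first backup /boot (device and hdN names
follow the running example and must be adjusted to your disks; the mkfs
step is destructive, and from a Linux grub shell the hdN mapping comes
from /boot/grub/device.map, so double-check it first):

```
# Prepare the backup boot partition on the second disk (WIPES sdb1).
mke2fs /dev/sdb1

# Mount it at the backup mount-point and copy the working /boot over.
mkdir -p /mnt/boot
mount /dev/sdb1 /mnt/boot
cp -a /boot/. /mnt/boot/

# Install grub's boot sector to that disk, treated as its own disk.
grub --batch <<'EOF'
root (hd1,0)
setup (hd1)
quit
EOF

umount /mnt/boot    # then repeat for /dev/sdc1 as (hd2,0)/(hd2)
```

Then test-boot from the backup disk before trusting it, per the advice
above.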
Everything else will be the same, and as it was RAID-1/mirrored, you'll
have about the same space in each /dev/sd[abc]1 partition as you did in
the combined md1.

As for upgrading the separate /boot and its backups: as I mentioned,
when you upgrade grub, DEFINITELY only upgrade one at a time, and test
that the upgrade worked and that you can boot from it before you touch
the others. For kernel upgrades, it doesn't matter too much if the
backups are a bit behind, so you don't have to update them for every
kernel upgrade. If you run kernel rcs or git kernels, as I do, I'd
suggest updating the backups once per kernel release (so from 2.6.32 to
2.6.33, for instance), so the test kernels are only on the working
/boot, not its backups, but the backups contain at least one version of
each of the last two release kernels. Pretty much the same if you run
upstream stable kernels (so 2.6.33, 2.6.33.1, 2.6.33.2...) or Gentoo
-rX kernels: keep at least one of each of the last two kernels on the
backups, tested to boot of course, and only update the working /boot
for the stable or -rX bumps.

If you only upgrade kernels once per kernel release cycle or less often
(maybe you're still running 2.6.28.x or something), then you probably
want to upgrade and test the backups as soon as you've upgraded and
tested a new kernel on the working /boot.

Hope it helps...

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman