Mikko Husari <husku@×××××.net> posted 45C82D9C.7010601@×××××.net,
excerpted below, on Tue, 06 Feb 2007 09:26:20 +0200:

> hi!
>
> i was wonderin (also tried my luck on perfomance-gentoo, but no one
> home), what kind of partition + fs table would be optimal on server
> and/or desktop. afaik, /usr/portage would be on its own partition, and
> perhaps reiserfs and raid0. distfiles should be on a different
> partition, so it would not be in the way of portage itself... but, what
> about other parts of gentoo/linux. and is journaling filesystem over
> striping raid just asking for trouble?

I'm running a 300-gig/disk, 4-SATA-disk, kernel RAID here, and /love/ it!

Each of the four disks is partitioned identically to the others, with the
corresponding partitions on each grouped in the corresponding RAIDs.

/dev/md0 (on /dev/sd[a..d]1) is raid-1, necessary for boot, since that's
all grub and the like understand, but that's all that's on the raid-1.

The main system is partitioned (mdp) raid-6, /dev/md_d1 on /dev/sd[a..d]2.
I have it partitioned three ways, but it would be four if I were doing it
over. The first two RAID partitions (would be three), /dev/md_d1p1 and
md_d1p2, are identically sized, containing my main root filesystem and a
backup, rootbak. (I'd create rootbak1 and rootbak2, now.) /dev/md_d1p3
(it would be p4, or p5 with p4 reserved as the extended partition, if I
had a second rootbak image) is a huge partition, containing an LVM2
managed physical volume that in turn contains all the critical non-root
data I still want to keep on raid-6.

This is critical to my strategy -- the kernel itself knows how to deal
with RAID, both partitioned and unpartitioned, with any necessary
parameters passable on the kernel command line from grub/lilo/whatever.
The same CANNOT be said about LVM2, which requires userspace configuration
to load. Thus my choice of partitioned RAID for my main root filesystem
and its backups. I don't have to hassle with an initramfs/initrd with just
RAID, as I'd have to do if I put root on LVM2.

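To make the "passable on the kernel command line" point concrete, here is
a rough sketch that only builds the argument strings. It assumes the
md=d<minor>,dev0,dev1,... form that the kernel's md documentation gives
for partitionable arrays, plus the /dev/md_d1pN naming used above. The
function name kernel_args is made up for illustration, so check the md
documentation for your kernel before copying anything into grub.conf.

```python
def kernel_args(md_minor, members, root_partition):
    """Build kernel arguments for a partitioned md array holding root.

    Assumption: partitionable arrays are described to the kernel as
    md=d<minor>,dev0,dev1,... and show up as /dev/md_d<minor>p<N>.
    """
    md_arg = "md=d{},{}".format(md_minor, ",".join(members))
    root_arg = "root=/dev/md_d{}p{}".format(md_minor, root_partition)
    return "{} {}".format(md_arg, root_arg)


# The four second partitions that make up the raid-6 in this layout.
members = ["/dev/sd{}2".format(letter) for letter in "abcd"]

print(kernel_args(1, members, 1))  # arguments for the main root
print(kernel_args(1, members, 2))  # arguments for rootbak
```

The only difference between the two lines is the partition number at the
end of root=, which is what makes a second grub entry for the backup image
so cheap to keep around.
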
Equally important are the root and rootbak images and how I manage them.
First, a lesson I learned the hard way: root contains everything portage
installs to, so /usr (minus some subdirs such as /usr/local and /usr/src)
and /var (minus subdirs such as /var/log), as well as the stuff
traditionally on root. The problem I ran into before is that I had my
/usr partition go bad on me, and the backup version of it I had was
somewhat outdated. Thus, the portage database in /var/db/portage was in
sync with what was on root, but NOT in sync with what was on /usr, which
was some six months outdated. Recovery was possible, but not easy. Thus
my current strategy -- keep every location portage installs packages to on
the same partition, so it's always in sync. If I end up resorting to a
backup image, the entire backup image may be outdated, but what's on the
disk is always going to be in sync with what's in the portage database on
the same partition.

Second, the rootbak images are simply snapshots of root, taken
periodically when I think the system is relatively stable. I have one.
As I said, if I were doing it again, I'd make a second one, alternating
backups, so if tragedy struck while I was actually doing the backup, I'd
have the other one still intact and ready to boot to. It's important to
note that while this doesn't replace off-system backups, it does make up
for fat-fingering or the occasional botched upgrade -- and with it on
RAID-6, it's relatively hardware failure resistant as well. If I ever
can't boot my main root, I simply reboot and tell grub to pass rootbak to
the kernel instead of root. What's great about this is that other than
being a bit outdated, I have a fully functional system on the backup
image, just as I left it when I did the snapshot. No having to fiddle
with text-mode links/lynx to google for what went wrong in order to
recover my main system, as I have a full X and KDE -- and everything else
I had installed at the time -- sitting there fully configured and ready to
run. Even if LVM goes out on me, while I lose /home and the like as they
are on the LVM handled raid-6 partition, I have all the system executables
including everything on /usr ready to use to solve the problem, should it
be necessary to google or use email or the like. (/home is handled by
LVM2, but I could create a temp /home and use it to google or whatever, if
necessary.)

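The alternating-backup idea comes down to "always overwrite the older
snapshot, so the newer one stays intact while the copy runs". A minimal
sketch of just that selection step follows. The names rootbak1 and
rootbak2 come from the text above, while the function name and the dates
are invented for illustration.

```python
from datetime import datetime


def next_backup_target(last_written):
    """Return the backup whose last snapshot is oldest.

    last_written maps a backup name to the datetime of its last
    snapshot, or to None if it has never been written.
    """
    def age_key(name):
        stamp = last_written[name]
        # Never-written targets sort ahead of any real timestamp.
        return (stamp is not None, stamp or datetime.min)

    # sorted() makes ties resolve by name, so the result is stable.
    return min(sorted(last_written), key=age_key)


# Hypothetical snapshot dates, not taken from the post.
state = {
    "rootbak1": datetime(2007, 1, 20),
    "rootbak2": datetime(2006, 12, 5),
}
print(next_backup_target(state))  # rootbak2, the older of the two
```

How the snapshot itself gets copied is deliberately left out; the point is
only that the most recent good image is never the one being overwritten.
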
So then as I mentioned, the last raid-6 partition is LVM2 managed and
contains logical volumes (which I still refer to as partitions) for all my
usual data: /home and a homebak, /var/log (I decided if it died I'd just
take the loss, so no bak of it, but it's at least on raid-6), a mail
partition and mailbak, news (no bak), a multimedia partition and bak,
/usr/local and a bak, and so on. With portage I run FEATURES=buildpkg, and
want the packages on the raid-6, so there's a package partition and its
bak here as well.

/dev/sd[a..d]3 is swap. I have the kernel manage this directly: with the
pri= option in the fstab options set to the same value for each (here,
pri=1 for all four partitions, one on each spindle), the kernel will
stripe swap across them much as if it were a raid-0. If I needed
zero-downtime, I'd put swap on the raid-6 as well, so it could continue to
function if I lost up to two disks, but as long as my data is safe, a
forced reboot on a lost disk isn't a big deal here, and the speed of
striped is nice, so striped it is!

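For reference, the swap setup above amounts to four fstab lines that
differ only in the device. In /etc/fstab the priority option is spelled
pri=, and swap areas sharing the same priority are used round-robin. The
sketch below just generates those lines. The helper name is made up, and
the device names are the ones from this layout.

```python
def swap_fstab_lines(devices, priority=1):
    """Return one fstab swap entry per device, all at one priority."""
    template = "{dev}\tnone\tswap\tsw,pri={pri}\t0 0"
    return [template.format(dev=dev, pri=priority) for dev in devices]


# One swap partition on each of the four spindles.
devices = ["/dev/sd{}3".format(letter) for letter in "abcd"]
for line in swap_fstab_lines(devices):
    print(line)
```

Giving the partitions different priorities instead would make the kernel
fill the highest-priority one first, which loses the striping effect.
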
/dev/sd[a..d]4 is reserved as the extended partition, thus giving me room
to expand beyond four partitions if need be.

/dev/sd[a..d]5 is striped/raid-0 (mdp/partitioned), containing stuff that
doesn't need the redundancy of the raid-6. /tmp and /var/tmp are of course
on one partition. The second contains the rest. The portage and kernel
trees are easily redownloadable, so /usr/src and $PORTDIR live here. I
run ccache, and figure the speed of striped is better than the redundancy
of raid-6, so it's here too. Everything except the tmps is on the same
partition, in subdirs, with symlinks from their normal location where
necessary (as with /usr/src).

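The trade-off between the three RAID levels used in this layout is easy
to see as arithmetic. The sketch below assumes equal-sized members and
uses a made-up 100 gig per member, since the per-partition sizes on the
individual disks aren't given here. The function name is for
illustration only.

```python
def usable_capacity(level, disks, size_per_disk):
    """Usable capacity of an md array built from equal-sized members."""
    if level == 0:
        return disks * size_per_disk        # striping, no redundancy
    if level == 1:
        return size_per_disk                # every member is a full copy
    if level == 6:
        if disks < 4:
            raise ValueError("raid-6 needs at least four members")
        return (disks - 2) * size_per_disk  # two members' worth of parity
    raise ValueError("unsupported level: {}".format(level))


# Four members of a hypothetical 100 gig each.
for level in (0, 1, 6):
    print("raid-{}: {} gig usable".format(level, usable_capacity(level, 4, 100)))
```

With four disks, raid-6 gives half the raw space but survives any two
disks failing, while raid-0 gives all of it and survives none.
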
Sizes? The root images are ~10 gig. The workspace for portage needs to
be about 5 gig in some cases. That's normally /var/tmp, but I have
/var/tmp symlinked to /tmp anyway, so I put it there and have a fairly
large (~25 gig) /tmp. The other striped partition is ~12 gig. The package
partitions are ~4 gig. /var/log is ~2 gig. /home (and its bak) are 10
gig, but I have additional dedicated partitions for some stuff people
would otherwise store there, such as mail and multimedia. Almost
everything is less than half full, with only 1.5 gig of the 10 gig root
partition and baks actually used, but I wanted to be sure to leave
/plenty/ of room there.

Filesystems? Here, I'm running 100% reiserfs, thus avoiding having to
keep other types loaded (tho I build ext2 as a module, to use with
floppies if necessary, and of course iso9660 and udf for CD/DVD, again
as modules, only loaded when necessary). I like the reiserfs tail-packing,
and haven't had any problems with it since the patches adding data=ordered
by default to reiserfs. Some say it's a dead-end, but it's quite stable
now and due to the number of systems installed with it, will need to be
maintained in the kernel for many years to come. I had intended to
eventually upgrade to reiser4, but now, who knows? Anyway, I can't see
going to a block system that wastes all that space with small files, and
the other alternative that handles that, XFS AFAIK, has issues of its own,
particularly on systems without very sturdy UPSs and/or battery-backed
SCSI. I figure reiserfs has another three years anyway, before it starts
getting crufty, and three years is a long time in kernel and filesystem
development time, so what developments and options there might be by then,
who knows? (According to the article someone else linked on reiserfs,
globalfs looks to be the eventual replacement, but it's not ready yet.)
By that time, my SATA hardware will be pretty old as well, so it'll be
time to think about upgrades there too, and I'll probably do the
filesystem upgrade, if I think I need it, at the time of the hardware
upgrade.

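The small-file waste that tail-packing avoids can be put in numbers. This
is a deliberately simplified model: it assumes every file occupies whole
4096-byte blocks and ignores inodes, journals and other metadata. The file
sizes are invented for illustration.

```python
def allocated_bytes(file_size, block_size=4096):
    """Bytes occupied when a file is stored in whole blocks only."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Hypothetical small files, sizes in bytes.
sizes = [300, 1200, 5000]
used = sum(sizes)
allocated = sum(allocated_bytes(size) for size in sizes)

print("data:", used)                 # 6500
print("allocated:", allocated)       # 16384
print("slack:", allocated - used)    # 9884
```

In this toy case more space goes to slack than to data, and a tree made up
of many thousands of small files multiplies that. Packing the tails of
several files into shared blocks is what recovers most of it.
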
I honestly can't say what filesystem I'd recommend now, but both ext3
(upgrading to 4 when appropriate) and reiserfs should be safe and stable
for years to come. There's no reason to shy away from reiserfs now,
particularly if you already know and are comfortable with it (it works
just fine on RAID). The one thing that CAN be said, however, is that of
all the available filesystems, the ext* series is the one best understood
by the kernel hackers in general. The others have a limited subset that
understand them, but ext2/3/4/... is what Linus and Andrew and the other
"core" kernel hackers know best, and thus in some ways could be considered
to be the best supported. In that regard, none of the others can equal
ext*. Its base isn't as modern in some ways as some of the other choices,
but it's more modern in the way it integrates with the kernel anyway,
simply because all the kernel hackers know it and thus make updates to it
right away, updates that the others have to wait for their own groups to
pick up on.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-desktop@g.o mailing list