Peter Humphrey posted <43F20505.3040207@××××××××××.uk>, excerpted below,
on Tue, 14 Feb 2006 16:27:49 +0000:

> Gavin Seddon wrote:
>> From reading these posts I have sorted out 'other' issues that are off
>> list. Namely [...] creating a partition for /usr/portage to 'aid'
>> fragmentation.
>
> I did that some time ago in a simple-minded fashion, but I've had to
> revise my layout somewhat. I had an ext3 partition solely for
> /usr/portage, and it was mounted on that node, but every emerge --sync
> deleted the /lost+found directory. I don't know how serious that is, but
> of course no-one likes to have damaged file systems on their boxes, so I
> used tune2fs -c to set the mount count to 1 so that it was repaired
> every time the box booted, but that was taking too long and it was only
> a palliative anyway. The solution was easy, though of course I didn't
> see it at first! I only had to point /usr/portage to /usr-bits/portage
> (/usr-bits being the mount point of the partition) instead of mounting
> the partition directly in place.

If you have /usr/portage (or more precisely, your portage tree, wherever
it exists on your file system, since it can be moved anywhere and the
pointer in make.conf changed accordingly) on its own partition, consider
making it reiserfs, even if you don't consider reiserfs stable enough for
regular use. Correspondingly, those wanting a "safe" but relatively
high-use location for testing reiser4 should find the portage tree a very
good choice.

Here's the logic: The portage tree (without the packages subdir if you use
FEATURES=buildpkg, and with or without the distdir package sources, your
call) has two very important characteristics that make it a perfect match
for either reiser file system. First, it has a very high number of very
small files, each less than a filesystem block. Second, the data in it is
ultimately backed up: it's available from multiple sources on the net, so
easily redownloadable should anything go wrong. That addresses the
distrust issue -- some folks don't consider reiserfs stable enough to
store critical data on, and I myself wouldn't consider reiser4 stable
enough for non-redundant critical data.

When small files are stored on a regular filesystem, including ext2/3,
they are stored by block, each file taking up a full filesystem block of
data regardless of whether it's a single byte or exactly a block.
Likewise, a file a block and a byte long will take up two blocks' worth of
space.
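
As a rough illustration of the block rounding just described, here is a
minimal Python sketch. The 4096-byte block size and the function name
allocated_bytes are assumptions chosen for the example (ext2/3 block sizes
vary by how the filesystem was created), and metadata such as inodes is
ignored.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Bytes of data space a file occupies when stored in whole blocks.

    Assumes no tail packing; block_size is an illustrative default.
    """
    # Round the size up to the next whole number of blocks.
    blocks = (file_size + block_size - 1) // block_size
    return blocks * block_size


# One byte, exactly one block, and one block plus one byte.
for size in (1, 4096, 4097):
    print(size, "->", allocated_bytes(size))
```

With the assumed 4096-byte block, the one-byte file and the exactly
one-block file both occupy 4096 bytes, and the 4097-byte file occupies
8192.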

Reiserfs has tail-packing, altho it can be turned off. It stores the
<1-block ends of files (the entire file if it's less than a block, the
remainder of the file if it's more than a block but not an exact number of
blocks) packed together, requiring far less space. The savings can be
greater than 50% if the data is all small files. That is, where a regular
filesystem can require more than twice the actual data space to store a
set of small files, say two gigs to store a gig of file data, reiser will
require only about the single gig to store that same gig of data (plus the
journal space, but that's there with any journaled filesystem, including
ext3, and of course the metadata storage, that is, the inodes).
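
To get a feel for how much space whole-block storage costs on a given
tree, something like the following Python sketch could be used. It is only
an estimate under stated assumptions: the function name slack_report and
the 4096-byte block size are made up for the example, symlinks are
skipped, and directories, inodes and the journal are not counted. It says
nothing about how reiserfs actually lays out packed tails, only how much
block-rounding overhead there is to recover.

```python
import os


def slack_report(root: str, block_size: int = 4096):
    """Return (data_bytes, block_rounded_bytes) for regular files under root.

    block_size is an assumed value; check the real one for your filesystem.
    """
    data = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # count only regular files, not symlinks
            size = os.path.getsize(path)
            data += size
            # Round each file up to a whole number of blocks.
            allocated += -(-size // block_size) * block_size
    return data, allocated
```

Calling slack_report("/usr/portage") (or wherever the tree lives) and
comparing the two numbers shows roughly how much of the block-rounded
space is padding rather than file data.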

Likewise, reiserfs has been optimized to make working with small files
fast (in case you are wondering, the parallel for large files is xfs), and
there are a number of reports floating around on the forums of folks that
have switched their portage tree to reiserfs and been shocked at just how
much faster emerge --pretend and similar operations turned out to be.
I've never had my portage tree on anything else, so I have no first-hand
experience on other filesystems to compare to. Being conservative, I'll
only say it shouldn't be /slower/, that is, if those operations take
longer on reiserfs, something's definitely wrong. I'm not going to claim
any speedup, only that the storage efficiency is higher, space-wise, and
that it isn't slower to access.

As I said, some folks are concerned with reiserfs' reputation for
instability. I haven't found that to be the case since the kernel
defaulted to ordered journaling (data=ordered) for reiserfs, but in any
case, as long as it's stable enough to keep temporary data on (and it is
certainly that), that shouldn't be an issue for the portage tree, since
recovery is only an emerge sync away. With an exception for those with
infrequent or very expensive per-byte or very slow (analog dialup)
internet connections, who probably aren't going to be using Gentoo in any
case, there are therefore no data stability issues with the portage tree
on reiserfs, even for those that wouldn't trust it with their regular
data.

That makes reiserfs the best choice for the portage tree, where the
portage tree is on its own partition, anyway. Those who don't trust
reiserfs for data stability should have no qualms here, because the data
is ultimately backed up in any case, and reiserfs /will/ be /far/ more
efficient at storing the tree, and /likely/ will be faster, as well, altho
I can't personally verify that as I've never run the tree on anything else
to compare speed against.

As I said, you may want to keep the packages dir, /usr/portage/packages by
default, on another partition. This is easiest to accomplish simply by
pointing it elsewhere in make.conf (PKGDIR). The distdir subdir (DISTDIR)
isn't synced with the portage tree, but contains a local cache of source
tarballs that portage has downloaded for various merges. As such, it's
ultimately backed up to the internet as well, but because those tarballs
are fetched by portage one at a time as it needs them, not synced with the
tree, and because the files aren't as small in any case, some folks might
want to keep this separate from the portage tree as well, tho the urgency
isn't as great here as it would be with packages.

All that dealt with, there's only one possibly valid reason I'm aware of,
for those already splitting out the portage tree onto its own partition,
why they might /not/ wish to use reiserfs. Those that only have ext2/3
configured in their kernel may not wish to bother configuring reiserfs for
just the portage tree. I'm actually in the mirror-image situation with
ext2/3 -- I don't have anything on my system using it, so there'd have to
be a stronger than usual reason to use it on a particular partition, in
order to justify the bother of compiling it into the kernel.

As for the lost+found dir, as someone else mentioned, that's trivial. The
only use for it is when fsck finds something that isn't properly linked
but might still be needed. fsck creates lost+found in the root dir of the
filesystem to place these lost files in, as it finds them. It's recreated
if needed, so IMO it's actually better /not/ to have a lost+found by
default, as that way, if it exists, you know to look in it and see what
fsck might have recovered, and either delete it or move it back into its
usual place in the tree. Again, because the entire portage tree is
recoverable from the net with an emerge sync anyway, it should be entirely
safe to ignore a lost+found and have it deleted in a sync. That's less
hassle than manually figuring out what any files there might be and where
they go in the tree, when even if you did, an emerge sync might simply
delete the file anyway, as outdated.

> A word of caution for anyone considering adopting Duncan's scheme
> without much thought: what he says is certainly good sense, but don't go
> copying him if you're just splitting out bits of the file system that
> can be made common to different running systems. I spent half of last
> night exploring some of the snags! My aim was to separate some large
> slabs of files into their own partitions and mount those partitions on
> whichever system I was booting. I have four Linux systems multibooting
> on this box, and it seems tidy to find common areas and treat them as
> such.
>
> Don't combine systems' /var/log directories - you will end up with
> deeply troubled emerge.log and PORT_LOGDIR records.

"Deeply troubled" is an apt description. <g> In any case, combining
/var/log dirs makes no sense, because what's the log info there for if not
to be examined when needed for troubleshooting or record keeping
purposes? Throwing the logs from multiple independent boot systems into
the same location, with no way to tell what belongs to what system, can
only confuse things, and defeats the entire point of having the logs in
the first place. (Note that this is different than the convenience of
running a central syslog with multiple machines logging to it, because in
that case, the log will have machine identification labels to sort out
which logged events correspond to which system. Just using the same
partition for everything jumbles everything up in a big mess, defeating
the purpose of logging in the first place!)

> Don't combine systems' /usr/src directories. It won't do much harm, but
> the records of which kernel version is installed in each system will
> cause overwriting anyway, thus spoiling the idea, especially if you have
> USE=symlink for gentoo-sources.

I'd rather say "Know what you're doing if you combine /usr/src dirs." It
can be done if the appropriate organization is maintained, but as you
point out, automating the /usr/src/linux symlink with USE=symlink for the
kernel-sources packages is NOT a good idea if you are running a combined
/usr/src. Perhaps a more intelligent solution could be borrowed from
Mandrake (and I suppose Mandriva continues the idea, but don't know),
where an initscript set up a number of symlinks, and this one could be
included. The idea would be to set the symlink to point to the sources
corresponding to the kernel booted, where that is possible, leaving it
alone if no sources are found that correspond to the booting kernel.
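
The symlink idea above might look something like this minimal Python
sketch, which an initscript could call at boot. It is not the Mandrake
script, just an illustration under assumptions: the function name
update_kernel_symlink is hypothetical, and it assumes the sources live in
a directory named linux-<release> under /usr/src, where <release> is what
uname -r reports for the booted kernel.

```python
import os


def update_kernel_symlink(src_dir="/usr/src", release=None, link_name="linux"):
    """Point src_dir/linux at the sources matching the booted kernel.

    Returns True if the symlink was (re)set, False if it was left alone.
    Assumes sources are named "linux-<release>" (an assumption, see above).
    """
    release = release or os.uname().release
    target = "linux-" + release
    link = os.path.join(src_dir, link_name)

    # No sources for this kernel: leave whatever is there untouched.
    if not os.path.isdir(os.path.join(src_dir, target)):
        return False
    # Never replace a real directory or file, only a symlink.
    if os.path.lexists(link) and not os.path.islink(link):
        return False

    # Create the new link under a temporary name, then rename it over the
    # old one so there is never a moment with no symlink at all.
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)  # relative link, so it survives remounting
    os.replace(tmp, link)
    return True
```

Run as root early in boot, update_kernel_symlink() would repoint
/usr/src/linux for whichever system was just booted, and do nothing when
that system's sources aren't on the shared partition.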

So... /usr/src can be multi-boot combined, but it's not as trivial as one
might expect if one doesn't consider the consequences, so don't do it
without some thought, first.

Actually, that "don't do it without some thought, first" can be applied to
a /lot/ of things! =8^)

> I'm not yet sure of the wisdom of combining /var/tmp from different
> systems: I haven't yet sorted out the consequences for the portage work
> directories. Watch this space.

By definition, /tmp and /var/tmp should be multi-boot combinable, and
combinable between the two, as well (my /var/tmp is simply a symlink to
/tmp, altho on a multi-human-user system, there are security issues one
should consider before doing it -- yet another place to "don't do it
without some thought, first" <g>), because the data is by definition
"temporary", which in this case is defined as "not needing to survive a
reboot".

That "tmp" is defined as "not needing to survive a reboot" is, BTW, the
official FHS (Filesystem Hierarchy Standard, referenced by the LSB, aka
Linux Standard Base) position for /tmp as well, AFAIK, tho the FHS does
expect /var/tmp to be preserved across reboots. The idea is that the
practice of certain distributions, deleting everything in /tmp at boot,
is specifically allowed and shouldn't cause any malfunctions.

In any case, there's no damage to portage by combining those dirs, or
removing the contents at boot, either. If you have something set up
locally that saves data across reboots to either /var/tmp or /tmp,
consider changing it, as that's not what those dirs are for, and relying
on them for that could break at some point.

> Apologies if I'm not making much sense today - blame the loss of sleep
> and the head cold that probably caused it :-(

Actually, great sense! Thanks for bringing up the possibility of
multi-boot partition combines! It certainly adds to the information
available in the discussion!

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

-- 
gentoo-amd64@g.o mailing list