
From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: Hard drive (installation)
Date: Sat, 31 Aug 2013 10:53:46
Message-Id: pan$4d2f0$1c474ee6$f7fd27ae$a30f4ab1@cox.net
Rich Freeman posted on Fri, 30 Aug 2013 21:50:47 -0400 as excerpted:

> Anybody with a motherboard supporting USB2 almost certainly had a
> motherboard supporting PATA at a faster transfer rate.

[Long-winded and thread-tangential. TL;DR readers just skip it, but I
know some people find these things interesting/useful. There's some SSD
deployment discussion further down that I certainly could have used a few
months ago!]

Well, yes, but...

With my 8-year-old native-USB1 system that died last year (it had a USB2
add-on card, but USB3 was out as it had no PCIe slots, only PCI-X; a
dual-socket original 3-digit Opteron board maxed out at dual Opteron
290s, which are dual-core @ 2.8 GHz, so effectively quad-core at 2.8,
not /too/ shabby, just getting seriously dated in terms of buses, etc),
I set up an unbootable external USB spinning-rust drive as a backup,
with a USB thumbdrive on the bootable native USB1 holding /boot to load
the kernel, which could then handle the PCI-card USB2 that the BIOS
couldn't, and thus get to the backup root on the external USB2 spinning
rust.

That's how I know that not all external USB devices are bootable, as that
external drive wasn't, even when I had it on the BIOS-supported native
USB1 -- thus the earlier warning about that, and how to work around it.

But the point is, that setup wasn't /significantly/ slower in normal
operation than my fancy multi-disk mdraid, and was, in fact, NOTICEABLY
faster than my original mdraid-6 setup (thus the discussion on it a few
months ago), tho mdraid-1 was a bit faster after I switched to it,
particularly for multi-threaded read as typically happens during boot.

Now for transferring hundreds of megabytes (as when actually making the
backups), yeah, doing that over the USB2 was a bit slow -- roughly
comparable to backing up to a different partition on the same spindle
internally, with all the disk seeks that implies, except that the
external was a separate physical device, so without the USB2 bottleneck
it should have been faster. However, for ordinary use, the 6 gigs of RAM
(at one point 8, two 2-gig sticks per socket, but one stick died and I
never replaced it) was plenty to cover both my normal memory usage and
working-set disk cache with reasonable margin to spare, as I normally
run only 2-3 gigs apps+buffers+cache (top line of free), except when I'm
doing updates or large media files and thus have all that cached too.

But even cold-booting off the USB2 external was enough faster than the
mdraid-6 to demonstrate how bad mdraid-6 can be, and to convince me to
switch to mdraid-1, which was indeed faster.

> I do agree that random access speed does lower the effective rate. My
> hard drives are running at 3GB/s transfer rates each on a dedicated
> channel, and yet they're probably not any faster than they would have
> been under PATA (assuming one drive per cable).
>
> Hopefully one of these days there will be a decent SSD cache option for
> Linux. Bcache is still fairly experimental, and I'm not sure how well
> it performs in practice with btrfs - plus it is a device layer and not
> filesystem layer implementation (ie if you have mirrored drives you end
> up with mirrored cache which seems a bit dumb, especially if the mirrors
> end up being on separate partitions on the same device).

Here, I ended up putting root and home (and log and the gentoo tree and
overlays and...) on btrfs on SSD, with, effectively, only my media (and
secondary backups) remaining on spinning rust, as I couldn't justify the
cost to put it on SSD.

A 64-gig SSD is now well below the best-price-point knee, at least
around here, and I calculated that would be the minimum for a working
copy and primary backup of the OS and /home. However, once I looked at
prices (as of a few months ago), I found 128-gig SSDs were actually the
low end of the price-point knee, with 256-gig the high end, and decided
the over-provisioning would be a good idea due to the limited
write-cycle issue on multi-level-cell SSDs, so I actually planned on
that.

But when I went into Fry's Electronics to buy, it turned out all their
good deals on 128-gig SSDs were selling out faster than they could get
them in stock, and they happened to be out of the good deals there. I
use public transit and didn't feel like going home empty-handed only to
come back the day the truck came in, so I ended up with 256-gig SSDs,
still at the price-point knee tho at the upper end of it, and was able
to put even MORE on the SSDs -- while still keeping a very healthy
nearly 100% over-provisioning, tho I expect that might dwindle a bit
over time. (With 16 GiB RAM I decided I didn't need swap.)

I did spend a bit more than planned, buying three of them: two for the
main machine, now mostly configured in btrfs raid1 mode in order to
actually make use of btrfs' data integrity checksumming and scrub
abilities, with the third to eventually be used in my netbook (which,
being SATA2, will bottleneck it, but the slower ones weren't
significantly cheaper, and I expect it'll outlast my netbook, to be used
either in my main machine or a netbook replacement later, at which point
I'll probably actually use the full speed, as I do with the other two
now). But I've been working extra hours and will have it paid off
~three months from purchase so not much interest, and I'm happy with it.
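
(For the curious, the raid1 setup itself is nothing exotic -- something
like the below; device names and the label here are illustrative, not
copied from my actual setup:)

# Illustrative: two-device btrfs with raid1 data and metadata.
mkfs.btrfs -L rt0238gcnx+35l0 -d raid1 -m raid1 /dev/sda5 /dev/sdb5
# An occasional scrub then verifies everything against the checksums;
# -B waits in the foreground and prints stats when it finishes:
btrfs scrub start -B /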

The point being... if you do your partitioning right, you can get 90% of
the benefits of full SSD at *FAR* more reasonable cost than going full
SSD. Basically SSD caching, only without the cache: simply put the stuff
that will truly benefit from SSD on SSD, while leaving the stuff that
won't on cheaper spinning rust. I calculated that I only NEEDED about 64
GB for the most important stuff, with backups on spinning rust, and that
would have left a healthy over-provisioning too. (I figured I could fit
what I really wanted on SSD in 32 gig if I had to, but it would have
been a pretty tight fit!)

Here's my gdisk -l output for one of the SSD pair (I've been running GUID
Partition Tables, GPT, for some time now, some boilerplate omitted for
posting):

Disk /dev/sda: 500118192 sectors, 238.5 GiB
Partition table holds up to 128 entries
Partitions will be aligned on 2048-sector boundaries
Total free space is 246364781 sectors (117.5 GiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048            8191   3.0 MiB     EF02  bi0238gcn1+35l0
   2            8192          262143   124.0 MiB   EF00  ef0238gcn1+35l0
   3          262144          786431   256.0 MiB   8300  bt0238gcn1+35l0
   4          786432         2097151   640.0 MiB   8300  lg0238gcn1+35l0
   5         2097152        18874367   8.0 GiB     8300  rt0238gcn1+35l0
   6        18874368        60817407   20.0 GiB    8300  hm0238gcn1+35l0
   7        60817408       111149055   24.0 GiB    8300  pk0238gcn1+35l0
   8       111149056       127926271   8.0 GiB     8300  nr0238gcn1+35l0
   9       127926272       144703487   8.0 GiB     8300  rt0238gcn1+35l1
  10       144703488       186646527   20.0 GiB    8300  hm0238gcn1+35l1
  11       186646528       236978175   24.0 GiB    8300  pk0238gcn1+35l1
  12       236978176       253755391   8.0 GiB     8300  nr0238gcn1+35l1

I wanted my partitions 4 MiB aligned (8192 sectors, at 512-byte sectors)
for efficient erase-block handling, tho the first one's only 1 MiB
aligned.
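
If you want to check that rather than take my word, the arithmetic is
trivial; a throwaway sketch, with the start sectors hand-copied from the
table above:

#!/usr/bin/env python3
# Throwaway check: with 512-byte sectors, 4 MiB = 8192 sectors, so a
# partition is 4-MiB aligned when its start sector divides by 8192.
starts = [2048, 8192, 262144, 786432, 2097152, 18874368, 60817408,
          111149056, 127926272, 144703488, 186646528, 236978176]
for num, sec in enumerate(starts, 1):
    aligned = "4 MiB" if sec % 8192 == 0 else "only 1 MiB"
    print(f"partition {num:2}: start {sec * 512 // 2**20:6} MiB, {aligned} aligned")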

One of the features of GPT partitioning is that it allows partition
names/labels much like filesystems normally do. That's what's shown in
the last column. I have a standard naming scheme, developed back when I
was running multiple mdraid devices, some of which were themselves
partitioned, on a 4-spindle set, that I use for both my partition labels
and my filesystem labels, for both my main machine and my netbook:

* 15 characters long
123456789012345
ff     bbB ymd
  ssssS   t   n

Using a different example from my mdraid days:

rt0005gmd3+9bc0

ff: 2-char function abbreviation

bi=bios (gpt dedicated legacy-BIOS boot partition, used by grub2).
ef=efi (gpt dedicated efi partition, unused here ATM but reserved for
forward compatibility).
bt=boot, lg=log, rt=root, hm=home, pk=package (gentoo tree, layman
overlays, sources, binpkgs, kernel tree), nr=netbook-root (separate
32-bit chroot build-image filesystem/partition).

Example: rt=root

Device-ID consisting of ssssSbbB:

ssssS: 4-digit size, 1-char multiplier. This is the size of the
underlying/containing media, NOT the partition/filesystem (I'm IDing the
containing device).

bbB: 2-char media brand ID, 1-digit sequence number.

The md example is a 5 GiB mdraid volume (/dev/md3), which might itself be
partitioned (IIRC I was keeping /usr/local/ on a separate filesystem/
partition on the same mdraid as root, back then).

In the above output: 0238 GiB, Corsair Neutron. The paired SSDs are 0
and 1 (0 is installed as sdb, 1 as sda, as I reversed them in
installation). So the device-IDs are 0238gcn1 and 0238gcn0, with the
common btrfs on top of both device-IDed as 0238gcnx.

t: single-char target/separator. This serves as both a target ID and a
visual separator.

I use . for my netbook (dot 'cause it's small), + for the main machine,
and % for portable disk partitions intended to be used on both.

Both the output and the md example are for my main machine, so +.

ymd: 1-char-each year/month/day.

y=last digit of year (I might use storage devices for years but it's
unlikely I'll need to track decade wrap).
m=month (1-9abc, so a=October, b=November, c=December)
d=day (1-9a-v, with a-v covering days 10-31)

This is generally the day I set up the partition. Back when I was still
on MBR I used to relabel the filesystem on my backup partitions with the
new date when I did a mkfs and a clean backup, but when I switched to GPT
and could label the partitions too, I decided to keep them date-matched,
as my fstab uses LABEL= mounting and it was a hassle updating the fstab
when I blew away an old backup and redid it.
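
A root line in that style might look like this (illustrative; options
simplified):

LABEL=rt0238gcnx+35l0  /  btrfs  noatime  0 0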

The md example is 2009-11-12. The table is using 2013-05-21.

n: 1-digit copy number. The working copy is 0, primary backup 1...

The md example is the working copy. The table has the working copy and
for some partitions the primary backup.


mdraid example all together: rt0005gmd3+9bc0
  rt        root
  0005gmd3  5-gig /dev/md3
  +         for use on the main machine
  9bc       2009-11-12 (Nov 12)
  0         working copy
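
The scheme is mechanical enough to decode programmatically, if you're so
inclined (an illustrative sketch, not a tool I actually use):

#!/usr/bin/env python3
# Illustrative decoder for the 15-char label scheme described above.
def decode(label):
    assert len(label) == 15
    b36 = lambda c: int(c, 36)      # '1'-'9', then 'a'=10 ... 'v'=31
    return {
        "function": label[0:2],     # ff: rt=root, hm=home, pk=package...
        "size":     label[2:7],     # ssssS: 4 digits + multiplier char
        "brand":    label[7:10],    # bbB: brand ID + sequence number
        "target":   label[10],      # t: '.' netbook, '+' main, '%' both
        "year":     label[11],      # y: last digit of the year only
        "month":    b36(label[12]), # m: 1-9abc -> 1-12
        "day":      b36(label[13]), # d: 1-9a-v -> 1-31
        "copy":     int(label[14]), # n: 0=working, 1=primary backup...
    }

print(decode("rt0005gmd3+9bc0"))  # the mdraid example: 2009-11-12
print(decode("rt0238gcn1+35l0"))  # from the gdisk table: 2013-05-21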


So from the above you can see I have a 3 MiB BIOS partition (used by
grub2, one per drive; with the 1 MiB in front of it, its size puts all
later partitions on 4 MiB boundaries), and a 124 MiB EFI partition
(reserved for future use; 1 + 3 + 124 = 128 MiB, so later partitions now
sit on 128 MiB boundaries).

A 256 MiB /boot comes next. (This is a separate filesystem on each
drive, the second one a backup: the working copy gets updated every time
I install a new kernel, which I do a lot as I run git kernels; the
backup gets updated once per cycle, with the 3.x.0 release. I can point
the BIOS at either one, or at the spinning-rust /boot, my secondary
/boot backup.)

That's followed by /var/log at 640 MiB, leaving me at 1 GiB boundaries.
This partition and the later ones are btrfs raid1 mode, but I only keep
the working copy of log, not the working and backup copies I keep for
everything else.

Then come the root (8 GiB), home (20 GiB), package (24 GiB), and netbook-
root-build-image (8 GiB) partitions, working copies followed by primary
backups. Secondary backups and media remain, as I said, on spinning
rust. That's the bulk of my data but it's also the least speed-critical.

As you can see with a bit of math (I can cheat here and just look it up
in gdisk), that's 121 GiB used, 117.5 GiB free -- just worse than 50/50,
so near 100% over-provisioning (117.5 free / 121 used = ~97%). I
shouldn't have to worry too much about write-cycle wear-out with that,
even if I add a few gigs of partitions later. (Recommended SSD
over-provisioning is 25-33%, thus actually using 3/4 to 4/5 of the
device. I'm barely over half! =:^)
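
Or the non-cheating way, summing the table (another throwaway sketch):

#!/usr/bin/env python3
# Sanity-check the used/free split from the gdisk table above.
sizes_mib = [3, 124, 256, 640] + [g * 1024 for g in (8, 20, 24, 8) * 2]
used_gib = sum(sizes_mib) / 1024
print(f"used: {used_gib:.1f} GiB")                       # ~121.0
print(f"free: {238.5 - used_gib:.1f} GiB")               # ~117.5
print(f"free/used: {(238.5 - used_gib) / used_gib:.0%}") # ~97%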

With a bit more squeezing I could have fit home and both main root and
netbook root on a 32-gig with no over-provisioning, and a 64-gig would
have fit one copy of everything listed, with the primary backups on
spinning rust; that, or an 80-gig to allow a bit of over-provisioning,
is what I was originally targeting. Then I looked at the prices and saw
128-gig drives at not /that/ much over 80-gig ones and at a cheaper
per-gig price, so that's what I actually thought I'd buy. But I'm not
complaining about the 256-gig (238 GiB -- unfortunately they're carrying
over the marketing practices from spinning rust, I AM complaining about
that!), and I'm *DEFINITELY* not complaining about boot or emerge --sync
speed on the SSDs! =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
