Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Seagate ST8000NM0065 PMR or SMR plus NAS SAS SATA question
Date: Fri, 22 May 2020 17:20:33
Message-Id: CAGfcS_=gTSCBwv-+mjcvhzB6m0gSmdzcPCFdrWbYy=6L-8M7Ng@mail.gmail.com
In Reply to: Re: [gentoo-user] Seagate ST8000NM0065 PMR or SMR plus NAS SAS SATA question by antlists
1 On Fri, May 22, 2020 at 12:47 PM antlists <antlists@××××××××××××.uk> wrote:
2 >
3 > What puzzles me (or rather, it doesn't, it's just cost cutting), is why
4 > you need a *dedicated* cache zone anyway.
5 >
6 > Stick a left-shift register between the LBA track and the hard drive,
7 > and by switching this on you write to tracks 2,4,6,8,10... and it's a
8 > CMR zone. Switch the register off and it's an SMR zone writing to all
9 > tracks.
10
11 Disclaimer: I'm not a filesystem/DB design expert.
12
13 Well, I'm sure the zones aren't just 2 tracks wide, but that is worked
14 around easily enough. I don't see what this gets you though. If
15 you're doing sequential writes you can do them anywhere as long as
16 you're doing them sequentially within any particular SMR zone. If
17 you're overwriting data then it doesn't matter how you've mapped them
18 with a static mapping like this, you're still going to end up with
19 writes landing in the middle of an SMR zone.
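
A toy model of the point above, as a sketch only. It assumes each shingled track overlaps the next one, so that rewriting a track in place forces a rewrite of every later track in the same zone; the function names and the 100-track zone are made up for illustration and do not come from any drive's firmware.

```python
def physical_track(logical_track: int, cmr_mode: bool) -> int:
    """Static mapping from the quoted proposal: a left shift skips every
    other track in CMR mode, and is the identity in SMR mode."""
    return logical_track << 1 if cmr_mode else logical_track


def overwrite_cost(zone_tracks: int, track: int, cmr_mode: bool) -> int:
    """Tracks that must be written to update one track in place.

    Assumption: in a shingled zone every track from the target to the end
    of the zone has to be rewritten; with CMR spacing only the target is.
    """
    if not 0 <= track < zone_tracks:
        raise ValueError("track outside zone")
    return 1 if cmr_mode else zone_tracks - track


# An overwrite in the middle of a hypothetical 100-track SMR zone still
# drags the rest of the zone along, whatever the static mapping was.
cost_smr = overwrite_cost(100, physical_track(40, cmr_mode=False), cmr_mode=False)
cost_cmr = overwrite_cost(100, physical_track(40, cmr_mode=True), cmr_mode=True)
```

The mapping only decides where a logical track lives; it does nothing about the cost of changing it later.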
20
21 > The other thing is, why can't you just stream writes to a SMR zone,
22 > especially if we try and localise writes so lets say all LBAs in Gig 1
23 > go to the same zone ... okay - if we run out of zones to re-shingle to,
24 > then the drive is going to grind to a halt, but it will be much less
25 > likely to crash into that barrier in the first place.
26
27 I'm not 100% following you, but if you're suggesting remapping all
28 blocks so that all writes are always sequential, like some kind of
29 log-based filesystem, your biggest problem here is going to be
30 metadata. Blocks logically are only 512 bytes, so there are a LOT of
31 them. You can't just freely remap them all because the mapping table
32 would be far too big for the drive to manage.
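
To put a rough number on the metadata concern, here is back-of-the-envelope arithmetic. The 8 TB capacity is taken from the drive in the subject line; the 8-byte map entry and the 256 MiB zone size are assumptions picked for illustration, not figures from any datasheet.

```python
SECTOR_BYTES = 512
DRIVE_BYTES = 8 * 10**12        # 8 TB, decimal, as drives are marketed
ENTRY_BYTES = 8                 # assumed size of one remap entry
ZONE_BYTES = 256 * 2**20        # assumed zone size of 256 MiB


def map_size_bytes(drive_bytes: int, unit_bytes: int,
                   entry_bytes: int = ENTRY_BYTES) -> int:
    """Size of a table holding one entry per remappable unit."""
    return (drive_bytes // unit_bytes) * entry_bytes


per_block = map_size_bytes(DRIVE_BYTES, SECTOR_BYTES)   # 125_000_000_000
per_zone = map_size_bytes(DRIVE_BYTES, ZONE_BYTES)      # 238_416

print(f"per-block map: {per_block / 10**9:.0f} GB")
print(f"per-zone map:  {per_zone / 10**3:.0f} kB")
```

Under these assumptions a per-block map is about 125 GB. That is a small fraction of the drive's capacity, but it is far more than a drive could plausibly keep in its own RAM, while a per-zone map is a few hundred kB.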
33
34 I'm sure they are doing something like that within the cache area,
35 which is fine for short bursts of writes, but at some point you need
36 to restructure that data so that blocks are contiguous or otherwise
37 following some kind of pattern so that you don't have to literally
38 remap every single block. Now, they could still reside in different
39 locations, so maybe some sequential group of blocks is remapped, but
40 if you have a write to one block in the middle of a group you still
41 need to read/rewrite all those blocks somewhere. Maybe you could use a
42 COW-like mechanism like zfs to reduce this somewhat, but you still
43 need to manage blocks in larger groups so that you don't have a ton of
44 metadata.
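
A minimal sketch of that trade-off, assuming the drive remaps whole groups of blocks rather than single blocks. The class name and the 2048-block group size are hypothetical, chosen only to show the shape of the problem.

```python
class GroupRemapper:
    """Toy remapper: one map entry per group, whole-group rewrites."""

    def __init__(self, blocks_per_group: int):
        self.blocks_per_group = blocks_per_group
        self.group_location = {}   # logical group -> physical slot
        self.next_free_slot = 0
        self.blocks_written = 0    # physical blocks written so far

    def write_block(self, lba: int) -> int:
        """Update one logical block; returns the group that was rewritten."""
        group = lba // self.blocks_per_group
        # The whole group is read, patched, and written to a fresh slot,
        # so only one map entry changes but many blocks get written.
        self.group_location[group] = self.next_free_slot
        self.next_free_slot += 1
        self.blocks_written += self.blocks_per_group
        return group


remap = GroupRemapper(blocks_per_group=2048)
remap.write_block(5000)   # one 512-byte update...
remap.write_block(5001)   # ...and a neighbour in the same group
```

Two single-block updates cost 4096 block writes here but need only one map entry. Smaller groups reduce the write amplification and grow the map, which is the tension described above.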
45
46 With host-managed SMR this is much less of a problem because the host
47 can use extents/etc to reduce the metadata, because the host already
48 needs to map all this stuff into larger structures like
49 files/records/etc. The host is already trying to avoid having to
50 track individual blocks, so it is counterproductive to re-introduce
51 that problem at the block layer.
52
53 Really the simplest host-managed SMR solution is something like f2fs
54 or some other log-based filesystem that ensures all writes to the disk
55 are sequential. The downside is that flash-oriented filesystems can
56 afford to disregard fragmentation, but you can't disregard it on an
57 SMR drive because random I/O performance on a spinning disk is terrible.
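
A toy append-only log shows where the fragmentation comes from. This is a sketch of the general log-structured idea, not of how f2fs actually lays out data; the class and the seek-counting helper are made up for illustration.

```python
class AppendLog:
    """Every write goes to the end of the log; a map tracks the latest copy."""

    def __init__(self):
        self.log = []      # physical order: which LBA sits at each position
        self.where = {}    # LBA -> physical position of its newest copy

    def write(self, lba: int) -> None:
        self.where[lba] = len(self.log)
        self.log.append(lba)

    def layout(self, lbas):
        """Physical positions visited when reading the given LBAs in order."""
        return [self.where[lba] for lba in lbas]


def seeks(positions) -> int:
    """Count the jumps between physically non-adjacent positions."""
    return sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)


log = AppendLog()
for lba in range(8):
    log.write(lba)                 # initial sequential write of LBAs 0..7
seeks_before = seeks(log.layout(range(8)))

for lba in (2, 5, 3):
    log.write(lba)                 # overwrites land at the end of the log
seeks_after = seeks(log.layout(range(8)))
```

All writes stay sequential, which is what the SMR drive wants, but a logically sequential read of the same eight blocks goes from zero seeks to five after only three overwrites.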
58
59 > Even better, if we have two independent heads, we could presumably
60 > stream updates using one head, and re-shingle with the other. But that's
61 > more cost ...
62
63 Well, sure, or if you're doing things host-managed then you stick the
64 journal on an SSD and then do the writes to the SMR drive
65 opportunistically. You're basically describing a system where you
66 have independent drives for the journal and the data areas. Adding an
67 extra head on a disk (or just having two disks) greatly improves
68 performance, especially if you're alternating between two regions
69 constantly.
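
The journal-on-SSD idea can be sketched as below. Everything here is hypothetical: a dict stands in for the SSD journal, a counter stands in for the SMR drive, and the zone size of 1000 blocks is arbitrary.

```python
from collections import defaultdict


class JournaledSMR:
    """Absorb random writes in a journal, flush them to the SMR drive by zone."""

    def __init__(self, blocks_per_zone: int):
        self.blocks_per_zone = blocks_per_zone
        self.journal = {}        # LBA -> data; stands in for the SSD
        self.zone_rewrites = 0   # sequential zone rewrites issued so far

    def write(self, lba: int, data: bytes) -> None:
        # Random writes are cheap here; a repeated LBA just replaces
        # the older journal entry.
        self.journal[lba] = data

    def flush(self):
        """Opportunistic flush: group pending blocks by zone, rewrite each
        affected zone once, and return the zones touched in order."""
        by_zone = defaultdict(dict)
        for lba, data in self.journal.items():
            by_zone[lba // self.blocks_per_zone][lba] = data
        self.zone_rewrites += len(by_zone)
        self.journal.clear()
        return sorted(by_zone)


disk = JournaledSMR(blocks_per_zone=1000)
for lba in (5, 7, 5, 1500, 999):
    disk.write(lba, b"x")
touched = disk.flush()
```

Five random writes collapse into two zone rewrites (zones 0 and 1), done whenever the SMR drive is otherwise idle.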
70
71 --
72 Rich
