Gentoo Archives: gentoo-user

From: lee <lee@××××××××.de>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Re: {OT} Allow work from home?
Date: Sat, 05 Mar 2016 00:24:16
Message-Id: 878u1xn6mu.fsf@heimdali.yagibdah.de
In Reply to: [gentoo-user] Re: {OT} Allow work from home? by Kai Krakow
Kai Krakow <hurikhan77@×××××.com> writes:

> On Sat, 20 Feb 2016 11:24:56 +0100,
> lee <lee@××××××××.de> wrote:
>
>> > It uses some very clever ideas to place files into groups and into
>> > proper order - unlike other defrag tools, which use file mod and
>> > access times (and thereby make the problem even worse, because
>> > that destroys the locality of data even further).
>>
>> I've never heard of MyDefrag; I might try it out. Does it make
>> updating any faster?
>
> Ah well, difficult question... Short answer: it uses countermeasures
> to keep performance from degrading too fast after updates. It does
> this by using a "gapped" on-disk file layout - leaving some gaps for
> Windows to put temporary files in. This way, files don't get spread
> as far apart during updates as they usually would. But yes, it
> improves installation time.

What difference would that make with an SSD?

> Apparently it has been unmaintained for a few years, but it still
> does a good job. It was built upon a student's theory about how to
> properly reorganize the file layout on a spinning disk so that it
> stays as close to peak performance as possible.

For spinning disks, I can see how it can be beneficial.

>> > But even SSDs can benefit from _proper_ defragmentation from time
>> > to time, for increased lifetime and performance (this is due to
>> > how the FTL works and because erase blocks are huge; I won't go
>> > into detail unless someone asks). This is why MyDefrag also
>> > supports flash optimization. It works by moving as few files as
>> > possible while coalescing free space into big chunks, which in
>> > turn relaxes pressure on the FTL and allows for more free,
>> > contiguous erase blocks, which reduces early flash chip wear. A
>> > filled SSD with a long usage history can certainly gain back some
>> > performance from this.
>>
>> How does it improve performance? It seems to me that, for practical
>> use, almost all of the better performance of SSDs is due to reduced
>> latency. And IIUC, it doesn't matter for latency where data is
>> stored on an SSD. If its performance degrades over time as data is
>> written to it, the SSD sucks, and the manufacturer should have done
>> a better job. Why else would I buy an SSD? If the data stored on it
>> needs to be reorganised, the firmware should do that.
>
> There are different factors which have an impact on performance, not
> just seek times (which, as you write, are the worst performance
> killer):
>
> * management overhead: the OS has to do more housekeeping, which
>   (a) introduces more IOPS (the only relevant limiting factor for an
>   SSD) and (b) means more CPU cycles and more data structure locking
>   within the OS routines while performing IO, which comes down to
>   more CPU cycles spent during IO

How would that be reduced by defragmenting an SSD?
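
At least the IOPS point is easy to picture. A toy Python sketch (the
extent layouts below are made up for illustration, not taken from any
real filesystem): every extent of a file costs at least one read
request, even on flash where no seeking is involved.

# Toy model (illustrative only): each extent of a file costs at least
# one I/O request, so fragmentation multiplies IOPS even without seeks.
FILE_SIZE = 8 * 1024 * 1024   # an 8 MiB file
FS_BLOCK = 4096

# (offset, length) extent lists -- hypothetical layouts:
contiguous = [(0, FILE_SIZE)]
fragmented = [(i * 3 * FS_BLOCK, FS_BLOCK)
              for i in range(FILE_SIZE // FS_BLOCK)]

print(len(contiguous), "read request(s) for the contiguous file")
print(len(fragmented), "read requests for the fragmented file")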

> * erasing a block is where SSDs really suck, performance-wise; plus,
>   blocks are essentially read-only once written - that's how flash
>   works: a flash data block needs to be erased prior to being
>   rewritten - and that is (compared to the rest of an SSD's
>   performance) a really REALLY HUGE time factor

So let the SSD do it when it's idle. For applications in which it isn't
idle enough, an SSD won't be the best solution.
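
For a sense of scale, a back-of-the-envelope sketch. The timings are
ballpark NAND figures assumed purely for illustration, not taken from
any particular datasheet:

# Assumed, illustrative NAND timings (microseconds):
READ_PAGE_US = 50      # read one page
PROG_PAGE_US = 500     # program (write) one page
ERASE_BLOCK_US = 2000  # erase one whole erase block

# A write that hits a pre-erased block vs. one that forces an
# in-line read/modify/erase/write cycle:
fast_path = PROG_PAGE_US
slow_path = READ_PAGE_US + ERASE_BLOCK_US + PROG_PAGE_US

print("fast path: %d us" % fast_path)
print("slow path: %d us (%.0fx slower)" % (slow_path, slow_path / fast_path))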

> * erase blocks are huge compared to common filesystem block sizes
>   (an erase block is 1 or 2 MB vs. filesystem blocks of usually
>   4-64k), which happens to result in this effect:
>
>   - the OS replaces a file by writing a new one and deleting the old
>     (common during updates), or the user deletes files
>   - the OS marks some blocks as free in its FS structures; it
>     depends on the file's size and fragmentation whether this gives
>     you a contiguous area of free blocks or many small blocks
>     scattered across the disk: the result is free space fragmentation
>   - free space fragments tend to become small over time, much
>     smaller than the erase block size
>   - if your system has TRIM/discard support, it will tell the SSD
>     firmware: here, I no longer use those 4k blocks
>   - as you already figured out: those small blocks marked as free do
>     not properly align with the erase block size - so you may
>     actually end up with a lot of free space but essentially no
>     complete erase block marked as free

Use smaller erase blocks.
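
The alignment mismatch is easy to demonstrate. A minimal sketch, with
all sizes made up (1 MiB erase blocks, 4k filesystem blocks): the same
amount of free space yields very different numbers of reclaimable
erase blocks depending on how it is scattered.

# Count erase blocks in which *every* filesystem block is free.
ERASE_BLOCK = 1024 * 1024
FS_BLOCK = 4096
DISK = 64 * 1024 * 1024
PER_ERASE = ERASE_BLOCK // FS_BLOCK  # 256 fs blocks per erase block
TOTAL = DISK // FS_BLOCK             # fs blocks on the whole disk

def fully_free_erase_blocks(free_blocks):
    counts = {}
    for b in free_blocks:
        counts[b // PER_ERASE] = counts.get(b // PER_ERASE, 0) + 1
    return sum(1 for n in counts.values() if n == PER_ERASE)

contiguous = range(TOTAL // 4)  # 25% free, in one contiguous run
scattered = range(0, TOTAL, 4)  # 25% free, every 4th block

print(fully_free_erase_blocks(contiguous), "reclaimable (contiguous)")
print(fully_free_erase_blocks(scattered), "reclaimable (scattered)")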

>   - this situation means the SSD firmware cannot reclaim this free
>     space to do "free block erasure" in advance, so if you write
>     another small block of data, you may end up with the SSD going
>     into a direct "read/modify/erase/write" cycle instead of just
>     "read/modify/write" and deferring the erase until later - ah
>     yes, that's probably when it becomes slow
>   - what do we learn: (a) defragment free space from time to time,
>     (b) enable TRIM/discard to reclaim blocks in advance, (c) you
>     may want to over-provision your SSD: just don't ever use 10-15%
>     of it, trim that space, and leave it for the firmware to shuffle
>     erase blocks around in

Use better firmware for SSDs.
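
A toy model of that write path (nothing like real firmware, just the
logic of the paragraph above): as long as TRIM and over-provisioning
keep a pool of pre-erased blocks available, writes take the fast path;
once the pool runs dry, every write pays for an in-line erase.

# Toy FTL write path (pure sketch, not real firmware behaviour).
class ToyFTL:
    def __init__(self, erased_pool):
        # Pool of pre-erased blocks, refilled by TRIM and idle-time
        # garbage collection; over-provisioning keeps it from running
        # dry under sustained writes.
        self.erased_pool = erased_pool

    def write(self):
        if self.erased_pool > 0:
            self.erased_pool -= 1
            return "read/modify/write (fast)"
        return "read/modify/erase/write (slow)"

ftl = ToyFTL(erased_pool=2)
for i in range(4):
    print("write %d: %s" % (i, ftl.write()))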

>   - the latter point also increases lifetime, for obvious reasons,
>     as SSDs only support a limited number of write cycles per block
>   - this "shuffling around" of blocks is called wear-levelling: the
>     firmware chooses as candidate the block with the fewest write
>     cycles for doing its "read/modify/write"
>
> So, SSDs actually do this "reorganization", as you call it - but
> they are limited to working within the bounds of erase blocks, and
> the firmware knows nothing about the on-disk format and its smaller
> blocks, so it cannot go down to a finer-grained reorganization.

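The candidate selection in that wear-levelling step is simple at
heart; a sketch of the idea (real firmware weighs more factors, of
course):

# Wear-levelling sketch: pick the erase block with the fewest
# write/erase cycles as the target for the next rewrite.
erase_cycles = {"block_a": 120, "block_b": 45, "block_c": 310}
candidate = min(erase_cycles, key=erase_cycles.get)
print("rewrite into", candidate)  # -> block_b
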
Well, I can't help it. I'm going to need to use 2 SSDs on a hardware
RAID controller in a RAID-1. I expect the SSDs to just work fine. If
they don't, then there isn't much point in spending the extra money on
them.

The system needs to boot from them. So what choices do I have to make
these SSDs happy?

> These facts are apparently unknown to most people; that's why they
> deny that an SSD could become slow or need some specialized form of
> "defragmentation". The usual recommendation, if it becomes slow, is
> to do a "secure erase" of the disk - which I consider pretty harmful,
> as it rewrites ALL blocks (reducing their write-cycle
> budget/lifetime), plus it's time-consuming and could be avoided.

That isn't an option because it would be way too much hassle.

> BTW: OS makers (and FS designers) actually optimize their systems
> for that kind of reorganization by the SSD firmware. NTFS may use
> different allocation strategies on SSDs (just a guess), and on Linux
> there is F2FS, which actually exploits this reorganization for
> increased performance and lifetime. Ext4 and Btrfs use different
> allocation strategies and prefer spreading file data instead of free
> space (which is just the opposite of what's done for HDDs). So, with
> a modern OS, you are much less prone to the effects described above.

Does F2FS come with some sort of redundancy? Reliability and booting
from these SSDs are requirements, so I can't really use btrfs because
it's troublesome to boot from, and its reliability is questionable.
Ext4 doesn't have RAID. Using ext4 on mdadm probably won't be any
better than using the hardware RAID, so there's no point in doing
that, and I'd rather spare myself the overhead.

After your explanation, I have to wonder even more than before what
the point of using SSDs is, considering that current hard- and
software doesn't use them properly. OTOH, so far they do seem to
provide better performance than hard disks even when not used with
all the special precautions I don't want to have to think about.

BTW, why would anyone use SSDs for ZFS's ZIL or L2ARC? Does ZFS treat
SSDs properly in that application?
