On Sat, Jan 23, 2016 at 7:44 PM, Andrew Savchenko <bircoph@g.o> wrote:
>
> a) EXT4 is a good, extremely robust solution. Reliability is not in
> question: on my old box with bad memory banks it kept my data
> safe for years, and almost all losses were recoverable. It also has
> some SSD-oriented features like discard support, as well as stripe
> and stride, which can be aligned to the erase block size to optimize
> erase operations and reduce wear-out.

I think EXT4 is the conservative solution. It is also more flexible
than xfs, and a decent performer all around.
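As a rough illustration of the stride/stripe tuning mentioned above (a sketch, not a recommendation): the 512 KiB erase block size and the device name are assumptions, since the real erase block size for this drive is unknown.

```python
# Sketch: deriving ext4 stride/stripe-width from an ASSUMED erase block
# size. Drives rarely document this; 512 KiB here is a guess, not a fact.
ERASE_BLOCK = 512 * 1024   # assumed erase block size, in bytes
FS_BLOCK = 4096            # ext4 default filesystem block size

# stride and stripe-width are expressed in filesystem blocks
stride = ERASE_BLOCK // FS_BLOCK

# /dev/nvme0n1p2 is a placeholder device name
print(f"mkfs.ext4 -b {FS_BLOCK} "
      f"-E stride={stride},stripe-width={stride} /dev/nvme0n1p2")
```

On a single drive stride and stripe-width come out equal; on RAID the stripe-width would span all data disks.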

> b) In some tests XFS is better than EXT4 ([1] slides 16-18; [2]).
> Though I had data loss on XFS after unclean shutdowns in the
> past. This was about 5 years ago, so XFS robustness should have
> improved, of course, but I still remember the pain :/

Maybe it improved, but xfs probably hasn't changed much at all. It
isn't really a focus of development as far as I'm aware. I probably
wouldn't use it. I used to use it, but was frustrated by its
inability to shrink and by its tendency to zero out files after a
crash.

> c) F2FS looks very interesting; it has a really good flash-oriented
> design [3]. It also seems to beat EXT4 on a PCIe SSD ([3] chapter
> 3.2.2, pages 9-10) and everything else on a compile test ([2] page
> 5), which should be close to the type of workload I'm interested in
> (though all tests in [2] have an extra raid layer). The only thing
> that bothers me is some data loss reports for F2FS found on the
> net, though everything I found dates back to 2012-2014, and F2FS
> has an fsck tool now, so it should be more reliable these days.

So, F2FS is of course very promising on flash. It should be the most
efficient solution in terms of even wear of your drive. I'd think
that lots of short-term files, as in compiling, would actually be a
good use case for it, since discarded files don't need to be rewritten
when it rolls over. But I won't argue with the benchmarks.

It will probably improve as well as it matures. However, it isn't
nearly as mature as ext4 or xfs, so from a data-integrity standpoint
you're definitely at higher risk. If you're regularly backing up and
don't care about a very low risk of problems, it is probably a good
choice.

> d) I'm not sure about BTRFS, since it is very sophisticated and I'm
> not interested in its advanced features such as snapshots,
> checksums, subvolumes and so on. In some tests [2] it tends to
> achieve better performance than ext4, but due to its sophisticated
> nature it is bound to add more code paths and latency than other
> solutions.

The only reason to use btrfs at this point is all those advanced
features, such as being able to copy a directory with 10 million small
files in it in 5 seconds. I'd think that might be useful for
development, but of course git does the same thing. Git and btrfs are
actually somewhat similar in principle. Take git with the ability to
erase blobs and mirror things, and that is kind of like btrfs.

Btrfs is also immature, and while data loss on an n-1 longterm kernel
like 3.18 is fairly rare, it does happen. It has ssd-oriented
behavior, but it generally tends to underperform the write-in-place
filesystems, mainly because it hasn't been optimized for performance,
and of course it can't write in place.

If you REALLY don't care about the data integrity features and such,
then I don't think it is the best solution for you.

> P.S. Is aligning to the erase block size really important for NVMe?
> I can't find the erase block size for this drive (Samsung MZ-VKV512)
> either in the official docs or on the net...

Unless the erase blocks are a single sector in size, I'd think
that alignment would matter. Now, for F2FS alignment probably matters
far less than for other filesystems, since the only blocks on the
entire drive that may potentially be partially erased are the ones
that border two log regions. F2FS just writes each block in a region
once, and then trims an entire contiguous region when it fills the
previous region up. Large contiguous trims with individual blocks
being written once are basically a best case for flash, which is of
course why it works that way. You should still ensure it is aligned,
but not much will happen if it isn't, I'd think.
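A toy model of that write-once-then-trim pattern (purely illustrative; real F2FS segment, section, and GC behavior is far more involved, and the sizes here are made up):

```python
# Toy model of log-structured allocation: blocks are appended
# sequentially into fixed-size regions, and an earlier region is
# discarded as one contiguous trim once a later region fills.
# REGION_BLOCKS is illustrative only, not F2FS's real geometry.
REGION_BLOCKS = 4  # blocks per region (made-up size)

log = []      # (region, block) pairs; each block is written exactly once
trims = []    # whole regions discarded as single contiguous trims

def write_blocks(n):
    """Append n block writes; when a region fills, trim the region before it."""
    for _ in range(n):
        region, block = divmod(len(log), REGION_BLOCKS)
        log.append((region, block))
        # this region just filled and a previous one exists -> trim it whole
        if block == REGION_BLOCKS - 1 and region > 0:
            trims.append(region - 1)

write_blocks(10)   # fills regions 0 and 1, partially fills region 2
```

Every block lands on the device exactly once per pass, and discards arrive as large contiguous ranges rather than scattered single-block trims.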

For something like ext4, where blocks are constantly overwritten, I'd
think that poor alignment is going to really hurt your performance.
Btrfs might be somewhere in between - it doesn't overwrite data in
place, but it does write all over the disk, so it would constantly be
hitting erase block borders if not aligned. That is just a
hand-waving argument - I have no idea how they work in practice.
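Checking the alignment itself is simple, by the way: a partition is erase-block aligned when its starting byte offset divides evenly by the erase block size. A quick sketch (the 512 KiB erase block size is an assumption, as above):

```python
# Check whether a partition start is aligned to an assumed erase block
# size. Partition starts are reported in 512-byte sectors (e.g. by
# fdisk -l or /sys/block/<disk>/<part>/start); 2048 is a common default.
SECTOR = 512
ERASE_BLOCK = 512 * 1024   # assumed; not documented for this drive

def is_aligned(start_sector, erase_block=ERASE_BLOCK):
    """True if the partition's byte offset is a multiple of the erase block."""
    return (start_sector * SECTOR) % erase_block == 0

print(is_aligned(2048))   # 2048 * 512 B = 1 MiB -> multiple of 512 KiB
print(is_aligned(63))     # old DOS-style start sector -> misaligned
```

Modern partitioners default to 1 MiB starts, which is a multiple of any plausible erase block size, so in practice new layouts are usually fine.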

--
Rich