Hi,

I plan to use an NVMe SSD in my desktop and I'm quite puzzled by
the filesystem choice :/ So community input on this matter would be
very valuable.

Typical anticipated workload: root filesystem, a lot of small and
medium-sized files (e.g. source code), tons of compiling, ccache,
testing and similar dev activity. Large files and media will be
stored on another dedicated host.

What I want (in random order, so the last is not the least):

1. Reasonable reliability. I'll have regular backups on external
media, but I don't want my root corrupted often.

2. Minimized media wear-out. The filesystem should be friendly to
NVMe: for its 512 GB size this drive has only a 400 TBW warranty :/
(rough math after this list).

3. Performance. It is natural to strive for full speed and minimal
latency from such yummy storage.
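
To put 400 TBW in perspective (assuming, say, a 5-year warranty
window, which I haven't verified for this model):

    400 TB / 512 GB  ~= 780 full-drive writes
    400 TB / 1825 d  ~= 220 GB of writes per day, i.e. ~0.4 DWPD

That looks hard to exhaust with a dev workload, but filesystem
write amplification can eat into it, hence the wear-out concern.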

For now I consider the following solutions:
a) EXT4
b) XFS
c) F2FS
d) BTRFS

a) EXT4 is a good, extremely robust solution. Reliability is beyond
question: on my old box with bad memory banks it kept my data
safe for years, and almost all losses were recoverable. It also has
some SSD-oriented features like discard support, plus stripe and
stride settings which can be aligned to the erase block size to
optimize erase operations and reduce wear-out (see the sketch
right below).
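
A minimal sketch of what I mean, assuming a hypothetical 512 KiB
erase block and 4 KiB filesystem blocks (stride and stripe_width
are given in filesystem blocks, so 512 KiB / 4 KiB = 128; the
device name is just an example):

    # align allocation to an assumed 512 KiB erase block
    mkfs.ext4 -b 4096 -E stride=128,stripe_width=128 /dev/nvme0n1p2

    # periodic TRIM, reportedly gentler than mount -o discard
    fstrim -v /

I'd probably run fstrim from a weekly cron job (or systemd's
fstrim.timer) rather than mounting with continuous discard.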

b) In some tests XFS is better than EXT4 ([1] slides 16-18; [2]).
Though I had data loss on XFS after unclean shutdowns in the
past. That was about 5 years ago, so XFS robustness should have
improved since, of course, but I still remember the pain :/

c) F2FS looks very interesting, it has a really good flash-oriented
design [3]. It also seems to beat EXT4 on PCIe SSD ([3] chapter
3.2.2, pages 9-10) and everything else on a compile test ([2] page
5), which should be close to the type of workload I'm interested in
(though all tests in [2] have an extra RAID layer). The only thing
that bothers me is some data loss reports for F2FS found on the
net, though all I found are dated 2012-2014, and F2FS has an fsck
tool now, so it should be more reliable these days.
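
If I go this way, the setup itself looks trivial (device name is
again just an example; background_gc=on is the default anyway):

    mkfs.f2fs -l nvme-root /dev/nvme0n1p2
    mount -t f2fs -o background_gc=on /dev/nvme0n1p2 /mnt
    fsck.f2fs /dev/nvme0n1p2    # the fsck tool mentioned above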

d) I'm not sure about BTRFS, since it is very sophisticated and I'm
not interested in its advanced features such as snapshots,
checksums, subvolumes and so on. In some tests [2] it tends to
achieve better performance than EXT4, but due to its sophisticated
nature it is bound to add more code paths and latency than the
other solutions.

So, for now I tend to use F2FS for / and to test other filesystems
under some load (e.g. compile chromium or libreoffice there). It
will be hard to do good tests, though, because the drive has a
built-in 512 MB DDR3 cache which will affect results, and I have no
idea how to flush it other than rebooting the host (the best I've
come up with so far is sketched below).
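
My current thinking on working around the caches (device path and
test file are examples; note this only takes the host page cache
and the drive's volatile write cache out of the picture, it does
not evict the drive's DRAM read cache):

    # drop the host page cache between runs
    sync; echo 3 > /proc/sys/vm/drop_caches

    # bypass the page cache entirely with direct I/O
    fio --name=rr --filename=/mnt/test/fio.dat --size=4g \
        --rw=randread --bs=4k --direct=1 --runtime=60 --time_based

    # ask the drive to flush its volatile write cache (nvme-cli)
    nvme flush /dev/nvme0 -n 1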

But all these considerations are based on theory and tests found on
the net. I have little practical experience with SSDs other than USB
sticks and SD cards, so any feedback and shared practical experience
is appreciated.

P.S. Is aligning to the erase block size really important for NVMe?
I can't find the erase block size for this drive (Samsung MZ-VKV512)
either in the official docs or on the net... The closest I can get
is asking what the kernel and the drive report (sketch below).
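
On this older drive these probably come back zero/unset; newer
NVMe 1.4+ drives expose a preferred write granularity (npwg/nows)
in Identify Namespace:

    # what the kernel thinks the optimal I/O size is
    cat /sys/block/nvme0n1/queue/optimal_io_size
    blockdev --getiomin /dev/nvme0n1

    # namespace fields; npwg/nows on NVMe 1.4+ drives
    nvme id-ns /dev/nvme0n1

Failing that, aligning partitions to 1 MiB seems to be the common
safe assumption.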

[1] https://videos.cdn.redhat.com/summit2015/presentations/17856_getting-the-most-out-of-your-nvme-ssd.pdf
[2] https://www.phoronix.com/scan.php?page=article&item=linux_raid_fs4
[3] https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf

Best regards,
Andrew Savchenko