On Sat, Sep 16, 2017 at 8:06 AM, Kai Krakow <hurikhan77@×××××.com> wrote:
>
> But I guess that btrfs doesn't use 10G sized extents? And I also guess,
> this is where autodefrag jumps in.
>

It definitely doesn't use 10G extents, considering the chunks are only
1GB. (For those who aren't aware, btrfs divides devices into chunks,
which basically act like individual sub-devices to which operations
like mirroring/RAID/etc. are applied. This is why you can change RAID
modes on the fly - the operation takes effect on newly allocated
chunks. It also allows clever things like a "RAID1" on 3x1TB disks
having 1.5TB of usable space, because the chunks essentially balance
themselves across all three disks in pairs. And it is what causes the
infamous issues when btrfs runs low on space - once the last chunk is
allocated, it can become difficult to rebalance/consolidate the
remaining space.)

I couldn't actually find any info on the default extent size. I did
find a 128MB example in the docs, so presumably that isn't an unusual
size. So, the 1MB example would probably still work. Obviously, if an
entire extent becomes obsolete, its reference count drops to zero and
it becomes free.
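
The refcounting behavior is easy to see in a toy model (again my own
sketch, not btrfs internals): a small overwrite adds a new extent, but
the partially obsolete big extent stays fully allocated until its last
reference goes away.

```python
# Toy model (not btrfs internals) of extent reference counting: an
# extent's space is freed only when its reference count drops to zero,
# so a partially overwritten extent keeps all of its space allocated.

class ExtentStore:
    def __init__(self):
        self.refs = {}    # extent_id -> reference count
        self.sizes = {}   # extent_id -> size in bytes
        self.next_id = 0

    def write_extent(self, size):
        eid = self.next_id
        self.next_id += 1
        self.sizes[eid] = size
        self.refs[eid] = 1
        return eid

    def add_ref(self, eid):
        self.refs[eid] += 1

    def drop_ref(self, eid):
        self.refs[eid] -= 1
        if self.refs[eid] == 0:
            del self.refs[eid], self.sizes[eid]  # extent freed

    def allocated(self):
        return sum(self.sizes.values())

store = ExtentStore()
MB = 1024 * 1024
big = store.write_extent(128 * MB)   # one 128MB extent
new = store.write_extent(1 * MB)     # overwrite 1MB -> new small extent
# The overwritten 1MB inside the big extent is wasted until the *whole*
# big extent loses its last reference.
print(store.allocated() // MB)       # 129
store.drop_ref(big)                  # e.g. the file is fully rewritten
print(store.allocated() // MB)       # 1
```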

Defrag was definitely intended to deal with this. I haven't looked at
the state of it in ages; I stopped using it due to a bug and some
limitations. The main limitation was that defrag, at least at the
time, was over-zealous. Not only would it free up the 1MB of wasted
space, as in this example, but if that 1GB file had a reflink clone it
would go ahead and split it into two duplicate 1GB extents. I believe
that dedup would do the reverse of this. Getting both to work together
"the right way" didn't seem possible the last time I looked into it,
but if that has changed I'm interested.
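
That reflink-splitting behavior can be sketched like this (a toy
illustration of the behavior described above, not real defrag code):
two files share one extent, and "defragmenting" one of them rewrites
its data into a fresh private extent, doubling the allocated space.

```python
# Toy illustration (not real defrag code): two files share one extent;
# defragmenting one file copies its data into new private extents,
# breaking the sharing and doubling the space used.

store = {}       # extent_id -> size_gb
files = {}       # name -> list of extent_ids
next_id = [0]

def new_extent(size_gb):
    eid = next_id[0]
    next_id[0] += 1
    store[eid] = size_gb
    return eid

def reflink(src, dst):
    files[dst] = list(files[src])  # share the same extents, no copy

def defragment(name):
    # Rewrite the file into fresh contiguous extents, losing sharing.
    files[name] = [new_extent(store[e]) for e in files[name]]

def allocated_gb():
    used = {e for refs in files.values() for e in refs}
    return sum(store[e] for e in used)

files["a"] = [new_extent(1)]
reflink("a", "b")
print(allocated_gb())   # 1  (one shared 1GB extent)
defragment("b")
print(allocated_gb())   # 2  (two duplicate 1GB extents)
```

Dedup, as noted, does the reverse: it would merge the two duplicate
extents back into one shared extent.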

Granted, I've been moving away from btrfs lately, because it just
hasn't matured the way I originally thought it would. I really love
features like reflinks, but it has been years since it was "almost
ready" and it still tends to eat data. For the moment I'm relying
more on ZFS. I'd love to switch back if they ever pull things
together. The other filesystem I'm eyeing with interest is CephFS,
but it is still slightly immature (on-disk checksums were only just
added), and it carries a fair bit of overhead until you get into
fairly large arrays. Cheap ARM-based OSD options seem to be fairly
RAM-starved at the moment as well, given the Ceph recommendation of
1GB of RAM per TB of storage. arm64 still seems to be slow to catch
on, let alone cheap boards with 4-16GB of RAM.

--
Rich