Gentoo Archives: gentoo-user

From:	Dan Douglas <ormaaj@×××××.com>
To:	gentoo-user@l.g.o
Subject:	Re: [gentoo-user] Re: [offtopic] Copy-On-Write ?
Date:	Sun, 17 Sep 2017 13:21:21
Message-Id:	`cc08939b-e9b4-2f86-5191-21dcacb880ea@gmail.com`
In Reply to:	[gentoo-user] Re: [offtopic] Copy-On-Write ? by Kai Krakow

On 09/17/2017 04:17 AM, Kai Krakow wrote:
> On Sun, 17 Sep 2017 01:20:45 -0500
> Dan Douglas <ormaaj@×××××.com> wrote:
>
>> On 09/16/2017 07:06 AM, Kai Krakow wrote:
>>> On Fri, 15 Sep 2017 14:28:49 -0400
>>> Rich Freeman <rich0@g.o> wrote:
>>>
>>>> On Fri, Sep 8, 2017 at 3:16 PM, Kai Krakow <hurikhan77@×××××.com>
>>>> wrote:
>> [...]
>>>>
>>>> True, but keep in mind that this applies in general in btrfs to any
>>>> kind of modification to a file. If you modify 1MB in the middle
>>>> of a 10GB file on ext4, the file still takes up 10GB of space. If
>>>> you do the same thing in btrfs you'll probably end up with the
>>>> file taking up 10.001GB. Since btrfs doesn't overwrite files
>>>> in-place it will typically allocate a new extent for the
>>>> additional 1MB, and the original content at that position within
>>>> the file is still on disk in the original extent. It works a bit
>>>> like a log-based filesystem in this regard (which is also
>>>> effectively copy on write).
>>>
>>> Good point, this makes sense. I never thought about that.
>>>
>>> But I guess that btrfs doesn't use 10G sized extents? And I also
>>> guess, this is where autodefrag jumps in.
>>
>> According to btrfs-filesystem(8), defragmentation breaks reflinks, in
>> all but a few old kernel versions where I guess they tried to fix the
>> problem and apparently failed.
>
> It was splitting and splicing all the reflinks which is actually a tree
> walk with more and more extents coming into the equation, and ended up
> doing a lot of small IO and needing a lot of memory. I think you really
> cannot fix this when working with extents.

I figured by "break up" they meant it eliminates the reflink by making
a full copy... so the increased space they're talking about isn't really
double that of the original data in other words.
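
To put rough numbers on that reading, here is a toy accounting model in
Python. It is only a sketch: ToyCowFS and its methods are made-up names,
it counts whole extents in decimal MB, and it assumes a defrag rewrites
the entire file into one fresh extent. Real btrfs may rewrite much less
than that, so take the last figure as the worst case of the "full copy"
interpretation, not as what the kernel actually does.

```python
MB = 1
GB = 1000 * MB  # decimal units, to match the 10GB -> 10.001GB example


class ToyCowFS:
    """Hypothetical model: a file is a logical size plus a set of extents."""

    def __init__(self):
        self.extents = {}  # extent id -> allocated size in MB
        self.files = {}    # name -> (logical size in MB, set of extent ids)
        self._next_id = 0

    def _alloc(self, size):
        eid = self._next_id
        self._next_id += 1
        self.extents[eid] = size
        return eid

    def create(self, name, size):
        self.files[name] = (size, {self._alloc(size)})

    def reflink(self, src, dst):
        # A reflink shares the existing extents; no new space is allocated.
        size, refs = self.files[src]
        self.files[dst] = (size, set(refs))

    def overwrite(self, name, size):
        # CoW: the new data lands in a new extent, and the old extent stays
        # allocated because the file still references the rest of it.
        logical, refs = self.files[name]
        self.files[name] = (logical, refs | {self._alloc(size)})

    def defrag(self, name):
        # ASSUMPTION: defrag rewrites the whole file into one new extent,
        # which drops every shared reference this file held.
        logical, _ = self.files[name]
        self.files[name] = (logical, {self._alloc(logical)})

    def used(self):
        # Space in use = extents still referenced by at least one file.
        live = set().union(*(refs for _, refs in self.files.values()))
        return sum(self.extents[e] for e in live)


def demo():
    fs = ToyCowFS()
    usage = []
    fs.create("base.raw", 10 * GB)
    usage.append(fs.used())           # 10000 MB
    fs.overwrite("base.raw", 1 * MB)
    usage.append(fs.used())           # 10001 MB, i.e. 10.001GB
    fs.reflink("base.raw", "snap.raw")
    usage.append(fs.used())           # still 10001 MB, everything shared
    fs.defrag("snap.raw")
    usage.append(fs.used())           # 20001 MB, nothing shared any more
    return usage
```

Calling demo() walks through Rich's 1MB-in-10GB case and then a reflink
followed by a defrag of the copy.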

>
>> This really makes much of what btrfs
>> does altogether pointless if you ever defragment manually or have
>> autodefrag enabled. Deduplication is broken for the same reason.
>
> It's much easier to fix this for deduplication: Just write your common
> denominator of an extent to a tmp file, then walk all the reflinks and
> share them with parts of this extent.
>
> If you carefully select what to defragment, there should be no problem.
> A defrag tool could simply skip all the shared extents. A few fragments
> do not hurt performance at all, but what's important is spatial
> locality. A lot of small fragments may hurt performance a lot, so one
> could give the defragger a hint when to ignore the rule and still
> defragment the extent. Also, when your deduplication window is 1M you
> could probably safely defrag all extents smaller than 1M.
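
For what it's worth, the selection rule described there could be
sketched like this in Python. Extent, defrag_candidates and the
dedupe_window parameter are names invented for the example, and nothing
here talks to btrfs: a real tool would have to get the extent list and
the shared flag from something like FIEMAP. The "hint" for ignoring the
rule when there are many small fragments is not modelled.

```python
from dataclasses import dataclass

MIB = 1 << 20


@dataclass(frozen=True)
class Extent:
    offset: int    # logical offset within the file, in bytes
    length: int    # extent length in bytes
    shared: bool   # True if another file also references this extent


def defrag_candidates(extents, dedupe_window=MIB):
    """Pick the extents a defragmenter may rewrite.

    Rule from the quoted text: skip shared extents, except those smaller
    than the deduplication window. Unshared extents are always eligible.
    """
    return [e for e in extents if not e.shared or e.length < dedupe_window]


if __name__ == "__main__":
    layout = [
        Extent(0, 64 * MIB, shared=True),            # big and shared: skip
        Extent(64 * MIB, 4096, shared=True),         # tiny, below the window
        Extent(64 * MIB + 4096, 8 * MIB, shared=False),
        Extent(72 * MIB + 4096, MIB, shared=True),   # exactly 1M: skip
    ]
    for ext in defrag_candidates(layout):
        print(ext)
```

Run as a script, it prints the second and third extents of the sample
layout, the only two that pass the rule.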

Yeah, this sort of hurts with the way I deal with KVM image snapshots. I
have raw base images as backing files with lots of shared and null
data, so I run `fallocate --dig-holes' followed by `duperemove
--dedupe-options=same' on the cow-enabled base images and hope that
btrfs defrag can clean up the resulting fragmented mess, but it's a slow
process and doesn't seem to do a good job.
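
Spelled out, the sequence is roughly the following, written as a small
Python dry run that only builds and prints the command lines. The image
path and the cleanup_commands name are placeholders. The `-d' flag
(which tells duperemove to actually dedupe rather than just report) and
the exact `btrfs filesystem defragment' invocation are my assumptions
about how the steps above would be typed, not something stated above.

```python
import shlex


def cleanup_commands(image, defrag=True):
    """Return the argv lists for the hole-punch, dedupe and defrag steps."""
    cmds = [
        # Turn runs of zeroes in the image into holes.
        ["fallocate", "--dig-holes", image],
        # ASSUMPTION: -d is added so duplicates are actually submitted
        # for dedupe; "same" also dedupes identical blocks within one file.
        ["duperemove", "-d", "--dedupe-options=same", image],
    ]
    if defrag:
        # ASSUMPTION: plain per-file defragment, no extra options.
        cmds.append(["btrfs", "filesystem", "defragment", image])
    return cmds


if __name__ == "__main__":
    # Hypothetical path; nothing is executed, the commands are only printed.
    for argv in cleanup_commands("/path/to/base.raw"):
        print(shlex.join(argv))
```

To run the steps for real, each list could be handed to
subprocess.run(argv, check=True) in order.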
