Gentoo Archives: gentoo-dev

From: Richard Yao <ryao@g.o>
To: Yuxuan Shui <yshuiv7@×××××.com>
Cc: "gentoo-dev@l.g.o" <gentoo-dev@l.g.o>
Subject: [gentoo-dev] Re: GSoC proposal: cp --reflink support for zfs.
Date: Wed, 12 Mar 2014 13:19:05
Message-Id: 4814E9FB-B0EB-4F7F-B46A-EADA17AAE046@gentoo.org
In Reply to: [gentoo-dev] GSoC proposal: cp --reflink support for zfs. by Yuxuan Shui
1 A key feature of reflinks is that they operate on any data in a mountpoint, but what you described only applies to data with a deduplication table entry. In such cases, I do not see what it accomplishes over simply using data deduplication. Specifically, there is no efficiency advantage. It is not clear to me that trying to cut corners to obtain one is possible without incurring consistency issues from the deduplication table falling out of sync. In the case that there is no deduplication table entry, you would need to rewrite the file. Reflinks are intended to be done as quickly as hard links, so this would seem to negate the benefit.
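
To make the objection concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: `ToyPool`, `write_file`, and `ddt_clone` are hypothetical names invented for illustration, and the dictionary keyed by checksum is a stand-in for a deduplication table, not ZFS's actual on-disk structures or code paths. It shows that a DDT-based clone is cheap only when every block of the source already has a table entry; otherwise each block has to go through the write path again.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Toy model only; not ZFS's real data structures."""
    ddt: dict = field(default_factory=dict)    # checksum -> reference count
    files: dict = field(default_factory=dict)  # name -> list of (checksum, has_ddt_entry)
    blocks_written: int = 0                    # blocks that went through the write path

    def write_file(self, name, blocks, dedup):
        """Write blocks; only writes made with dedup enabled get a DDT entry."""
        refs = []
        for data in blocks:
            key = hashlib.sha256(data).hexdigest()
            if dedup and key in self.ddt:
                self.ddt[key] += 1             # duplicate: bump the refcount, no new write
            else:
                self.blocks_written += 1
                if dedup:
                    self.ddt[key] = 1
            refs.append((key, dedup))
        self.files[name] = refs

    def ddt_clone(self, src, dst):
        """Hypothetical DDT-based clone; returns how many blocks needed rewriting."""
        rewritten = 0
        refs = []
        for key, has_entry in self.files[src]:
            if has_entry:
                self.ddt[key] += 1             # cheap: metadata-only refcount bump
            else:
                # No DDT entry for this block: the data must be written again.
                self.blocks_written += 1
                self.ddt[key] = self.ddt.get(key, 0) + 1
                rewritten += 1
            refs.append((key, True))
        self.files[dst] = refs
        return rewritten
```

In this model, cloning a file that was written with dedup enabled rewrites nothing, which is no more than deduplication already provides. Cloning a file that was written with dedup disabled rewrites every block, which is what a reflink is supposed to avoid.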
2
3 Matthew Ahrens and I have discussed reflinks in the past and we both agree that they would be non-trivial to implement. That does not mean it cannot be done, but I do not think this particular approach would succeed. However, I encourage you to keep thinking about such things. If you think of a way of doing this that seems workable, it would likely be something that could be made into a GSoC project.
4
5 With that said, if you want to do a ZoL-related project for Gentoo, I have some other things that I could suggest that I believe are workable. Such ideas are things that I was asked to prepare on extremely short notice and I have not yet had time to publish them. Let me know if you would be interested.
6
7 On Mar 12, 2014, at 3:15 AM, Yuxuan Shui <yshuiv7@×××××.com> wrote:
8
9 > Hi,
10 >
11 > I would like to implement cp --reflink support for ZFSOnLinux as my GSoC project.
12 >
13 > cp --reflink is used to create a COW copy of a file, so the file will not take any disk space if it's not modified. This feature is very useful for cases like storing a lot of almost identical virtual machine images. Also this is a frequently requested feature for ZoL. [1][2][3]
14 >
15 > Currently only btrfs supports this feature, so my goal is to bring it to ZoL as well.
16 >
17 > I think the only way to do it (without changing too many parts of ZoL) is to use the deduplication feature of zfs. A COW copy could be done by creating a new entry in the ddt for the old file, and creating a new file which points to the ddt entry.
18 >
19 > Please let me know if this proposal makes sense, and if that's the right way to do it.
20 >
21 > Thanks.
22 >
23 > [1]: https://groups.google.com/a/zfsonlinux.org/forum/#!topic/zfs-discuss/mvGB7QEpt3w
24 > [2]: https://github.com/zfsonlinux/zfs/issues/405
25 > [3]: https://github.com/zfsonlinux/zfs/issues/1063
26 > --
27 >
28 > Regards
29 > Yuxuan Shui