Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] LVM and moving things around
Date: Sun, 03 Apr 2022 15:22:10
Message-Id: CAGfcS_no9O6bv=yMDnDY6dayuo_8yuFBXdYVRExjBhGkSRyiLw@mail.gmail.com
In Reply to: Re: [gentoo-user] LVM and moving things around by Wols Lists
On Sun, Apr 3, 2022 at 4:59 AM Wols Lists <antlists@××××××××××××.uk> wrote:
>
> On 03/04/2022 02:15, Bill Kenworthy wrote:
> > Rsync has a --bwlimit option which helps here. Note that rsync copies
> > the whole file on what it considers local storage (which can be mounted
> > network shares) ... this can cause a real slowdown.
>
> It won't help on the initial copy, but look at the - I think it is -
> --inplace option.
>
> It won't help with the "read and compare", but it only writes what has
> changed, so if a big file has changed slightly, it'll stop it re-copying
> the whole file.

You might also try ionice - though I find it is hit and miss for
effectiveness once you start adding layers like lvm/mdadm/etc, as I
don't know that the kernel actually sees all the downstream queues
when it is throttling processes. I haven't used it on LVM in a while
though.
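
As a rough sketch of how the throttling options mentioned so far could be
combined, here is a small Python helper that only builds the command line
and does not run it. The function name, the paths and the default rate are
made up for illustration. The flags themselves (ionice -c 3 for the idle
I/O class, rsync -a, --bwlimit, --inplace) are real, but whether the idle
class has any effect depends on the I/O scheduler in use and, as said
above, on how many layers sit between the process and the disk.

```python
import shlex


def throttled_rsync_cmd(src, dst, bwlimit_kib=20000, inplace=True, idle_io=True):
    """Build (but do not run) a throttled rsync command line.

    All names and defaults here are illustrative assumptions.
    """
    cmd = []
    if idle_io:
        # ionice -c 3 asks for the "idle" I/O scheduling class.
        cmd += ["ionice", "-c", "3"]
    # --bwlimit caps the transfer rate; rsync reads a bare number as KiB/s.
    cmd += ["rsync", "-a", f"--bwlimit={bwlimit_kib}"]
    if inplace:
        # --inplace updates the destination file directly instead of
        # writing a temporary copy and renaming it.
        cmd.append("--inplace")
    cmd += [src, dst]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(throttled_rsync_cmd("/mnt/old/", "/mnt/new/")))
```

Printing the command rather than executing it is deliberate: it is easier
to sanity-check the paths and trailing slashes before anything is copied.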

Replication performance (especially if you want to do a second pass
with rsync) is the sort of thing that using pvmove/etc helps with -
since it will ensure nothing gets moved. Snapshot-supporting
filesystems like zfs/btrfs are also better if you want to sync things
up because they can rapidly identify all the changes between two
snapshots without having to actually read anything but metadata,
assuming you manage things correctly and maintain a common baseline
between them.
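
To make the snapshot approach a bit more concrete, here is a minimal
Python sketch of an incremental send/receive between two snapshots that
share a common baseline. The helper names and the snapshot/target names in
the usage note are hypothetical. The underlying commands (zfs send -i,
zfs receive, btrfs send -p, btrfs receive) are real, but check the man
pages for your versions before relying on this.

```python
import subprocess


def incremental_send_cmds(kind, base_snap, new_snap, target):
    """Return (send_cmd, recv_cmd) for an incremental snapshot transfer.

    base_snap must already exist on both sides (the common baseline).
    """
    if kind == "zfs":
        # zfs send -i BASE NEW streams only the changes between the two.
        return (["zfs", "send", "-i", base_snap, new_snap],
                ["zfs", "receive", target])
    if kind == "btrfs":
        # btrfs send -p PARENT SUBVOL does the same relative to a parent.
        return (["btrfs", "send", "-p", base_snap, new_snap],
                ["btrfs", "receive", target])
    raise ValueError(f"unsupported filesystem kind: {kind}")


def run_pipeline(send_cmd, recv_cmd):
    """Run send_cmd | recv_cmd and raise if either side fails."""
    sender = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
    receiver = subprocess.Popen(recv_cmd, stdin=sender.stdout)
    # Close our copy of the pipe so the sender sees a broken pipe
    # if the receiver exits early.
    sender.stdout.close()
    recv_rc = receiver.wait()
    send_rc = sender.wait()
    if send_rc != 0 or recv_rc != 0:
        raise RuntimeError(f"send exited {send_rc}, receive exited {recv_rc}")
```

For example, incremental_send_cmds("zfs", "tank/data@mon", "tank/data@tue",
"backup/data") gives the two halves of the pipeline, which run_pipeline
would then connect. To go between hosts, the receive half could be
prefixed with ssh and a hostname.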

Of course all of those options require that they be set up in advance.
If you just have two generic filesystems and want to sync them, then
rsync is your main option.

Oh, one thing I would suggest is that if they're on different hosts
you actually run rsyncd or do the sync over ssh, rather than syncing
to a mounted network share, so that rsync recognizes the situation and
runs a second rsync process on the remote host, so that all the
hashing/etc is run local to the drives. This greatly reduces your
network traffic, which is likely to be the bottleneck. All the same,
if you want to actually use hashes to find differences and not just
rely on size/mtime, there is no getting around having to read all the
data off the disk.
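
The size/mtime versus hash trade-off can be illustrated with a short
Python sketch. This is not rsync's own code: the function names are made
up, and SHA-256 is just a convenient stand-in for whatever checksum
rsync --checksum actually uses. The point is only that the quick check
touches metadata, while the hash comparison has to read every byte of
both files.

```python
import hashlib
import os


def quick_check_same(path_a, path_b):
    """Metadata-only comparison: same size and same mtime (whole seconds)."""
    st_a, st_b = os.stat(path_a), os.stat(path_b)
    return (st_a.st_size == st_b.st_size
            and int(st_a.st_mtime) == int(st_b.st_mtime))


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks; this necessarily reads the whole file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.digest()


def hash_same(path_a, path_b):
    """Content comparison: both files are read in full."""
    return file_digest(path_a) == file_digest(path_b)
```

Two files with identical size and mtime but different contents pass
quick_check_same and fail hash_same, which is the case a checksum pass
catches and the reason it costs a full read on both sides.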

--
Rich