Gentoo Archives: gentoo-user

From: John Covici <covici@××××××××××.com>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Backup program that compresses data but only changes new files.
Date: Mon, 15 Aug 2022 07:02:35
Message-Id: m35yiuhtvo.wl-covici@ccs.covici.com
In Reply to: Re: [gentoo-user] Backup program that compresses data but only changes new files. by Rich Freeman
On Sun, 14 Aug 2022 19:03:25 -0400,
Rich Freeman wrote:
>
> On Sun, Aug 14, 2022 at 6:44 PM Dale <rdalek1967@×××××.com> wrote:
> >
> > Right now, I'm using rsync which doesn't compress files but does just
> > update things that have changed. I'd like to find some way, software
> > but maybe there is already a tool I'm unaware of, to compress data and
> > work a lot like rsync otherwise.
>
> So, how important is it that it work exactly like rsync?
>
> I use duplicity, in part because I've been using it forever. Restic
> seems to be a similar program most are using these days which I
> haven't looked at super-closely but I'd look at that first if starting
> out.
>
> Duplicity uses librsync, so it backs up exactly the same data as rsync
> would, except instead of replicating entire files, it creates streams
> of data more like something like tar. So if you back up a million
> small files you might get out 1-3 big files. It can compress and
> encrypt the data as you wish. The downside is that you don't end up
> with something that looks like your original files - you have to run
> the restore process to extract them all back out. It is extremely
> space-efficient though - if 1 byte changes in the middle of a 10GB
> file you'll end up just backing up maybe a kilobyte or so (whatever
> the block size is), which is just like rsync.
>
> Typically you rely on metadata to find files that change which is
> fast, but I'm guessing you can tell these programs to do a deep scan
> which of course requires reading the entire contents, and that will
> discover anything that was modified without changing ctime/mtime.
>
> The output files can be split to any size, and the index info (the
> metadata) is separate from the raw data. If you're storing to
> offline/remote/cloud/whatever storage typically you keep the metadata
> cached locally to speed retrieval and to figure out what files have
> changed for incrementals. However, if the local cache isn't there
> then it will fetch just the indexes from wherever it is stored
> (they're small).
>
> It has support for many cloud services - I store mine to AWS S3.
>
> There are also some options that are a little closer to rsync like
> rsnapshot and burp. Those don't store compressed (unless there is an
> option for that or something), but they do let you rotate through
> multiple backups and they'll set up hard links/etc so that they are
> de-duplicated. Of course hard links are at the file level so if 1
> byte inside a file changes you'll end up with two full copies. It
> will still only transfer a single block so the bandwidth requirements
> are similar to rsync.

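To make the metadata-versus-deep-scan point in the quoted text concrete,
here is a minimal Python sketch. It is not how duplicity or restic
implement change detection; the function names are made up for
illustration, and the quick check looks only at size and mtime (real
tools may also compare ctime or inode, which os.utime cannot reset).

```python
import hashlib
import os
import tempfile


def quick_signature(path):
    """Cheap check: size and mtime from stat(); file contents are not read."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def deep_signature(path):
    """Expensive check: SHA-256 over the full file contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def demo():
    """Return (quick_changed, deep_changed) after a stealthy in-place edit."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open(path, "wb") as fh:
            fh.write(b"A" * 4096)
        # Pin atime/mtime to a fixed value so the demo is deterministic.
        os.utime(path, ns=(1_000_000_000, 1_000_000_000))
        quick_before = quick_signature(path)
        deep_before = deep_signature(path)

        # Overwrite one byte in place (size unchanged), then restore mtime.
        with open(path, "r+b") as fh:
            fh.seek(2048)
            fh.write(b"B")
        os.utime(path, ns=(1_000_000_000, 1_000_000_000))

        quick_changed = quick_signature(path) != quick_before
        deep_changed = deep_signature(path) != deep_before
        return quick_changed, deep_changed
```

In this setup the quick check misses the edit and only the full read
catches it, which is the trade-off described above: the metadata pass is
fast, the deep scan is thorough.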
I have been using restic for a while, and although it does not do
compression, it does a couple of nice things: if a file is in more
than one location, or if you rename a file, it's smart enough not to
back up any data at all, just the metadata. Also, you never have to
delete the whole backup and start over as you do with duplicity; you
can just delete backups older than a certain number of days and you
are good to go. It's written in Go, so building can be a pain, and I
don't like programs that download gobs of stuff from the internet to
build, but it seems to work quite well.
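
For anyone wondering why a rename or a second copy costs no new data,
and why old backups can be dropped without starting over, here is a toy
Python model of a content-addressed repository. It is only a sketch
under simplifying assumptions: ToyRepo, its methods and the 4-byte
fixed-size chunks are invented for the example, and restic's real
chunking and repository format are more involved.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, only to keep the demo readable


class ToyRepo:
    def __init__(self):
        self.chunks = {}     # chunk hash -> chunk bytes, each stored once
        self.snapshots = {}  # snapshot day -> {path: [chunk hashes]}

    def backup(self, day, files):
        """Store a snapshot; return how many bytes of new data were written."""
        new_bytes = 0
        tree = {}
        for path, data in files.items():
            ids = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                cid = hashlib.sha256(chunk).hexdigest()
                if cid not in self.chunks:  # already known data is skipped
                    self.chunks[cid] = chunk
                    new_bytes += len(chunk)
                ids.append(cid)
            tree[path] = ids
        self.snapshots[day] = tree
        return new_bytes

    def forget_older_than(self, today, keep_days):
        """Drop old snapshots, then prune chunks that nothing references."""
        for day in [d for d in self.snapshots if today - d > keep_days]:
            del self.snapshots[day]
        live = {cid for tree in self.snapshots.values()
                for ids in tree.values() for cid in ids}
        for cid in [c for c in self.chunks if c not in live]:
            del self.chunks[cid]

    def restore(self, day, path):
        return b"".join(self.chunks[c] for c in self.snapshots[day][path])


repo = ToyRepo()
first = repo.backup(1, {"a.txt": b"hello world!", "old.txt": b"gone soon..."})
# Day 2: a.txt was renamed and also copied; old.txt has been deleted.
second = repo.backup(2, {"b.txt": b"hello world!", "copy/b.txt": b"hello world!"})
# Keep only snapshots from the last 8 days; day 1 is dropped, day 2 stays.
repo.forget_older_than(today=10, keep_days=8)
```

The second backup writes no chunk data because every chunk hash is
already in the repository, so only the path-to-chunk mapping (the
metadata) is new. Dropping the day-1 snapshot removes only the chunks
that no remaining snapshot references, so the day-2 backup still
restores in full.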

--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?

John Covici wb2una
covici@××××××××××.com
