Gentoo Archives: gentoo-user

From: Rich Freeman <rich0@g.o>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Backup program that compresses data but only changes new files.
Date: Sun, 14 Aug 2022 23:03:43
Message-Id: CAGfcS_=ZNidD3GX-dH4zUVXQFxazON5JwSOSg_y1h2uMy-=r9g@mail.gmail.com
In Reply to: [gentoo-user] Backup program that compresses data but only changes new files. by Dale
On Sun, Aug 14, 2022 at 6:44 PM Dale <rdalek1967@×××××.com> wrote:
>
> Right now, I'm using rsync which doesn't compress files but does just
> update things that have changed. I'd like to find some way, software
> but maybe there is already a tool I'm unaware of, to compress data and
> work a lot like rsync otherwise.

So, how important is it that it work exactly like rsync?

I use duplicity, in part because I've been using it forever. Restic
seems to be a similar program that most people are using these days; I
haven't looked at it super-closely, but I'd look at it first if I were
starting out.

Duplicity uses librsync, so it backs up exactly the same data as rsync
would, except that instead of replicating entire files it creates
streams of data, a bit like tar does. So if you back up a million
small files you might end up with 1-3 big files. It can compress and
encrypt the data as you wish. The downside is that you don't end up
with something that looks like your original files - you have to run
the restore process to extract them all back out. It is extremely
space-efficient, though - if 1 byte changes in the middle of a 10GB
file, you'll end up backing up maybe a kilobyte or so (whatever the
block size is), just like rsync.
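
The block-level idea can be sketched in a few lines of Python: split a
file into fixed-size blocks, hash each one, and compare against the
previous run's signatures. This is only an illustration - the block
size and hashing scheme here are made up, and librsync's actual rolling
checksums are more sophisticated - but it shows why a 1-byte change in
a 10 MiB file touches only one block:

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # illustrative block size, not librsync's real default


def block_signatures(data: bytes) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(old_sigs: list[str], new_data: bytes) -> list[int]:
    """Return indices of blocks whose hash differs from the previous run."""
    new_sigs = block_signatures(new_data)
    return [
        i for i, sig in enumerate(new_sigs)
        if i >= len(old_sigs) or sig != old_sigs[i]
    ]


# A 10 MiB file of zeros, then one byte flipped in the middle:
original = bytes(10 * 1024 * 1024)
sigs = block_signatures(original)          # 160 blocks of 64 KiB
modified = bytearray(original)
modified[5 * 1024 * 1024] = 0xFF
print(changed_blocks(sigs, bytes(modified)))  # → [80], a single block
```

Only that one block's worth of data needs to go into the incremental,
which is where the "kilobyte or so" figure comes from.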

Typically you rely on metadata to find files that changed, which is
fast, but I'm guessing you can tell these programs to do a deep scan,
which of course requires reading the entire contents; that will
discover anything that was modified without changing ctime/mtime.
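
Roughly, the fast path is just a stat() comparison - this sketch
assumes the tools compare something like size and mtime (the helper
name is made up for illustration):

```python
import os


def looks_changed(path: str, recorded_size: int, recorded_mtime_ns: int) -> bool:
    """Fast metadata check: compare stat() against what the last backup recorded."""
    st = os.stat(path)
    return st.st_size != recorded_size or st.st_mtime_ns != recorded_mtime_ns

# A deep scan would instead read and hash the full contents, which is
# the only way to catch an edit whose mtime was deliberately reset
# (e.g. with os.utime) and whose size didn't change.
```

A same-size rewrite with the timestamp put back slips past the
metadata check, which is exactly the case a deep scan exists for.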

The output files can be split to any size, and the index info (the
metadata) is kept separate from the raw data. If you're storing to
offline/remote/cloud/whatever storage, you typically keep the metadata
cached locally to speed retrieval and to figure out which files have
changed for incrementals. However, if the local cache isn't there, it
will fetch just the indexes from wherever the backup is stored
(they're small).

It has support for many cloud services - I store mine on AWS S3.

There are also some options that are a little closer to rsync, like
rsnapshot and burp. Those don't store the data compressed (unless
there is an option for that or something), but they do let you rotate
through multiple backups, and they'll set up hard links/etc so that
unchanged files are de-duplicated. Of course, hard links work at the
file level, so if 1 byte inside a file changes you'll end up with two
full copies. It will still only transfer a single block, though, so
the bandwidth requirements are similar to rsync.
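
The hard-link behavior is easy to demonstrate directly - this sketch
does with Python's os.link what rsnapshot does between rotated
snapshot directories (the paths and filenames here are invented):

```python
import os
import tempfile

# Two rotated snapshot directories, rsnapshot-style.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "backup.0"))
os.makedirs(os.path.join(root, "backup.1"))

# backup.0 holds the file from the previous run.
old = os.path.join(root, "backup.0", "notes.txt")
with open(old, "w") as f:
    f.write("unchanged contents\n")

# An unchanged file is carried into the next snapshot as a hard link,
# so it costs a directory entry, not a second copy of the data.
new = os.path.join(root, "backup.1", "notes.txt")
os.link(old, new)

print(os.stat(old).st_ino == os.stat(new).st_ino)  # True - same inode
print(os.stat(old).st_nlink)                       # 2 - one file, two names

# If even 1 byte had changed, the tool would have to write a separate
# full copy into backup.1 - hard links can't share part of a file.
```

That all-or-nothing property is the trade-off versus the block-level
tools above: simpler on-disk layout (your files stay browseable as
plain files), but no sub-file de-duplication.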

--
Rich
