
From: "J. Roeleveld" <joost@××××××××.org>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] Backup program that compresses data but only changes new files.
Date: Mon, 15 Aug 2022 18:34:42
Message-Id: 2645925.mvXUDI8C0e@iris
In Reply to: Re: [gentoo-user] Backup program that compresses data but only changes new files. by Dale
On Monday, August 15, 2022 9:05:24 AM CEST Dale wrote:
> Rich Freeman wrote:
> > On Sun, Aug 14, 2022 at 6:44 PM Dale <rdalek1967@×××××.com> wrote:
> >> Right now, I'm using rsync, which doesn't compress files but does
> >> just update things that have changed. I'd like to find some
> >> software (maybe a tool I'm unaware of already exists) that
> >> compresses data but otherwise works a lot like rsync.
> >
> > So, how important is it that it work exactly like rsync?
> >
> > I use duplicity, in part because I've been using it forever.
> > Restic seems to be a similar program that most people are using
> > these days; I haven't looked at it super closely, but I'd look at
> > that first if starting out.
> >
> > Duplicity uses librsync, so it backs up exactly the same data as
> > rsync would, except that instead of replicating entire files, it
> > creates streams of data, more like tar. So if you back up a million
> > small files, you might get out 1-3 big files. It can compress and
> > encrypt the data as you wish. The downside is that you don't end up
> > with something that looks like your original files - you have to run
> > the restore process to extract them all back out. It is extremely
> > space-efficient though - if 1 byte changes in the middle of a 10GB
> > file, you'll end up backing up maybe just a kilobyte or so (whatever
> > the block size is), which is just like rsync.
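
(To make the block-delta part concrete: librsync ships an "rdiff"
command-line tool that shows the mechanism duplicity automates. A
minimal sketch, file names made up:

  rdiff signature old.img old.sig        # fingerprint the old copy
  rdiff delta old.sig new.img out.delta  # delta holds only changed blocks
  rdiff patch old.img out.delta new2.img # reconstruct the new copy

A one-byte change in a huge file yields a delta of roughly one block,
which is why the incrementals stay small.)
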
> >
> > Typically you rely on metadata to find files that have changed,
> > which is fast, but I'm guessing you can tell these programs to do a
> > deep scan, which of course requires reading the entire contents and
> > will discover anything that was modified without changing
> > ctime/mtime.
> >
> > The output files can be split to any size, and the index info (the
> > metadata) is separate from the raw data. If you're storing to
> > offline/remote/cloud/whatever storage, you typically keep the
> > metadata cached locally to speed up retrieval and to figure out
> > which files have changed for incrementals. However, if the local
> > cache isn't there, it will fetch just the indexes from wherever
> > they are stored (they're small).
> >
> > It has support for many cloud services - I store mine on AWS S3.
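
(For reference, a minimal duplicity run looks roughly like this; the
bucket name and paths are placeholders, and the exact S3 URL scheme
depends on the duplicity version and backend:

  # first run does a full backup, later runs are incremental
  duplicity /home/dale s3://my-backup-bucket/dale
  # pull everything back out
  duplicity restore s3://my-backup-bucket/dale /tmp/restored

For a local external drive you'd point it at a file:// URL instead,
e.g. file:///mnt/backup/dale.)
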
> >
> > There are also some options that are a little closer to rsync, like
> > rsnapshot and burp. Those don't store the data compressed (unless
> > there is an option for that), but they do let you rotate through
> > multiple backups, and they'll set up hard links/etc. so that the
> > backups are de-duplicated. Of course, hard links are at the file
> > level, so if 1 byte inside a file changes, you'll end up with two
> > full copies. It will still only transfer a single block, though, so
> > the bandwidth requirements are similar to rsync.
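
(The hard-link rotation those tools do can be approximated with plain
rsync, if you're curious what it looks like; the directory layout here
is made up:

  # daily.1 holds the previous run; unchanged files become hard links,
  # changed files are stored in full
  rsync -a --delete --link-dest=/mnt/backup/daily.1 \
      /home/dale/ /mnt/backup/daily.0/

rsnapshot essentially wraps this kind of rotation for you.)
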
>
> Duplicity sounds interesting, except that I already have the drive
> encrypted. Keep in mind, these are external drives that I hook up
> just long enough to complete the backups; then back in a fire safe
> they go. The reason I mentioned being like rsync is that I don't
> want to rebuild a backup from scratch each time, as that would be
> time consuming. I looked at Kbackup ages ago; it rebuilds from
> scratch each time, but it does have the option of compressing. That
> might work for small stuff, but not for many TBs of it. Back in the
> early '90s, I remember using backup software that was incremental.
> It would only update files that had changed, could span several
> floppy disks, and compressed the data as well. Something like that
> is likely rare nowadays, if it exists at all, since floppies are
> long dead. I either need to split my backup into two pieces or
> compress my data. That is why I asked if there is a way to back up
> the first part of the alphabet in one command, switch disks, and
> then do the second part of the alphabet to another disk.

Actually, there still is a piece of software that does this:
app-backup/dar
You can tell it to split the backups into slices of a specific size.
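
As a rough sketch (the slice size and paths here are just
illustrative; check the man page for the exact option syntax):

  # full backup of /home/dale, gzip-compressed, cut into 4 GiB slices,
  # pausing between slices so you can swap disks
  dar -c /mnt/backup/full -R /home/dale -z -s 4G -p

  # later: a differential backup against the full archive
  dar -c /mnt/backup/diff1 -A /mnt/backup/full -R /home/dale -z -s 4G

That also covers the "switch disks halfway" part without having to
split the tree by alphabet.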

--
Joost
