From: Rich Freeman <rich0@gentoo.org>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] backup horror story (happy ending)
Date: Sat, 9 Nov 2024 09:43:43 -0500
Message-ID: <CAGfcS_=GjA1nk3L5OKa14Xfy_p6vEt4f7M=T8_kE-VxEgUHG_Q@mail.gmail.com>
In-Reply-To: <713323a1-bd58-4607-898f-db2aedc9942c@yahoo.com>
On Sat, Nov 9, 2024 at 5:30 AM ralfconn <mentadent47@yahoo.com> wrote:
>
> But I have a backup, no problem... till I realize the cron job had
> already run so it had overwritten the old files with the new, corrupt
> versions.
>
I highly recommend having multiple backups to avoid this sort of problem.
If you aren't wedded to rsync, then restic seems to be the platform of
choice these days. Not sure offhand if it handles NTFS very well. I
use duplicati for backing up Windows hosts to an S3 backend and that
works great, but that is more of a Windows solution (VSS and so on).
I imagine that restic doesn't care much about the filesystem if it is
running on Linux and everything is mounted.
If you are wedded to the rsync approach where your backups are just a
big directory tree that you can easily access, then I suggest using
rsnapshot. It is basically a wrapper around rsync that maintains a
backup history, in a very clean way. Basically it makes a hard-link
copy of your entire backup set in a new directory tree named for the
backup level and its place in the rotation (daily.0, daily.1, and so
on), and then it runs rsync to sync that new tree. The result
is that you effectively get file-level deduplication, and otherwise
get what looks like a nice big full copy of the backup source in each
directory. It probably won't be as efficient as something like restic,
since that deduplicates at the chunk level, below the file level (it
can also deduplicate across multiple hosts if they share a
repository). Most of these modern tools still transfer only the data
that changed, whether through librsync or their own chunking, so the
actual data transfer is just as efficient.
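To make the hard-link idea concrete, here is a minimal sketch in Python. It is not rsnapshot's code: the function name `snapshot` and its arguments are hypothetical, and it links unchanged files against the previous snapshot while copying (closer to rsync's --link-dest) instead of doing a hard-link copy first and syncing afterwards. It also ignores symlinks, ownership, and directory metadata. The end result is the same shape, though: every snapshot directory looks like a full copy, but unchanged files share storage.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Build `target` as a full-looking copy of `source`.

    Files identical to the one in `previous` (an earlier snapshot
    directory, or None for the first run) are hard-linked instead of
    copied, so they take no extra space.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the same inode
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Because a changed file is written as a new copy and never modified in place, a corrupted source file only affects the newest snapshot; the older snapshots keep the good version.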
rsync by itself is nice for its simplicity, but it just isn't a very
elegant backup solution. You can tell it to preserve old file
versions, but those end up stored next to the original files with
different filenames and that can be a real mess to restore if you
don't want to end up with all those old versions. With
restic/rsnapshot you can go back in time but still get a clean
restore, and you can still extract individual files from various
snapshots.
If you really are running rsync at any kind of scale, also consider
rclone. It is often faster since it can transfer multiple files in
parallel, which helps if you're more bound by latency than disk
IOPS (often the case on solid state drives over networks).
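As a rough illustration of why parallel transfers help when latency dominates, here is a small Python sketch that copies a tree with several files in flight at once. It is not how rclone is implemented; `copy_tree_parallel` and its `workers` parameter are hypothetical names. The default of 4 workers is chosen to mirror the default of rclone's --transfers option.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


def copy_tree_parallel(source, target, workers=4):
    """Copy every regular file under `source` to `target`, several at a time.

    Returns the number of files copied.
    """
    jobs = []
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel))
        os.makedirs(dest_dir, exist_ok=True)
        for name in filenames:
            jobs.append((os.path.join(dirpath, name),
                         os.path.join(dest_dir, name)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all copies and re-raises the first error, if any
        list(pool.map(lambda job: shutil.copy2(*job), jobs))
    return len(jobs)
```

With a single worker, each file waits for the previous one's round trips to finish; with several, those waits overlap, which is where the gain on a high-latency link comes from.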
--
Rich