Gentoo Archives: gentoo-user

From: Bill Kenworthy <billk@×××××××××.au>
To: gentoo-user@l.g.o
Subject: Re: [gentoo-user] backing up a partition
Date: Sat, 25 Aug 2018 00:01:29
Message-Id: 342dd3b2-04c8-647f-0a8a-cdd6dca77264@iinet.net.au
In Reply to: Re: [gentoo-user] backing up a partition by Rich Freeman
On 24/08/18 20:53, Rich Freeman wrote:
> On Fri, Aug 24, 2018 at 6:09 AM Mick <michaelkintzios@×××××.com> wrote:
>> However, you may prefer to use clonezilla instead of dd. The dd command will
>> copy each and every bit and byte of the partition whether it has data on it or
>> not. It is not particularly efficient. Clonezilla will perform better at
>> this task.
>>
>> Personally, I would only keep a backup of the filesystem contents with e.g.
>> rsync, and reformat the partition and restore its contents in the case of a
>> disaster recovery scenario.
> Just to summarize the sorts of options you have:
>
> dd = bit level copy. Output is the size of the partition, period,
> though you could compress the output by piping it into a compression
> utility/etc. Restored partition is identical to original, including
> unallocated space, file fragmentation, etc.
>
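(For reference, a raw dd image piped through gzip might look something
like the sketch below; the device name and backup path are just
placeholders for your own setup:)

    # image the partition, compressing the otherwise full-size output
    dd if=/dev/sda3 bs=1M status=progress | gzip -c > /mnt/backup/sda3.img.gz

    # restore it later, bit-for-bit, onto a partition of the same size
    gunzip -c /mnt/backup/sda3.img.gz | dd of=/dev/sda3 bs=1M
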
> clonezilla/partimage/etc = sparse bit level copy. Output is the size
> of all blocks that contain useful data, and can be further compressed.
> Restored partition will contain zeros in the place of free space, but
> will still preserve file fragmentation, special filesystem features,
> etc. Basically these tools operate like dd at a block level, but they
> first identify which blocks are used/unused. Savings is minimal for a
> full filesystem, and substantial for a near-empty one. These tools
> will fall back to dd if they can't identify free space, and can
> support a wider variety of filesystems quickly because they don't have
> to be able to mount/read the filesystem, just figure out which blocks
> matter. I'll also note that with clonezilla you get a fairly nice
> all-in-one bootable image that can store these images remotely via
> ssh/samba/etc, which makes restoring images onto bare metal very easy.
>
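(Clonezilla uses partclone under the hood; if you only need a single
filesystem you can run partclone directly. A rough sketch, again with
made-up device and paths:)

    # save only the used blocks of an ext4 partition
    partclone.ext4 -c -s /dev/sda3 -o /mnt/backup/sda3.pcl

    # restore the image onto a partition of at least the same size
    partclone.ext4 -r -s /mnt/backup/sda3.pcl -o /dev/sda3
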
> tar/rsync/etc = file level copy. Output is the logical size of all
> the files on the filesystem. Restored partition will only contain file
> contents - details like fragmentation, trailing unused space in
> blocks, unused space in general, or many filesystem-specific features
> like snapshots/etc will NOT be preserved. On the other hand it is
> trivial to restore this data to any filesystem of any type of any
> sufficient size. The other solutions make resizing or changing
> filesystems more-or-less impossible unless you can mount the image
> files and then do a subsequent file-level copy (which is no different
> than doing a file level copy in the first place).
>
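(A typical file-level copy with rsync, preserving permissions, hard
links, ACLs and xattrs - the destination path is just an example:)

    # copy the mounted filesystem's contents to a backup area
    rsync -aHAX --numeric-ids --delete /home/ /mnt/backup/home/

    # or roll it into a compressed tarball instead
    tar --xattrs --acls -cpzf /mnt/backup/home.tar.gz -C / home
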
> I'd toss in one other general category:
>
> dump/send/etc - filesystem-specific serializing tools. The tools are
> specific to the filesystem, so you can't just point them at a whole
> hard drive with varying partition types like you can with clonezilla.
> They may or may not reproduce details like fragmentation, but they
> will efficiently store the actual data and will reproduce all
> filesystem-specific features (snapshots, special attributes, etc).
> They may also contain features that make them more efficient
> (especially for incremental backups) because they can use an algorithm
> suited for the low-level data structures employed by the filesystem,
> instead of doing scanning at the file/directory level. For example,
> it could just read all the metadata on the disk sequentially as it is
> physically stored on the disk, instead of traversing it from root down
> to leaf in the directory hierarchy which could result in lots of
> seeks. Filesystems like btrfs/zfs have data structures that make it
> VERY efficient to compare two related snapshots and find just the
> differences between them, including differences of one block in the
> middle of a large file without having to read the whole file.
> Restoration usually is flexible with regard to filesystem size, but
> not type. That is, if you have a 100GB filesystem with 20GB of data,
> you could restore it to a 30GB filesystem of the same type, but not
> one of a different type as with tar.
>
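(On btrfs the incremental snapshot-diff case looks roughly like this;
the subvolume and snapshot paths are invented for the example:)

    # take a new read-only snapshot of the subvolume
    btrfs subvolume snapshot -r /home /home/.snapshots/home-new

    # send only the changes since the previous snapshot to the backup disk
    btrfs send -p /home/.snapshots/home-old /home/.snapshots/home-new \
        | btrfs receive /mnt/backup/
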
> The best solution for you obviously depends on your needs. I try to
> go with the last category in general as it is far more efficient.
> But, clonezilla is my general tool for replicating whole systems/etc
> since it does that so well and works with anything. For partial
> backups of high-value data I use duplicity, which is file-level (and
> supports various cloud/etc options for storage).
>
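(duplicity takes a source directory and a target URL; the host and
paths here are made up, and after the first full run later runs are
incremental by default:)

    duplicity /home/user sftp://user@backuphost//srv/backups/home

    # force a fresh full backup once the incremental chain gets old
    duplicity --full-if-older-than 1M /home/user sftp://user@backuphost//srv/backups/home
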
And another category: do a proper backup using dedicated backup software.
Copying the raw partition has some disadvantages: it is difficult (but not
impossible) to restore to new hardware after a failure, it is harder to
restore if you decide to change the underlying filesystem, and it takes
far longer to copy than a backup. The biggest problem, though, is that
there is no versioning. Raw copies are great if you want to do an
immediate restore, but are hard work or useless after a few days of
changes. Think of it this way: in almost all cases it's the data that's
important, not what's holding the data.
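(As one illustration, something like app-backup/borgbackup keeps
deduplicated, versioned archives, so you can pull back last week's copy
of a file as easily as yesterday's; the repository path below is only an
example:)

    # one-off: create the repository
    borg init --encryption=repokey /mnt/backup/borg-repo

    # nightly: add a new versioned archive of /home
    borg create --stats /mnt/backup/borg-repo::home-{now} /home

    # thin out old versions on a retention schedule
    borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 /mnt/backup/borg-repo
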


BillK