On 15/8/22 06:44, Dale wrote:
> Howdy,
>
> With my new fiber internet, my poor disks are getting a work out, and
> also filling up.  First casualty, my backup disk.  I have one directory
> that is . . . well . . . huge.  It's about 7TBs or so.  This is where it
> is right now and it's still trying to pack in files.
>
>
> /dev/mapper/8tb  7.3T  7.1T  201G  98% /mnt/8tb
>
>
> Right now, I'm using rsync which doesn't compress files but does just
> update things that have changed.  I'd like to find some way, software
> but maybe there is already a tool I'm unaware of, to compress data and
> work a lot like rsync otherwise.  I looked in app-backup and there is a
> lot of options but not sure which fits best for what I want to do.
> Again, backup a directory, compress and only update with changed or new
> files.  Generally, it only adds files but sometimes a file gets replaced
> as well.  Same name but different size.
>
> I was trying to go through the list in app-backup one by one but to be
> honest, most links included only go to github or something and usually
> doesn't tell anything about how it works or anything.  Basically, as far
> as seeing if it does what I want, it's useless.  It sort of reminds me of
> quite a few USE flag descriptions.
>
> I plan to buy another hard drive pretty soon.  Next month is possible.
> If there is nothing available that does what I want, is there a way to
> use rsync and have it set to backup files starting with "a" through "k"
> to one spot and then backup "l" through "z" to another?  I could then
> split the files into two parts.  I use a script to do this now, if one
> could call my little things scripts, so even a complicated command could
> work, just may need help figuring out the command.
>
> Thoughts?  Ideas?
>
> Dale
>
> :-)  :-)
>
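On the a-k / l-z split: rsync's anchored filter rules can do that in two
passes.  A minimal sketch - the mktemp directories here are stand-ins for
your real source and the two destination disks:

```shell
command -v rsync >/dev/null 2>&1 || exit 0  # skip cleanly if rsync is absent
set -e

# Stand-in directories (the real ones would be e.g. /mnt/8tb and the new disk)
src=$(mktemp -d)
ak=$(mktemp -d)
lz=$(mktemp -d)
touch "$src/apple" "$src/kite" "$src/lemon" "$src/zebra"

# A leading '/' anchors the pattern at the transfer root, so only the
# top-level names are filtered; everything *inside* a matched directory
# still transfers normally.
rsync -a --include='/[a-k]*' --exclude='/*' "$src/" "$ak/"
rsync -a --include='/[l-z]*' --exclude='/*' "$src/" "$lz/"

ls "$ak"   # apple and kite land here
ls "$lz"   # lemon and zebra land here
```

Note the character classes only match lowercase names as written; add
`[A-K]` etc. to the includes if the directory mixes cases.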
The questions you need to ask are how compressible the data is and how
much duplication is in there.  Rsync's biggest disadvantage is that it
doesn't keep history, so if you need to restore something from last week
you are SOL.  Honestly, rsync is not a backup program and should only be
used the way you do for data you don't value, as an rsync archive is a
disaster waiting to happen from a backup point of view.
|
Look into dirvish - it uses hard links to keep files current but safe,
and it is easy to restore (each snapshot looks like an exact copy, so you
just cp the files back if needed).  The downside is that it hammers the
hard disk and has no compression, so its only space saving is
deduplication via history (my backups stabilised at about 2x the original
size for ~2 yrs of history), though you can use something like btrfs,
which has filesystem-level compression.
|
My current program is borgbackup, which is very sophisticated in how it
stores data - it's probably your best bet, in fact.  I am storing
literally tens of TB of raw data on a 4 TB USB3 disk, going back years,
and yes, I do restore regularly - not just for disasters but also from
space-efficient long-term storage I access only rarely.
|
e.g.:

A single host:

------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:                3.07 TB              1.96 TB            151.80 GB

                       Unique chunks         Total chunks
Chunk index:                 1026085             22285913

|
Then there is my offline storage - it backs up ~15 hosts (in repos like
the above) plus data storage like 22 years of email etc.  Each host backs
up to its own repo, then the offline storage backs that up.  The
deduplicated size is the actual on-disk size ... compression varies, as
it's whatever I used at the time the backup was taken ... currently I
have it set to "auto,zstd,11", but it can be mixed in the same repo (a
repo is a single backup set - you can nest repos, which is what I do - so
~45 TB stored on a 4 TB offline disk).  One advantage of a system like
this is that chunked data rarely changes, so only the differences are
backed up (read the borgbackup docs - interesting).
|
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:               28.69 TB             28.69 TB              3.81 TB

                       Unique chunks         Total chunks
Chunk index:
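For reference, driving a repo like the above only takes a few commands.
A minimal sketch with a throwaway repo and data directory standing in for
the real backup disk (I'm assuming a borg 1.x CLI here):

```shell
command -v borg >/dev/null 2>&1 || exit 0  # skip cleanly if borg isn't installed
set -e

repo=$(mktemp -d)/repo    # stand-in; the real repo sits on the backup disk
data=$(mktemp -d)
echo "some data" > "$data/file.txt"

# One-time repo setup (pick an encryption mode that suits you)
borg init --encryption=none "$repo"

# Each run makes a deduplicated, compressed archive; only chunks not
# already in the repo are actually written
borg create --stats --compression auto,zstd,11 \
    "$repo::backup-2022-08-15" "$data"

# List the archives in the repo
borg list "$repo"
```

The `--stats` output is where the Original/Compressed/Deduplicated tables
above come from.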