jos houtman wrote:
> - The collection is saved on a 9TB system.
> - The backups are two off-site 4TB systems; the collection needs to be
> split over these.
Now what kind of systems are these? Home-grown arrays or "real" ones? In
the latter case, are there no vendor-provided approaches to this?

I'm not sure how this would apply to regular filesystems (no idea which
one you use, though), but in "larger" (not size-wise) systems, a bitmap
of the filesystem is kept in a separate location, and disk areas with
changed or added files are marked as dirty and transferred to the remote
host either immediately (synchronous i/o), as soon as possible (async
i/o), or when requested (veeeery async i/o ;)). This is a rather
effective system, with the backup speed mainly dependent on the block
size you choose for the bitmap (larger bitmap => smaller blocks =>
potentially less data to send) and on transfer speed. Restructuring data
on the physical disk would, however, also mark a large number of blocks
for transfer.
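For illustration, the bitmap scheme above might look roughly like this in Python. This is only a sketch of the idea — the class name, block size, and API are all made up here, and a real implementation would live in the block layer or filesystem code, not in userspace:

```python
# Hypothetical sketch of a dirty-block bitmap: writes mark the covering
# blocks dirty, and an (async) flush hands out only the dirty blocks
# for transfer to the remote host, then clears them.

BLOCK_SIZE = 4096  # smaller blocks => bigger bitmap, but less data to resend

class DirtyBitmap:
    def __init__(self, device_size: int, block_size: int = BLOCK_SIZE):
        self.block_size = block_size
        self.bits = bytearray((device_size + block_size - 1) // block_size)

    def mark_write(self, offset: int, length: int) -> None:
        """Mark every block touched by a write of [offset, offset+length)."""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for b in range(first, last + 1):
            self.bits[b] = 1

    def flush(self):
        """Yield dirty block numbers and clear them (the 'transfer' step)."""
        for b, dirty in enumerate(self.bits):
            if dirty:
                self.bits[b] = 0
                yield b

bm = DirtyBitmap(device_size=64 * 1024)    # 16 blocks of 4 KiB
bm.mark_write(offset=100, length=5000)     # spans blocks 0 and 1
bm.mark_write(offset=3 * 4096, length=10)  # block 3
print(sorted(bm.flush()))                  # -> [0, 1, 3]
```

Note how a write that crosses a block boundary dirties both blocks — that is the granularity trade-off mentioned above.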

I suppose that approach on a standard Linux filesystem would require
some extensive hacking of the fs code, which probably isn't the first
route to try.

> - Our backup window is the whole day, as long as this does not cause a
> performance drain. Reality is that we need to use the quiet night
> hours, 0 to 8.
> - The collection is stored in a set of subdirectories, each containing
> 50.000 files (1-50000, 50001-100000, etc). There are ~300 subdirs in
> use now.

Marking folders as dirty is another solution; however, 50k files per
folder is a bit big. Marking dirty files in chunks of, say, 50 or 100
would be a half-way solution, but that would be dependent on the
application [see below].
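To make the half-way idea concrete, here is a small sketch of how files could be grouped into chunks. The FILES_PER_DIR constant matches the layout quoted above (1-50000, 50001-100000, ...); the chunk size of 100 and the function name are just assumptions for illustration:

```python
# Illustrative sketch: instead of marking a whole 50,000-file subdirectory
# dirty, group files into ~100-file chunks and mark only touched chunks.

FILES_PER_DIR = 50_000
CHUNK = 100

def chunk_key(file_no: int) -> tuple:
    """Map a 1-based file number to a (subdir index, chunk index) pair."""
    subdir = (file_no - 1) // FILES_PER_DIR
    chunk = ((file_no - 1) % FILES_PER_DIR) // CHUNK
    return subdir, chunk

dirty = set()
for changed in (1, 99, 50_001, 123_456):
    dirty.add(chunk_key(changed))

print(sorted(dirty))  # -> [(0, 0), (1, 0), (2, 234)]
```

Each dirty entry then names one ~100-file chunk to re-transfer, rather than a whole 50k-file subdirectory.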
33 |
|
34 |
> Only problem is constructing the list and capturing the knowledge while |
35 |
> it is available, two options exist: |
36 |
> At system level this can be done using for example I-notify, this |
37 |
> requires a user-daemon. If the daemon crashes changes will be missed |
38 |
> though. |
39 |
> At application (the one making the changes) level this can also be done, |
40 |
> when the application crashes no changes are made, so nothing is missed. |
41 |
> But it does require making the backup dependent on the application. Not |
42 |
> an ideal situation. |
43 |
Sure, it's not ideal, but as you put it yourself "when the application |
44 |
crashes no changes are made", so there's no real loss in that case. |
45 |
Provided of course that nobody accidentally comments the wrong lines of |
46 |
code ;) |
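The application-level option quoted above could be as simple as an append-only journal that the application writes to as part of its change path, and that the nightly backup run consumes. A minimal sketch (all names are illustrative, and the rsync step is left out):

```python
import os
import tempfile

# Sketch of an application-level change journal: the application appends
# every path it modifies; the backup job reads and truncates the journal,
# then transfers just those files. If the app crashes, no change was made,
# so nothing is missed -- matching the reasoning quoted above.

def record_change(journal: str, path: str) -> None:
    """Append a changed path; O_APPEND semantics keep writers safe."""
    with open(journal, "a") as f:
        f.write(path + "\n")
        f.flush()
        os.fsync(f.fileno())  # survive a crash right after the write

def consume_changes(journal: str) -> list:
    """Backup side: read and truncate the journal, de-duplicating paths."""
    try:
        with open(journal, "r+") as f:
            paths = list(dict.fromkeys(
                line.strip() for line in f if line.strip()))
            f.truncate(0)
        return paths
    except FileNotFoundError:
        return []

j = os.path.join(tempfile.mkdtemp(), "changes.log")
record_change(j, "subdir005/200123")
record_change(j, "subdir005/200123")  # duplicates collapse on consume
record_change(j, "subdir001/42")
print(consume_changes(j))  # -> ['subdir005/200123', 'subdir001/42']
print(consume_changes(j))  # -> []
```

The fsync on every append is the price for the crash-safety argument; batching writes would trade some of that safety for speed.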


Not sure if this is of any help to you; I've mainly been involved with
these kinds of setups as hardware solutions, so I'm at a loss as to how
they relate to a software approach. And I'm lacking caffeine ;)

/Björn
--
gentoo-server@g.o mailing list