Hi!

On Wed, Apr 19, 2006 at 02:08:56PM +0200, jos houtman wrote:
> current situation:
> - The collection is stored in a set of subdirectories each containing
> 50.000 files. (1-50000,50001-100000, etc). There are ~300 subdirs in use
> now.
> - Files are never deleted.
> - In the future it can happen that files change. my exception is that
> atmost a few thousand files a day will change, scattered over the whole
> collection with an emphasis on the most recent files.
[cut]
> Only problem is constructing the list and capturing the knowledge while
> it is available, two options exist:
> At system level this can be done using for example I-notify, this
> requires a user-daemon. If the daemon crashes changes will be missed
> though.
> At application (the one making the changes) level this can also be done,
> when the application crashes no changes are made, so nothing is missed.
> But it does require making the backup dependent on the application. Not
> an ideal situation.

First of all, this issue isn't Gentoo-specific, so it should at least be
marked [OT] in the subject, I think. ;-)

My experience with complex backups says: it's nearly impossible to make
an effective (fast and reliable) backup for a complex application unless
that application was written with backup in mind.

In your case that means, for example: probably the best solution to the
backup issue is to change the way files are modified, so that a changed
file isn't really CHANGED; instead, the new version is just ADDED to the
collection. This way it will be enough to remember which file was backed
up last by the previous backup and, on the next backup, continue from that
file (I suppose all your files are numbered: "(1-50000,50001-100000, etc)").

This way the backup time will not depend on the collection size (only on
the number of added files) and will not depend on some "special feature" in
the application (like constructing a list of changed files) which may have
bugs.

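The "remember the last backed-up file and continue from there" idea could
look roughly like the sketch below. It assumes things the original post does
not state: that each file is named by its plain number, that range
directories are named like 1-50000, that the application can tell you the
newest file number, and that tar accepts a file list on stdin via "-T -"
(GNU tar does). The function names, the state file and the arguments are all
made up for illustration.

```shell
# Sketch only: layout and names below are assumptions, not the real setup.

# Print the range directory holding file number $1 (chunk size $2,
# default 50000), e.g. 50001 -> 50001-100000.
range_dir() {
    rd_n=$1; rd_size=${2:-50000}
    rd_start=$(( (rd_n - 1) / rd_size * rd_size + 1 ))
    rd_end=$(( rd_start + rd_size - 1 ))
    echo "${rd_start}-${rd_end}"
}

# Archive every file numbered above the value stored in the state file,
# up to and including $4, then record $4 as the new high-water mark.
#   $1 collection root   $2 state file   $3 archive to create
#   $4 newest file number (supplied by the application)   $5 chunk size
backup_new_files() {
    collection=$1; state=$2; archive=$3; newest=$4; size=${5:-50000}
    last=0
    [ -f "$state" ] && last=$(cat "$state")
    # Nothing new since the previous run.
    [ "$newest" -gt "$last" ] || return 0
    n=$(( last + 1 ))
    # Emit one relative path per new file; tar reads the list from stdin.
    while [ "$n" -le "$newest" ]; do
        printf '%s/%s\n' "$(range_dir "$n" "$size")" "$n"
        n=$(( n + 1 ))
    done | tar -cf "$archive" -C "$collection" -T - &&
    # Only advance the mark if tar succeeded.
    echo "$newest" > "$state"
}
```

For example, "backup_new_files /collection /var/lib/backup/last
/backup/incr.tar 15000123" (hypothetical paths and number) would archive only
the files numbered above the stored value. If a number in the range is
missing, tar fails and the state file is left untouched, so the next run
retries the same range.
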
If your application needs the newer version of a file to have the same name
as the previous version and this behaviour can't be changed, then you can
consider special solutions like: after ADDING the newer version to the
collection, replace the previous version with a symlink to the newer
version. To back up these symlinks you will need an additional step like:
  find /collection -type l -print0 | xargs -0 tar ...
I've no idea whether "find -type l" will be fast enough for you, but I
suppose it will be much, much faster than rsync, simply because it
doesn't need to read all the files in the collection and calculate their
checksums.

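For the symlink variant, here is a sketch under similar assumptions.
replace_with_symlink and backup_symlinks are hypothetical helper names, and
the tar invocation is only one possible reading of the "tar ..." above, which
stays elided. One thing to watch with xargs: if the list of symlinks is
longer than one command line, xargs runs tar several times, and with "tar -c"
each run would overwrite the archive written by the previous one. Feeding the
whole list to a single tar through "--null -T -" (GNU tar) avoids that.

```shell
# Sketch only; helper names are hypothetical.

# Replace the old version ($1) with a symlink pointing at $2.
# A relative $2 is resolved relative to the directory of $1.
# The link is created under a temporary name and renamed into place,
# so the old name never goes missing for readers.
replace_with_symlink() {
    rs_old=$1; rs_target=$2
    ln -s "$rs_target" "$rs_old.tmp.$$" &&
    mv -f "$rs_old.tmp.$$" "$rs_old"
}

# Archive all symlinks below $1 into tar file $2 with a single tar run.
# Symlinks are stored as links (tar does not follow them without -h).
backup_symlinks() {
    bs_root=$1; bs_archive=$2
    ( cd "$bs_root" && find . -type l -print0 ) |
        tar -cf "$bs_archive" -C "$bs_root" --null -T -
}
```

Usage would be something like "replace_with_symlink /collection/1-50000/42
../50001-100000/50017" after adding the new version, and later
"backup_symlinks /collection /backup/links.tar" (paths and numbers are
examples only).
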
--
WBR, Alex.
--
gentoo-server@g.o mailing list