Gentoo Archives: gentoo-user

From: Simon <turner25@×××××.com>
To: gentoo-user@l.g.o
Subject: [gentoo-user] File synchronisation utility (searching for/about to program it)
Date: Wed, 22 Jul 2009 15:09:26
Message-Id: 5f14cf5e0907220809ud14a99dq81950fba1c45b495@mail.gmail.com
1 Hi there! I was about to jump into programming my own sync
2 utility when I thought: maybe I should ask whether one exists first! Also,
3 this is not really Gentoo-related: it doesn't deal with the OS or
4 Portage... but I'm asking the venerable community at large, so
5 excuse me if you find this post inappropriate (and can you suggest a
6 more appropriate audience?).
7
8 There are lots of sync utilities out there, but my search hasn't found
9 the one utility that has all the features I require. Most lack some
10 of these features, and some have undesirable limitations... I'm
11 currently using Unison for all my sync needs. It's the best I've found so
12 far, but it is very limited in some respects and a bit painful on
13 my setup. To be clear, I refuse to even consider network
14 filesystems, because I need each computer to be fully
15 independent of the others. I sync my important files so as to have a
16 working backup on all my PCs (my laptop breaks? Fine, I just start my
17 desktop and continue working transparently, well, with the last sync'ed
18 files). Any kind of network filesystem could be considered for doing the file
19 transfers, but I don't think any of them can compete with rsync, so
20 they're out of the question.
21
22 Now, I know some of you will have the reflex to say: try such-and-such tool, it
23 supports 4 out of your 5 requirements. Or: try such-and-such tool, it supports
24 them all, but you'll have to bend things a bit to make it work the way
25 you want... I'm looking for the perfect solution, and if it doesn't
26 exist, well, I'm about to code it in C or C++. I have the design ready,
27 and the concept is very simple yet provides all my features. I wish
28 to publish the result as open source software (probably under a license like
29 BSD or maybe LGPL, hopefully not GPL), and what I'm about to
30 code will be compatible with Linux and Mac OS X for sure. A port to Windows
31 will require some dumb extensions (such as Windows path to Unix path
32 conversion, and file transfer support). It will have very few
33 dependencies. My project intends to use rsync for the transfer, so it
34 will basically extend rsync with all my required features.
35 Rsync does the transfer; I can't compete with how good rsync is at
36 transferring (it works through ssh, rsh, or its own daemon, does
37 differential transfers, transfers attributes/ownership...), but my
38 project will be better at finding what needs to be transferred and what
39 needs to be deleted, on as many computers as you want and in one
40 shot.
41
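The split described above (the tool decides what to send, rsync does the sending) could look roughly like the following Python sketch. The function names, the host name and the paths are hypothetical and only serve the illustration; the rsync options used (`-a`, `-e`, `--files-from`, `--from0`) are standard ones.

```python
def rsync_push_command(src_root, remote, dest_root):
    """Build an rsync argv that sends only the paths listed on stdin.

    src_root, remote and dest_root are illustrative parameters,
    not part of any existing tool.
    """
    return [
        "rsync",
        "-a",               # archive mode: keep permissions, times, ownership
        "-e", "ssh",        # run over ssh, so no daemon has to be listening
        "--files-from=-",   # read the list of relative paths from stdin
        "--from0",          # the list is NUL-separated (safe for odd names)
        src_root.rstrip("/") + "/",
        f"{remote}:{dest_root.rstrip('/')}/",
    ]


def files_from_payload(paths):
    """Encode relative paths as the NUL-separated list rsync will read."""
    return b"".join(p.encode() + b"\0" for p in sorted(paths))


cmd = rsync_push_command("/home/simon/docs", "desktop", "/home/simon/docs")
payload = files_from_payload(["b.txt", "a/c.txt"])
# To actually run the transfer (not executed here):
#   import subprocess
#   subprocess.run(cmd, input=payload, check=True)
```

Deletions are not covered by this command; in such a design they would have to be carried out as a separate step on the target host.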
42 Here are the features that I seek/require (and that I will be programming
43 if no utility can provide them all; the list is actually longer, but I
44 can live without the items not written here):
45
46 -Little space requirements: I could use rsync to make an
47 incremental backup using hardlinks, and basically just copy whatever
48 is "new" on each replica, but this takes way too much space and still
49 doesn't deal with deletes properly. For example, a file is on A and B, gets
50 deleted on A and on B, and is then recreated on B. In reality we have a new
51 file on B, but rsync might want to delete this new file on B, thinking
52 it's the file that got deleted on A. Unison works admirably here: it
53 finds that the first file effectively got deleted on both (nothing to do)
54 and that a new file appeared on B which needs to be transferred to A. The
55 space Unison uses to cache its data is about 100 MB now, and I haven't
56 cleaned it since I started using it. I believe more than half of it
57 could be removed, and even 100 MB still represents only about 1% of what is
58 sync'ed.
59
60 -Server-less: I don't want to maintain a server on even a single
61 computer. I like Unison since it executes the server through ssh only
62 when used; it's never listening, it's never started at boot time.
63 This is excellent behavior and simplifies maintenance.
64
65 -Bidirectional pair-wise sync: Meaning I can start the sync from
66 host A or from host B; the process should be the same, should take the
67 same amount of time, and the result should be the same. I should never have to
68 care where the sync is initiated. (Unison doesn't support this, but
69 it's OK to sync from both directions, it's just not optimised.)
70
71 -Star topology: Or any topology that allows syncing multiple
72 computers at once... I'm tired of doing several pairwise syncs, since
73 to do a full sync of my 3 computers (called A, B and C), I first have
74 to sync A->B and A->C; at this point A contains all the diffs and is
75 sync'ed, but I have to do A->B and A->C once more to sync the
76 others (i.e. so B gets C's modifications).
77
78 -Anarchic mode: Hehe, whatever you call it. Using the same 3 hosts,
79 I'd like to be able to do a pairwise sync between A->B, A->C and also
80 B->C, to have the sync process decentralised... This is possible
81 with Unison, but of course I have to ssh to the remote host I want to
82 sync with another remote host.
83
84 -Intelligent conflict resolution: Let's face it, the sync utility
85 wasn't gifted with artificial intelligence, so why bother? It should
86 depend on the user's intelligence, but it should depend on it
87 intelligently. Meaning, it should remember (if the user wants it to) the
88 resolution of a given conflict and always resolve it that way. This
89 could effectively help in having some files mirrored from A->B, some
90 others mirrored from B->A, some others backed up before being
91 overwritten, and some that would always require user interaction (like my
92 current project's file)... This is a matter of preference, and any
93 utility that doesn't understand this works against me. No tool I've
94 encountered supports this. Unison could do some of these, but I would
95 have to break the sync'ing process into multiple smaller syncs, and
96 most tools will just shoot out a list of all conflicts and ask whether to
97 keep local, keep remote, ignore, or cancel, and this for each and every
98 conflict (the list is long, the cancel option is tempting!).
99
100 -Friendly config/maintenance: I have the friendly user in mind (me),
101 meaning the tool should be user-friendly! User-friendly doesn't mean a
102 graphical interface with lots of eye candy (this makes people fat, it's
103 hostile to me, not friendly at all!). Rather, I like to have only
104 one config file to edit for all my needs, or a directory containing
105 one level of files, a few files, each logically separated (think of
106 /etc/portage) and, most of all, documented and intuitive.
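
The delete handling from the "Little space requirements" item and the one-shot multi-host sync from the "Star topology" item both come down to comparing every replica against a small record of the last sync. Below is a minimal Python sketch of that idea, not the design mentioned in the post. The `reconcile` function, the fingerprint strings and the action tuples are all hypothetical. It also assumes one specific policy, chosen to match the outcome described in the post: a deletion gives way to a new version of the same path. A stricter tool could report that case as a conflict instead.

```python
def reconcile(base, replicas):
    """Plan one sync pass for any number of replicas.

    base:     {path: fingerprint} recorded after the last successful sync
    replicas: {name: {path: fingerprint}} as scanned right now
    Returns (actions, conflicts). Each action is (op, path, source, target).
    """
    actions, conflicts = [], []
    paths = set(base)
    for state in replicas.values():
        paths.update(state)

    for path in sorted(paths):
        old = base.get(path)  # None means "did not exist at last sync"
        current = {name: state.get(path) for name, state in replicas.items()}
        changed = {name: fp for name, fp in current.items() if fp != old}
        if not changed:
            continue  # nobody touched this path

        # Assumed policy: deletions (None) lose against surviving content.
        new_versions = {fp for fp in changed.values() if fp is not None}
        if len(new_versions) > 1:
            conflicts.append(path)  # two different new versions: ask the user
        elif new_versions:
            winner = next(iter(new_versions))
            source = min(n for n, fp in current.items() if fp == winner)
            for name in sorted(current):
                if current[name] != winner:
                    actions.append(("copy", path, source, name))
        else:
            # Every change was a deletion: remove the remaining copies.
            for name in sorted(current):
                if current[name] is not None:
                    actions.append(("delete", path, None, name))
    return actions, conflicts


base = {"notes.txt": "v1", "old.txt": "v1"}
replicas = {
    "A": {"old.txt": "v1"},                      # notes.txt deleted on A
    "B": {"notes.txt": "v2", "old.txt": "v1"},   # notes.txt recreated on B
    "C": {"notes.txt": "v1"},                    # old.txt deleted on C
}
actions, conflicts = reconcile(base, replicas)
```

With this input the plan copies B's `notes.txt` to A and C and deletes `old.txt` on A and B in a single pass, with no second round of pairwise syncs. The stored `base` is only a path-to-fingerprint map, so it stays small compared with hardlinked snapshots.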
107
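The remembered resolutions from the "Intelligent conflict resolution" item could be kept as an ordered list of pattern rules that is consulted before the user is asked. The following Python sketch is one possible shape, under assumptions: the class name, the resolution strings ("keep-local", "keep-remote", "ask") and the `ask_user` callback are invented for the illustration, and the rules are stored as JSON only because that keeps the file easy to read and edit by hand.

```python
import fnmatch
import json


class ResolutionMemory:
    """Ordered (pattern, resolution) rules; the newest rule is checked first."""

    def __init__(self, rules=None):
        self.rules = list(rules or [])

    def lookup(self, path):
        for pattern, resolution in self.rules:
            # Note: in fnmatch patterns, "*" also matches "/" characters.
            if fnmatch.fnmatchcase(path, pattern):
                return resolution
        return None

    def remember(self, pattern, resolution):
        self.rules.insert(0, (pattern, resolution))

    def dumps(self):
        return json.dumps(self.rules)

    @classmethod
    def loads(cls, text):
        return cls([tuple(rule) for rule in json.loads(text)])


def resolve(path, memory, ask_user):
    """Return the resolution for a conflicting path.

    ask_user(path) must return (resolution, pattern_or_None); a pattern
    means "remember this answer for every path matching the pattern".
    """
    remembered = memory.lookup(path)
    if remembered is not None and remembered != "ask":
        return remembered
    resolution, pattern = ask_user(path)
    if pattern is not None:
        memory.remember(pattern, resolution)
    return resolution


asked = []

def ask_once(path):
    asked.append(path)
    return ("keep-local", "docs/*")  # the user chose to remember the answer

memory = ResolutionMemory()
first = resolve("docs/a.txt", memory, ask_once)
second = resolve("docs/b.txt", memory, ask_once)  # answered from memory
```

A rule with the resolution "ask" can be placed in front of a broader rule to keep selected files interactive, which covers the "always require user interaction" case.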
108 These are the features I need most. I am tired of 'working around'
109 limitations or missing features. I am tired of having to do multiple
110 syncs to get my whole house up to date.
111
112 And finally, thanks to those who were interested enough in my post to
113 read this far (unless you jumped straight here, but thank you
114 still for taking the time!). I'm desperate to create a project that
115 will be useful to me and hopefully to others too. I'm a very good
116 C/C++/PHP/JS programmer, but I could only rarely find work in that
117 field since I have no diploma (a high school diploma from 10 years ago,
118 that's all). Due to some illness I've lived a terribly unstable life,
119 and I've had an exploratory tendency in development, meaning I've
120 started about 10K projects but finished none. I have published
121 nothing so far... In other words, I am nobody, and for companies I am
122 a risk. Even if I ask half the usual salary, it's still a division by
123 zero: salary divided by zero credibility (i.e. no diploma and no work
124 experience). If I can build this project (on my own to start with) and
125 publish it, I think it would help me a lot professionally. Also, once
126 the first version is out, I'll gladly welcome patches from the
127 community, and having a team to work with will help even more. Also, very
128 important to note: I am currently unemployed, collecting unemployment
129 insurance as income. I still have about 2 months of free time left to
130 get my professional situation back on track, and 2 months of my
131 expertise is more than enough to get a good, stable beta version of
132 this project. But I need to get it started, and I must be convinced this
133 is the right choice.
134
135 Thanks for reading, and hoping to read your answers!
136 Simon

Replies

Subject Author
Re: [gentoo-user] File synchronisation utility (searching for/about to program it) "Alan E. Davis" <lngndvs@×××××.com>