Gentoo Archives: gentoo-dev

From:	Brian Harring <ferringb@×××××.com>
To:	mgorny@g.o
Cc:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] Re: metadata/md5-cache
Date:	Mon, 04 Jun 2012 13:15:52
Message-Id:	`20120604131513.GA23002@localhost`
In Reply to:	Re: [gentoo-dev] Re: metadata/md5-cache by "Michał Górny"

1	On Mon, Jun 04, 2012 at 09:27:10AM +0200, Michał Górny wrote:
2	> On Sun, 3 Jun 2012 09:48:26 +0000
3	> "Robin H. Johnson" <robbat2@g.o> wrote:
4	>
5	> > On Sun, Jun 03, 2012 at 11:34:07AM +0200, Michał Górny wrote:
6	> > > I mean using a separate protocol for metadata, not necessarily git. In
7	> > > any case, if it comes to transferring a lot of frequently-changing
8	> > > files, rsync is not that efficient...
9	> > It does NOT send any of the intermediate states.
10	>
11	> But it does have to check all the files.
12
13	Which is a pretty minimal cost in the grand scheme of things. You
14	also need to figure out which 'efficiency' you're talking about
15	here: network IO, disk IO, CPU time, etc. Most people in this case care
16	about network IO; rsync's not perfect, but for the reasons described
17	below, it's the best of breed for this usage scenario.
18
19	> Did I mention I'm not talking necessarily about git?
20
21	Git would be the sanest choice if you were after this; it already does
22	point-to-point delta transformations sanely. No point in reinventing a
23	VCS; if you can't force the tree back to a known good state (i.e.,
24	distributed VCS), you can't apply deltas to it, in which case you need
25	an rsync-like algorithm.
26
27
28	> Rather anything which would just
29	> look up our timestamp, revision or whatever and just send what has
30	> changed, in a packed manner.
31
32	This would be reinventing git/VCS, or more likely, pretending that a
33	timestamp file automatically means the repository is unmodified, and
34	trying to do a point-to-point transformation on it. Where your
35	notion breaks down is that fun little bit about "unmodified".
36
37	This is why rsync is used; it's not limited to a point-to-point
38	transformation, it's able to work efficiently from any starting
39	point.
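
To make the distinction concrete, here is a toy sketch in Python. It is not how rsync or git are implemented: the tree is modelled as a dict of path to contents, every name in it (`Tree`, `tree_id`, `make_delta`, `apply_delta`, `checksum_sync`) is made up for illustration, and it compares whole files, whereas real rsync can also work at block level inside a file. The point is only that a delta is tied to the exact state it was built against, while a checksum comparison needs no such assumption.

```python
import hashlib
from typing import Dict, List, Tuple

Tree = Dict[str, bytes]  # path -> file contents (toy stand-in for a repository)


def file_sum(data: bytes) -> str:
    """Checksum of a single file's contents."""
    return hashlib.md5(data).hexdigest()


def tree_id(tree: Tree) -> str:
    """Identifier for one exact tree state (stand-in for a revision)."""
    h = hashlib.sha256()
    for path in sorted(tree):
        h.update(f"{path}\0{file_sum(tree[path])}\n".encode())
    return h.hexdigest()


def make_delta(old: Tree, new: Tree) -> dict:
    """Point-to-point delta: only meaningful against exactly `old`."""
    return {
        "base": tree_id(old),
        "write": {p: c for p, c in new.items() if old.get(p) != c},
        "delete": [p for p in old if p not in new],
    }


def apply_delta(tree: Tree, delta: dict) -> Tree:
    """Apply a delta, refusing if the local tree is not the expected base."""
    if tree_id(tree) != delta["base"]:
        raise ValueError("local tree is not the state the delta was built against")
    result = {p: c for p, c in tree.items() if p not in delta["delete"]}
    result.update(delta["write"])
    return result


def checksum_sync(local: Tree, remote: Tree) -> Tuple[Tree, List[str]]:
    """rsync-like sync: compare per-file checksums, send only what differs.

    Works from any local state. Returns the synced tree and the paths sent.
    """
    local_sums = {p: file_sum(c) for p, c in local.items()}
    to_send = sorted(
        p for p, c in remote.items() if local_sums.get(p) != file_sum(c)
    )
    # Files that no longer exist upstream are dropped locally.
    result = {p: c for p, c in local.items() if p in remote}
    for p in to_send:
        result[p] = remote[p]
    return result, to_send
```

With a locally edited tree, `apply_delta` raises because the base no longer matches, while `checksum_sync` still converges on the remote state and only resends the files whose checksums differ.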
40
41	Either way, I suggest you do some research into this, including the
42	efficiencies of rsync, git, and the existing snapshot delta rsync
43	machinery (tarsync, diffball, etc.), and study the trade-offs inherent
44	in each. Your initial email frankly reeks of NIH, hence my suggestion
45	to go investigate what exists now.
46
47	~harring
