On Wed, Oct 17, 2012 at 8:49 AM, Peter Stuge <peter@×××××.se> wrote:
> Rich Freeman wrote:
>> I'm storing a history of when files change, starting by looking at
>> commits in complete isolation.
>
> The more gitty way would be to use the trees.
|
I was being a bit informal in my description. The actual map/reduce
steps don't look at commits at all - only trees/blobs. The initial
parse of the commits extracts all the trees and the commit info we
care about. There is no way to get to blobs from commits except
through trees.
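
To make the commit -> tree -> blob relationship concrete, here is a
minimal sketch using a toy in-memory object store. The dict layout and
the names `blobs_of_commit`, `trees` and `commit` are assumptions made
up for illustration; this is not the actual parser.

```python
def blobs_of_commit(commit, trees):
    """Yield (path, blob_id) for every blob reachable from a commit.

    `commit` is a dict with a "tree" key naming its root tree.
    `trees` maps a tree id to {entry_name: (kind, object_id)}, where
    kind is "tree" or "blob".  Both layouts are hypothetical.
    """
    # The commit only names its root tree; blobs are found by walking down.
    stack = [("", commit["tree"])]
    while stack:
        prefix, tree_id = stack.pop()
        for name, (kind, obj_id) in sorted(trees[tree_id].items()):
            path = prefix + name
            if kind == "tree":
                stack.append((path + "/", obj_id))
            else:
                yield path, obj_id


trees = {
    "t_root": {"README": ("blob", "b1"), "src": ("tree", "t_src")},
    "t_src": {"main.c": ("blob", "b2")},
}
commit = {"id": "c1", "tree": "t_root"}
found = sorted(blobs_of_commit(commit, trees))
```

Once the initial parse has written out the tree ids, nothing downstream
needs the commit objects themselves.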
|
>
> It would be two passes. First pass tree[n%2=0] with tree[n+1] and on
> second pass tree[n%2=1] with tree[n+1].
|
Sure, and you can break that down much further. If you write each
commit with the one preceding it on the same line you can even do it
with map. It is a big change in the algorithm. Actually, by doing it
that way you could just do a complete pairwise compare of the whole
tree and glean everything in a single iteration. I think. To do it
I'd just write each line of my CSV with twice as many fields - the
second half of each line being the first half of the next, or the one
before, or however it works. First or last would be a special case.
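
A rough sketch of the double-width rows described above, assuming each
record is just (commit id, tree id). Those two columns and the function
names `pair_with_previous` and `tree_changed` are hypothetical; the real
CSV carries more fields.

```python
def pair_with_previous(rows):
    """Yield each row followed by the fields of the row before it.

    The first row has no predecessor, so its second half is padded
    with empty strings - the special case mentioned above.
    """
    previous = None
    for row in rows:
        row = list(row)
        filler = previous if previous is not None else [""] * len(row)
        yield row + filler
        previous = row


def tree_changed(paired_row):
    """Map step: compare this row's tree id with its predecessor's."""
    half = len(paired_row) // 2
    # Column 1 is the tree id in each half (an assumed layout).
    return paired_row[1] != paired_row[half + 1]


rows = [["c1", "t1"], ["c2", "t1"], ["c3", "t2"]]
paired = list(pair_with_previous(rows))
flags = list(map(tree_changed, paired))
```

Because every paired row carries both sides of the comparison, each one
can be handed to a separate mapper with no shared state. Here the first
commit counts as a change because its predecessor half is empty.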
|
>> One way that comes to mind is to do a second pass that just looks
>> for deletions, using the cvs data to cheat - each deletion could be
>> checked in parallel doing a pairwise compare on a single file.
>
> It would compare trees, not files, but sure, that works too. It will
> not notice if something extra has been deleted in git however, if the
> same file does not have any further changes in CVS.
|
Sort of. If a file is deleted early then it will show up as already
missing at the point where it should have been deleted. Files that
were never deleted in CVS but which were deleted in git are very easy
to detect - a simple file compare of a checkout of each would spot
that.
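
A minimal sketch of that checkout comparison, assuming both checkouts
sit in local directories. The names `relative_files` and
`missing_in_git` are made up for illustration, and it only compares
which paths exist, not their contents.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata so only tracked content is compared.
        dirnames[:] = [d for d in dirnames if d not in (".git", "CVS")]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def missing_in_git(cvs_checkout, git_checkout):
    """List files present in the CVS checkout but absent from the git one."""
    return sorted(relative_files(cvs_checkout) - relative_files(git_checkout))
```

Anything this returns is a file that still exists in CVS but has gone
missing on the git side.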
|
Rich