On Tue, Oct 16, 2012 at 1:59 PM, Rich Freeman <rich0@g.o> wrote:
> Once I have both I can start working on validation rules and perhaps
> get feedback to the conversion team. We'll need to work out what does
> and doesn't count as OK.
|
Hmm, it didn't take long after working through the cvs side of things
to realize that my design will fail to detect file deletions, at least
in part. I'm storing a history of when files change, starting by
looking at commits in complete isolation. In cvs the deletion of a
file shows up as the existence of an entry that changes the file
state. In git the deletion of a file is only marked by the absence of
the file in the tree - looking at the tree in isolation you'd never
know that there used to be a file there.
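
To make the difference concrete, here is a minimal sketch of the
pairwise comparison. It assumes each commit's tree has already been
flattened into a dict mapping path to blob hash; that representation,
the function name and the sample paths are all hypothetical, not part
of the actual validation code.

```python
def deleted_paths(parent_tree, child_tree):
    """Return the paths present in parent_tree but absent from child_tree.

    Both arguments are assumed to be dicts mapping file path -> blob hash.
    """
    return sorted(set(parent_tree) - set(child_tree))


# Hypothetical trees for two consecutive commits.
parent = {
    "app-misc/foo/foo-1.0.ebuild": "a1",
    "app-misc/foo/Manifest": "b2",
}
child = {
    "app-misc/foo/Manifest": "c3",
}

# The deletion only becomes visible when both trees are compared.
print(deleted_paths(parent, child))
```

Given only `child`, nothing indicates that foo-1.0.ebuild ever
existed, which is exactly the gap described above.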
|
Right now I can't tell when a file was deleted, or even be sure it
ever was deleted. Any commit info associated with a deletion cannot be
checked. If one of those commits also happened to modify (but not
delete) other files, then those modifications would still be checked.
|
Off the top of my head I can't think of a simple fix (other than
pairwise comparison the way that git show $commit works - which would
be much more difficult to run in parallel). One way that comes to
mind is to do a second pass that just looks for deletions, using the
cvs data to cheat - each deletion could be checked in parallel doing a
pairwise compare on a single file. If I only wanted to check that the
file got deleted at about the right time I could probably tweak the
reduce function to keep around the hash for the last time a file was
present. The author and message info would be worthless, but the
timestamp would be fairly close since we have so many commits.
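
A rough sketch of the reduce tweak, under some assumptions: each
commit, looked at in isolation, emits a (timestamp, blob_hash)
sighting for every file in its tree, and timestamps are plain
integers such as Unix seconds. The names track, reduce_file and
deletion_plausible, and the tolerance parameter, are made up for
illustration.

```python
from functools import reduce


def track(state, sighting):
    """Fold one (timestamp, blob_hash) sighting into the per-file state."""
    changes, last_seen = state
    timestamp, blob_hash = sighting
    # Record a change only when the content differs from the previous one.
    if not changes or changes[-1][1] != blob_hash:
        changes = changes + [(timestamp, blob_hash)]
    # Remember the most recent commit in which the file was still present.
    if last_seen is None or timestamp >= last_seen[0]:
        last_seen = (timestamp, blob_hash)
    return changes, last_seen


def reduce_file(sightings):
    """Reduce all sightings of one file to (change history, last sighting)."""
    return reduce(track, sorted(sightings), ([], None))


def deletion_plausible(last_seen, cvs_deleted_at, tolerance):
    """Check that the cvs deletion falls shortly after the last git sighting."""
    if last_seen is None:
        return False
    return 0 <= cvs_deleted_at - last_seen[0] <= tolerance


# Hypothetical sightings of one file, deliberately out of order.
sightings = [(200, "a"), (100, "a"), (300, "b")]
changes, last_seen = reduce_file(sightings)
print(changes, last_seen)
print(deletion_plausible(last_seen, cvs_deleted_at=350, tolerance=100))
```

This only bounds the deletion time from below (the last commit that
still had the file), so how tight the check is depends on how densely
the commits are spaced.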
|
Well, something to come back to once I get the cvs side done. If
anybody has thoughts let me know...
|
Rich |