Gentoo Archives: gentoo-scm

From:	Rich Freeman <rich0@g.o>
To:	gentoo-scm@l.g.o
Subject:	[gentoo-scm] Git Conversion Validation
Date:	Sun, 07 Oct 2012 22:23:29
Message-Id:	`CAGfcS_=hPO_AcY8pAwC70x-C0AbSUUFxKi7PpyRsQk1iGsLiGg@mail.gmail.com`

FYI - I started a repository of my git validation work at:
git://github.com/rich0/gitvalidate.git

I'm starting on the git side first. I'm taking all my data directly
from the git executables and plan to do the same for cvs - if they
output the same content we should be OK. I did some testing and I
think that my code should handle Unicode output if git generates it.

The git repository has 1259922 commits, and it takes 50.5 seconds to
walk the list of commits to produce a list of trees and their commit
info.
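
A minimal sketch of such a walk, assuming the data comes from a single
`git log` process rather than one process per commit. The function names
(`parse_record`, `walk_commits`) and the choice of fields are
illustrative assumptions, not the code in the gitvalidate repository.

```python
import subprocess

# One record per line: commit hash, tree hash, author timestamp, author name.
# Fields are separated by NUL bytes so spaces in names cannot confuse parsing.
LOG_FORMAT = "%H%x00%T%x00%at%x00%an"


def parse_record(raw_line):
    """Turn one raw output line (bytes) into (commit, tree, timestamp, author)."""
    # surrogateescape keeps undecodable bytes round-trippable instead of failing.
    text = raw_line.decode("utf-8", errors="surrogateescape").rstrip("\n")
    commit, tree, timestamp, author = text.split("\x00", 3)
    return commit, tree, int(timestamp), author


def walk_commits(repo_path):
    """Yield one parsed record per commit reachable from any ref."""
    cmd = ["git", "log", "--all", "--pretty=tformat:" + LOG_FORMAT]
    with subprocess.Popen(cmd, cwd=repo_path, stdout=subprocess.PIPE) as proc:
        for raw_line in proc.stdout:
            yield parse_record(raw_line)
```

Streaming the output line by line keeps memory use flat even with more
than a million commits, and only one git process is spawned for the
whole walk.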

Next step is to iteratively perform the map / reduce algorithm I
outlined earlier to get a per-file history similar to what cvs
captures.

Contributions welcome. I'm finding the main issue is cutting down the
overhead of spawning git processes to do the work. While it will make
for more work in theory, I might just have git-ls-tree recurse the
trees to reduce the subprocess overhead and then just do the extra
sorting/de-duplication in Python. I'm trying to avoid using git
implementations in Python since that might expose us to bugs.
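
The trade-off described above could be sketched roughly as follows: one
`git ls-tree -r -z` call per root tree, with de-duplication and sorting
done afterwards in Python. The helper names (`parse_ls_tree`,
`list_tree`, `unique_file_versions`) are hypothetical and only
illustrate the idea.

```python
import subprocess


def parse_ls_tree(output):
    """Parse `git ls-tree -r -z` output (bytes) into (path, blob_hash) pairs."""
    entries = []
    # With -z each record is "<mode> <type> <hash>\t<path>" terminated by NUL.
    for record in output.split(b"\x00"):
        if not record:
            continue
        meta, path = record.split(b"\t", 1)
        mode, obj_type, obj_hash = meta.split(b" ")
        if obj_type == b"blob":  # skip submodule ("commit") entries
            entries.append((
                path.decode("utf-8", errors="surrogateescape"),
                obj_hash.decode("ascii"),
            ))
    return entries


def list_tree(repo_path, tree_hash):
    """Let git recurse the whole tree in a single subprocess."""
    out = subprocess.check_output(
        ["git", "ls-tree", "-r", "-z", tree_hash], cwd=repo_path)
    return parse_ls_tree(out)


def unique_file_versions(tree_listings):
    """Merge many tree listings into a sorted list of distinct (path, hash)."""
    seen = set()
    for listing in tree_listings:
        seen.update(listing)
    return sorted(seen)
```

Most files are unchanged between consecutive commits, so the same
(path, hash) pair shows up in many listings; the set absorbs that
repetition at the cost of extra work on the Python side.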

Rich

Subject	Author
Re: [gentoo-scm] Git Conversion Validation	Peter Stuge <peter@×××××.se>