Gentoo Archives: gentoo-scm

From: Rich Freeman <rich0@g.o>
To: gentoo-scm@l.g.o
Subject: [gentoo-scm] Git Conversion Validation
Date: Sun, 07 Oct 2012 22:23:29
Message-Id: CAGfcS_=hPO_AcY8pAwC70x-C0AbSUUFxKi7PpyRsQk1iGsLiGg@mail.gmail.com

FYI - I started a repository of my git validation work at:
git://github.com/rich0/gitvalidate.git

I'm starting on the git side first. I'm taking all my data directly
from the git executables and plan to do the same for cvs - if they
output the same content we should be OK. I did some testing and I
think that my code should handle unicode output if git generates it.

The git repository has 1259922 commits, and it takes 50.5 seconds to
walk the list of commits to produce a list of trees and their commit info.
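
A minimal sketch of such a walk, assuming a single `git log` process is
used and its output is parsed in python. This is not the code from the
gitvalidate repository; the function names, the choice of fields, and the
NUL field separator are illustrative assumptions.

```python
import subprocess

# Hypothetical record layout: commit hash, tree hash, author name, and
# author date (Unix time), separated by NUL bytes (%x00).
LOG_FORMAT = "%H%x00%T%x00%an%x00%at"


def parse_log_line(line):
    """Split one line of `git log --format=LOG_FORMAT` output into a dict."""
    commit, tree, author, timestamp = line.split("\x00")
    return {
        "commit": commit,
        "tree": tree,
        "author": author,
        "time": int(timestamp),
    }


def walk_commits(repo_path):
    """Yield commit info for every commit reachable from any ref.

    Only one git process is spawned for the whole walk.
    """
    proc = subprocess.Popen(
        ["git", "log", "--all", "--format=" + LOG_FORMAT],
        cwd=repo_path,
        stdout=subprocess.PIPE,
    )
    for raw in proc.stdout:
        # Decode leniently so that a non-UTF-8 author name does not abort
        # the walk; invalid bytes become U+FFFD.
        line = raw.decode("utf-8", errors="replace").rstrip("\n")
        if line:
            yield parse_log_line(line)
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError("git log exited with an error")
```

Calling `list(walk_commits("/path/to/repo"))` would collect one dict per
commit; the path is a placeholder.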

Next step is to iteratively perform the map / reduce algorithm I
outlined earlier to get a per-file history similar to what cvs
captures.

Contributions welcome. I'm finding the main issue is cutting down the
overhead of spawning git processes to do the work. While it will make
for more work in theory, I might just have git-ls-tree recurse the
trees to reduce the subprocess overhead, and then do the extra
sorting/de-duplication in python. I'm trying to avoid using git
implementations in python since that might expose us to bugs.
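
The trade-off above can be sketched as follows, assuming one
`git ls-tree -r -z` call per commit tree, with the repeated entries for
unchanged files removed afterwards in python. The function names are
hypothetical and this is not taken from the gitvalidate repository.

```python
import subprocess


def parse_ls_tree(output):
    """Parse the bytes printed by `git ls-tree -r -z` into (path, hash) pairs.

    Each NUL-terminated record has the form: <mode> <type> <hash>TAB<path>
    """
    entries = []
    for record in output.split(b"\x00"):
        if not record:
            continue
        meta, path = record.split(b"\t", 1)
        mode, objtype, objhash = meta.split(b" ")
        if objtype == b"blob":  # skip submodule ("commit") entries
            entries.append(
                (path.decode("utf-8", errors="replace"), objhash.decode("ascii"))
            )
    return entries


def list_tree(repo_path, tree_hash):
    """Recursively list one tree with a single git process."""
    out = subprocess.check_output(
        ["git", "ls-tree", "-r", "-z", tree_hash], cwd=repo_path
    )
    return parse_ls_tree(out)


def unique_file_versions(listings):
    """Merge per-tree listings into a sorted, de-duplicated list.

    A file that is unchanged across many commits shows up once per tree;
    the set collapses those repeats.
    """
    seen = set()
    for entries in listings:
        seen.update(entries)
    return sorted(seen)
```

The recursion reports every file in every tree, so the listings are much
larger than a per-directory walk that stops at unchanged subtrees; the
de-duplication step is what pays for the reduced process count.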

Rich
