Gentoo Archives: gentoo-scm

From: Rich Freeman <rich0@g.o>
To: gentoo-scm@l.g.o
Subject: Re: [gentoo-scm] Git Conversion Validation
Date: Mon, 08 Oct 2012 02:11:34
Message-Id: CAGfcS_mzVug9mjtWyCgihxPkzB6VXKBAh0CbCgXc+mOPDff4ng@mail.gmail.com
In Reply to: Re: [gentoo-scm] Git Conversion Validation by Peter Stuge
On Sun, Oct 7, 2012 at 6:37 PM, Peter Stuge <peter@×××××.se> wrote:
> Rich Freeman wrote:
>> I'm trying to avoid using git implementations in python since that
>> might expose us to bugs.
>
> Take a look at libgit2+pygit2.

Well, my goal was to try to stick to the output of the official
commands, figuring that this is essentially the standard to go by. My
understanding is that subtle problems with character encodings and
such were found in past conversion efforts. If unusual characters
are being modified by the conversion program I want to avoid the
verification program making the same mistake and therefore obscuring
the problem.

That said, spawning git several million times is looking to be REALLY
slow, so I think I might bite the bullet and use a library. It seems
like pygit2 is designed to use Unicode for everything.

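For reference, one way to stay with the official commands without
paying for a process per object is a single long-running
`git cat-file --batch`, which reads object names on stdin and answers
with "<sha> <type> <size>", a newline, the raw content, and another
newline. The following is only a minimal sketch of parsing that
stream; the function names are hypothetical and are not taken from
the repository discussed in this thread. Content stays as bytes so
that nothing gets re-encoded along the way.

```python
import io
import subprocess


def read_batch_objects(stream):
    """Yield (sha, type, content) tuples from `git cat-file --batch` output.

    `stream` is any binary file-like object. Content is returned as raw
    bytes; a missing object is reported as (name, "missing", None).
    """
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        fields = header.rstrip(b"\n").split(b" ")
        if len(fields) == 2 and fields[1] == b"missing":
            yield fields[0].decode("ascii"), "missing", None
            continue
        sha, obj_type, size = fields
        content = stream.read(int(size))
        stream.read(1)  # consume the newline that follows the content
        yield sha.decode("ascii"), obj_type.decode("ascii"), content


def open_batch(repo_path):
    """Start one `git cat-file --batch` process for the given repository."""
    return subprocess.Popen(
        ["git", "-C", repo_path, "cat-file", "--batch"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )


def fetch_object(proc, sha):
    """Request a single object from a running batch process."""
    proc.stdin.write(sha.encode("ascii") + b"\n")
    proc.stdin.flush()
    return next(read_batch_objects(proc.stdout))
```

Requesting one object at a time, as fetch_object does, avoids filling
the pipe buffers in both directions at once. Whether this is fast
enough for several million objects would have to be measured.
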
And of course the risk that pygit2/etc has bugs really isn't
necessarily greater than the risk that my own stuff has bugs (though
knowing my intended use I can probably minimize the ones that count -
the logic really is simple).

The repository currently contains what should be a working
implementation (though it doesn't write the final list out to disk).
It is just WAY too slow to run (hence the command line parameter to
limit the number of commits examined).

Pretty busy for a few days, but I'll convert the git spawning and
output parsing to pygit2 calls. As an added bonus I don't have to
deal with the fact that git just LOVES to mangle its output to be
pleasing to eyes and less so to robots.

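As an illustration of the output problem: by default git may quote
and escape paths containing unusual bytes, while the `-z` option of
`git ls-tree` emits each path verbatim and terminates records with
NUL. Each record has the form "<mode> <type> <sha>", a tab, and the
path. Below is a minimal sketch of parsing that format, assuming the
caller wants paths left as bytes; the function names are hypothetical
and this is not the code from the repository mentioned above.

```python
import subprocess


def parse_ls_tree_z(raw):
    """Parse the bytes produced by `git ls-tree -r -z <rev>`.

    Returns a list of (mode, type, sha, path) tuples. The path is kept
    as bytes so that no character in it is re-encoded or altered.
    """
    entries = []
    for record in raw.split(b"\0"):
        if not record:
            continue  # skip the empty piece after the final NUL
        # Split on the first tab only: the path itself may contain tabs.
        meta, path = record.split(b"\t", 1)
        mode, obj_type, sha = meta.split(b" ")
        entries.append(
            (mode.decode("ascii"), obj_type.decode("ascii"),
             sha.decode("ascii"), path)
        )
    return entries


def list_tree(repo_path, rev):
    """Run git once for a revision and return its parsed tree entries."""
    result = subprocess.run(
        ["git", "-C", repo_path, "ls-tree", "-r", "-z", rev],
        stdout=subprocess.PIPE,
        check=True,
    )
    return parse_ls_tree_z(result.stdout)
```

Comparing the byte paths from two sources directly, without decoding
them, keeps the check independent of any encoding assumptions.
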
Rich