Gentoo Archives: gentoo-scm

From: "Robin H. Johnson" <robbat2@g.o>
To: gentoo-scm@l.g.o
Subject: [gentoo-scm] Progress summary, 2009/06/01
Date: Mon, 01 Jun 2009 23:46:32
Update on this TODO list, and extension for new items.
I've deliberately broken the threading so that more people read it. This
was a response to the mail with the subject of "Converting a recent CVS
copy - Item 2: statistics"

Executive summary:
- We've gone from 18.5 hours to 9 hours, all in a single portion of the
  conversion, thanks to help from upstream. There's lots more room for
  improvement.
- C (for git) and Python (for cvs2svn) coders very welcome to challenge
  the problems.
- Testers wanted.
- Formally setting this up as a project with a team is probably due
  soon. I've had interest/direct offers of help from: WilliamH, Calchan,
  Betelgeuse over the last week.
  If you want something, pick it from this mail, and try to flesh it out
  on the list with me while you work on it.

New TODO items:
- Finish new hooks for git: 
  upload-pack.c - get_common_commits:
  right before the 'return 0', check for and launch a new hook, passing
  the have/want headers via stdin, and use the return code (AND the
  stderr) to see if we should halt.
- Review commit signing
  - pclouds (a former Gentoo dev) contributed this prototype:
  - I'm not entirely convinced the above is right, as the commit message
    seems to end up unsigned.
  - Wait for the commit-notes patches to land in upstream Git?
- Test git-cvsserver usage.
  - Both remote and local modes.
  - mips and narrow checkouts may want this.

On Tue, Apr 14, 2009 at 01:33:24AM -0700, Robin H. Johnson wrote:
> TODO:
> - Could somebody with Python-foo please look hard at cvs2svn with an eye to
>   making it multi-threaded?
>   - Focus on pass1 and pass9.
>   - pass8 maybe as well, but I think it will be harder by design.
- pass1 optimization was completed 20 minutes ago by mhagger (one of the
  cvs2svn upstream lead developers). From an old time of 36204 seconds,
  it's now only 1598 seconds (and we only used 2 CPU cores so far, we have
  6 more for later). 22x speedup :-).
- mhagger from upstream needed hardware to test on, so I hooked him up with
  access to the experimental conversion box.
- TODO: pass9/pass8 remain.
- Need to validate output of new mode against the previous mode.
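
As a quick check of the pass1 figures above, the arithmetic can be reproduced
in a few lines. The variable names are illustrative; only the two timings come
from this mail. The time saved is roughly in line with the 18.5 hour to 9 hour
drop quoted in the executive summary.

```python
# pass1 timings quoted in this mail, in seconds
OLD_PASS1_SECONDS = 36204
NEW_PASS1_SECONDS = 1598

speedup = OLD_PASS1_SECONDS / NEW_PASS1_SECONDS
saved_hours = (OLD_PASS1_SECONDS - NEW_PASS1_SECONDS) / 3600

print(f"pass1 speedup: {speedup:.1f}x")                   # pass1 speedup: 22.7x
print(f"wall-clock time saved: {saved_hours:.1f} hours")  # wall-clock time saved: 9.6 hours
```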
> - We need incremental conversion stuff badly.
Incremental conversion may become practically unnecessary if we can get the conversion time under 2 hours.
> - I had to use the RCSRevisionReader, as InternalRevisionReader seemed to be
>   broken. Would make pass1 faster as well.
This is now completely unneeded. The pass1 solution integrated the rcsparse code into the new
> - Should probably ignore the '.frozen' files.
There is presently no support for excluding files.
TODO: Implement inside def _generate_cvs_files
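
To illustrate the kind of filter this TODO has in mind, here is a standalone
sketch. It is not cvs2svn code and does not reflect the real signature of
_generate_cvs_files; the function name iter_rcs_files, the exclude_patterns
parameter and the glob-style matching are all hypothetical.

```python
import fnmatch
import os


def iter_rcs_files(root, exclude_patterns=(".frozen",)):
    """Yield paths of RCS ",v" files under root, skipping excluded names.

    Patterns are matched against the file name with the ",v" suffix
    removed, so ".frozen" excludes ".frozen,v" in any directory
    (including Attic).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a deterministic order
        for name in sorted(filenames):
            if not name.endswith(",v"):
                continue  # not an RCS file
            base = name[:-2]
            if any(fnmatch.fnmatch(base, pat) for pat in exclude_patterns):
                continue  # excluded, e.g. '.frozen'
            yield os.path.join(dirpath, name)
```

Matching on the name without the ",v" suffix keeps the patterns the same as
the names developers see in a checkout.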
> - Review RCS state of ALL ,v files. There are a few non-dead files in Attic.
Thanks to William Hubbs (williamh) for his review. I've got a few more items to process from this review.
> - Maybe trim out the Manifest/digest contents during the conversion, leaving
>   only DIST lines?
>   Con: _WOULD_ break old GPG signatures.
>   Pro: probably help size a lot.
Additional con:
- I think it will massively slow down the conversion right now.
- Nothing more than a bad prototype I did. I'd like somebody else to attack
  the problem without having seen my prototype, rather than be infected with
  my bad ideas in it.

Upstream misc:
- Subtree checkouts
  - Is progressing upstream. Now known as checkout modes: narrow, sparse,
    shallow
    = narrow: some directory that is not the root.
    = sparse: a subset of files in a directory.
    = shallow: subset of recent history.

--
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail   : robbat2@g.o
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85