This is an update on the TODO list, extended with new items.
I've deliberately broken the threading so that more people read it. This
is a response to the mail with the subject of "Converting a recent CVS
copy - Item 2: statistics"
- We've gone from 18.5 hours to 9 hours, all in a single portion of the
conversion, thanks to help from upstream. There's lots more room for
improvement.
- C (for git) and Python (for cvs2svn) coders very welcome to challenge
- Testers wanted
- Formally turning this into a project with a team is probably due
soon. I've had interest/direct offers of help from: WilliamH, Calchan,
Betelgeuse over the last week.
If you want something, pick it from this mail, and try to flesh it out
on the list with me while you work on it.
New TODO items:
- Finish new hooks for git:
upload-pack.c - get_common_commits:
right before the 'return 0', check for and launch a new hook, passing
the have/want headers via stdin, and use the return code (AND the
stderr) to see if we should halt.
- Review commit signing
- pclouds (a former Gentoo dev) contributed this prototype:
- I'm not entirely convinced the above is right, as the commit message
seems to end up unsigned.
- Wait for the commit-notes patches onto upstream Git?
- Test git-cvsserver usage.
- Both remote and local modes.
- mips and narrow checkouts may want this.
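For the upload-pack hook item above, here is a minimal sketch of what the
script on the hook side might look like. Everything in it is an assumption:
no such hook exists in Git today, the input format (one "want <id>" or
"have <id>" line per object on stdin) is invented for illustration, and the
policy in decide() is only an example of something a hook could enforce.

```python
import sys


def parse_negotiation(lines):
    """Split 'want <id>' / 'have <id>' lines into two lists.

    The line format is an assumption; the real hook would define it.
    """
    wants, haves = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        kind, obj = parts
        if kind == "want":
            wants.append(obj)
        elif kind == "have":
            haves.append(obj)
    return wants, haves


def decide(wants, haves):
    """Return (exit_code, message_for_stderr).

    Example policy only: refuse a fetch that shares no history with us.
    """
    if wants and not haves:
        return 1, "refusing fetch: client sent no 'have' lines"
    return 0, ""


def main():
    # upload-pack would feed the headers on stdin and read back both the
    # exit code and whatever was written to stderr.
    wants, haves = parse_negotiation(sys.stdin)
    code, message = decide(wants, haves)
    if message:
        sys.stderr.write(message + "\n")
    return code
```

A real hook script would end with sys.exit(main()), so that upload-pack
sees a non-zero exit code when the fetch should be halted.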
On Tue, Apr 14, 2009 at 01:33:24AM -0700, Robin H. Johnson wrote:
> - Could somebody with Python-foo please look hard at cvs2svn with an eye to
> making it multi-threaded?
> - Focus on pass1 and pass9.
> - pass8 maybe as well, but I think it will be harder by design.
- pass1 optimization was completed 20 minutes ago by mhagger (one of the
cvs2svn upstream lead developers). From an old time of 36204 seconds,
it's now only 1598 seconds (and we only used 2 CPU cores so far, we
have 6 more for later). 22x speedup :-).
- mhagger from upstream needed hardware to test on, so I hooked him up
with access to the experimental conversion box.
- TODO: pass9/pass8 remain.
- Need to validate output of new mode against the previous mode.
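One possible approach to the validation item, as a rough sketch: assuming
each mode leaves us with a git repository, compare the tree that every
branch tip points at. The function names and the idea of comparing only
branch tips are my assumptions for illustration; a full check would also
need to walk the history and the tags.

```python
import subprocess


def branch_trees(git_dir):
    """Map each branch in the repository at git_dir to its tip's tree id."""
    out = subprocess.check_output(
        ["git", "--git-dir=" + git_dir, "for-each-ref",
         "--format=%(refname) %(tree)", "refs/heads"])
    return dict(line.split() for line in out.decode().splitlines() if line)


def compare(old, new):
    """Return (missing, extra, changed) branch names.

    old and new are branch -> tree id maps as built by branch_trees().
    """
    missing = sorted(set(old) - set(new))
    extra = sorted(set(new) - set(old))
    changed = sorted(ref for ref in set(old) & set(new)
                     if old[ref] != new[ref])
    return missing, extra, changed
```

Usage would be along the lines of
compare(branch_trees(old_repo), branch_trees(new_repo)), with all three
lists expected to come back empty if the two modes agree.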
> - We need incremental conversion stuff badly.
Incremental may become practically unneeded if we can get the conversion
time under 2 hours.
> - I had to use the RCSRevisionReader, as InternalRevisionReader seemed to be
> broken. Would make pass1 faster as well.
This is now completely unneeded. The pass1 solution integrated the
rcsparse code into the new generate_blobs.py.
> - Should probably ignore the '.frozen' files.
There is no support for excluding files at present.
TODO: Implement inside _generate_cvs_files().
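A minimal sketch of the kind of filter that could be applied there. This is
standalone code, not a patch against cvs2svn: is_excluded() and
filter_rcs_files() are hypothetical names, and I'm assuming the files in
question are stored as RCS files whose name is exactly ".frozen,v".

```python
import os


def is_excluded(rcs_path, excluded_names=(".frozen",)):
    """True if the RCS file at rcs_path stores a file we want to skip."""
    name = os.path.basename(rcs_path)
    if name.endswith(",v"):
        name = name[:-2]  # strip the RCS ",v" suffix
    return name in excluded_names


def filter_rcs_files(paths):
    """Yield only the RCS paths that should go into the conversion."""
    for path in paths:
        if not is_excluded(path):
            yield path
```

Keeping the name test in its own function means the list of excluded names
can grow later without touching the directory walk.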
> - Review RCS state of ALL ,v files. There are a few non-dead files in Attic.
Thanks to William Hubbs (williamh) for his review. I've got a few more
items to process from this review.
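For anybody who wants to repeat the Attic check, here is a rough sketch. It
reads only the head revision and its state out of each ,v file with a
regular expression, which is an approximation and not a real RCS parser,
and the function names are made up for this mail.

```python
import os
import re

HEAD_RE = re.compile(r"^head\s+([0-9.]+);", re.MULTILINE)


def head_state(rcs_text):
    """Return the state of the head revision, or None if not found."""
    head = HEAD_RE.search(rcs_text)
    if head is None:
        return None
    # The delta entry looks like: "<rev>\ndate ...; author ...; state ...;"
    delta = re.search(
        r"^" + re.escape(head.group(1)) +
        r"\s+date\s+[^;]+;\s*author\s+[^;]+;\s*state\s*([^;\s]*);",
        rcs_text, re.MULTILINE)
    return delta.group(1) if delta else None


def live_attic_files(root):
    """Yield (path, state) for Attic files whose head is not dead."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) != "Attic":
            continue
        for name in filenames:
            if not name.endswith(",v"):
                continue
            path = os.path.join(dirpath, name)
            # latin-1 never fails to decode, whatever the file contains
            with open(path, encoding="latin-1") as handle:
                state = head_state(handle.read())
            if state != "dead":
                yield path, state
```

Files for which the state could not be parsed are reported with a state of
None, so they get a manual look as well.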
> - Maybe trim out the Manifest/digest contents during the conversion, leaving
> only DIST lines?
> Con: _WOULD_ break old GPG signatures.
> Pro: probably help size a lot.
- I think it will massively slow down the conversion right now.
- Nothing more than a bad prototype I did. I'd like somebody else to
attack the problem without having seen my prototype, rather than be
infected with my bad ideas in it.
- Subtree checkouts
- Is progressing upstream. Now known as checkout modes: narrow, sparse,
shallow.
= narrow: some directory that is not the root.
= sparse: a subset of files in a directory.
= shallow: subset of recent history.
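On the Manifest trimming idea further up: the per-file transformation
itself is small. A minimal sketch, assuming that every entry we want to
keep starts with the literal "DIST " at the beginning of the line;
keep_dist_lines() is a hypothetical helper and is not hooked into the
conversion in any way.

```python
def keep_dist_lines(manifest_text):
    """Return the Manifest content with only its DIST entries left."""
    kept = [line for line in manifest_text.splitlines()
            if line.startswith("DIST ")]
    # Re-terminate every kept line so the output stays line-oriented.
    return "".join(line + "\n" for line in kept)
```

Note that this also drops any signature armor around the content, which is
the "Con" from the quoted item.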
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail : email@example.com
GnuPG FP : 11AC BA4F 4778 E3F6 E4ED F38E B27B 944E 3488 4E85