Hi Brian & everyone,
I like the beginnings of this idea, in terms of what it could lead to. But before you start coding (that's not an official sanction, just me speaking :P), there are a few concerns. I've been talking with John extensively about his deltup package (which incidentally is in portage already).
The first test I did was on two ~500K tarballs (zsh), which produced a dtu of ~400K. The second test I did was on ~17MB tarballs, which produced a dtu of 184K. Now *THAT* knocked my socks off. I'm still looking for them; they flew so far.
So, tonight John and I were talking about this very proposal (and a few nights ago I was thinking on this idea as well, except my thoughts were towards an md5sum of the uncompressed directory (didn't know and still don't know if that's even possible)). However, your approach has merit over mine.
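For what it's worth, here is a rough Python sketch of what an md5sum over an unpacked directory might look like. This is purely illustrative and not anything Portage or deltup does: the function name md5_of_tree is made up, and the choice to hash relative paths plus file contents in sorted order (while ignoring permissions, symlinks, and empty directories) is an assumption of mine.

```python
import hashlib
import os


def md5_of_tree(root):
    """Return one md5 hex digest covering every regular file under root.

    Hypothetical helper: it hashes relative paths and file contents in a
    fixed (sorted) order so that the same tree always gives the same sum.
    """
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place makes os.walk visit subdirs in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            # Assumption: symlinks and special files are skipped, not hashed.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8") + b"\0")
            with open(full, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()
```

So it does look possible, with the caveat that the tree has to be unpacked before anything can be checked.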
Now, the promised concern bit. Unfortunately, while the majority of packages do come in a compressed tarball format, there are enough that do not to make this a corner case of some concern. Off the top of my head, I can think of .Z (forget which package), .rpm (redhat-artwork), and .bin (realplayer). And in some cases, we just get an uncompressed README file in the SRC_URI (or the wacom.c file in xfree, though I'm not certain of it right this moment).
Anyway, the current approach keeps it simple in that the md5sum is taken of the *item(s) that is/are downloaded*. The first reason I can see for this is the one I stated above: not everything we fetch is a compressed tarball. There are other reasons as well. You know immediately upon fetching the set of source items whether they are bad. So no disk i/o or cpu cycles are spent on unpacking, and no potentially nasty code is even untarred on the system yet.
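To make that ordering concrete, here is a minimal Python sketch of "check the downloaded item first, unpack only if it matches". It is not Portage's actual code; the names md5_of_file and safe_to_unpack are hypothetical, and the expected sum is assumed to come from wherever the known-good digest is recorded.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the md5 hex digest of the file exactly as it was downloaded."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_to_unpack(path, expected_md5):
    """Hypothetical gate: True only if the fetched file matches the recorded sum.

    Nothing is decompressed or untarred here, so a bad download is
    rejected before any of its contents touch the system.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

Because the file is treated as opaque bytes, the same check works for a .tar.gz, a .Z, an .rpm, a .bin, or a bare README.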
I'm not terribly technically inclined, but I am certain that Daniel or Nicholas (carpaski) can fill in the holes of my reasoning.
So, please understand that I am not shooting your idea down at all. I really do like it, I am definitely a fan of deltup, and I would like to see it integrated as an official thingywhatsit. However, we must think about these concerns first.
Developer and Project Co-ordinator,
Gentoo Linux http://www.gentoo.org/~seemant
Public Key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x3458780E
Key fingerprint = 23A9 7CB5 9BBB 4F8D 549B 6593 EDA2 65D8 3458 780E