On Thursday 12 June 2003 02:56 am, Seemant Kulleen wrote:
> Anyway, the current approach keeps it simple in that the md5sum is off the
> *item(s) that is/are downloaded*. The first reason I can see is what I
> stated above. There are other reasons I can see as well. You know,
> immediately upon fetching the set of source items that they are bad. So,
> no disk i/o or cpu cycles are spent in the unpacking; and no potentially
> nasty code is even untarred on the system, yet.
Well, I can think of a way to address this part of the issue, anyway.
* The portage tree still has MD5 digests for the item(s) which is/are
downloaded.
* After downloading, emerge executes a (user specified?) program on the newly
downloaded file. This program applies some transform to the file; maybe it
decompresses whatever format the file is in and re-compresses it with bzip2,
or maybe it only format-shifts files which are over a certain size threshold.
* Next, emerge adds a new record to a database (text, one record per line, for
example) somewhere in /var. This database has the original name of the
downloaded file, the original MD5 digest, the new name, and the new MD5
digest.
* When emerge wants a new file, it checks the database to see if the desired
file has been mapped to a new transformed name and MD5.
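
To make the steps above concrete, here is a rough sketch in Python. It is only
an illustration, not how emerge works today: the function names (md5_of,
transform, fetch_hook, lookup), the tab-separated record layout, and the
choice of a gzip-to-bzip2 transform are all assumptions of mine.

```python
import bz2
import gzip
import hashlib
import os

def md5_of(path):
    """Return the hex MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def transform(src_path):
    """Hypothetical transform: recompress foo.tar.gz as foo.tar.bz2."""
    if not src_path.endswith(".gz"):
        return src_path  # leave other formats untouched
    dst_path = src_path[:-len(".gz")] + ".bz2"
    with gzip.open(src_path, "rb") as src, bz2.open(dst_path, "wb") as dst:
        for chunk in iter(lambda: src.read(65536), b""):
            dst.write(chunk)
    return dst_path

def fetch_hook(db_path, src_path, expected_md5):
    """Verify the download, transform it, and append one record to the database."""
    orig_md5 = md5_of(src_path)
    if orig_md5 != expected_md5:
        # Bad download: stop before anything is unpacked or transformed.
        raise ValueError("MD5 mismatch for " + src_path)
    new_path = transform(src_path)
    # Record layout (assumed): original name, original MD5, new name, new MD5.
    record = [os.path.basename(src_path), orig_md5,
              os.path.basename(new_path), md5_of(new_path)]
    with open(db_path, "a") as db:
        db.write("\t".join(record) + "\n")
    return new_path

def lookup(db_path, orig_name):
    """Return (new_name, new_md5) for orig_name, or None if it is not mapped."""
    if not os.path.exists(db_path):
        return None
    with open(db_path) as db:
        for line in db:
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 4 and fields[0] == orig_name:
                return fields[2], fields[3]
    return None
```

Note that the original digest is still checked against the downloaded file
before the transform runs, so the property Seemant describes (bad downloads
are caught before anything is unpacked) is preserved.
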
You'd probably want to make the digest file readable only by the portage user.
With infrastructure like this you could even add more interesting
functionality to portage pretty easily. Like maybe the transform program
uploads the file to the corporate internal FTP mirror, and the database maps
the original name to the URI which locates it.
Or, if emerge exported sufficient context to the transform program, you could
fix the case where a particular braindead package is available only as
package.tar.gz, not package-version.tar.gz. The transform program would add
the version to the filename, and the database would be allowed to have
multiple entries for each original file name (provided they had different MD5
digests). Then emerge would just pick the record with both the desired
original name and the desired original MD5 digest.
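
A small sketch of that selection, again in Python and again purely
illustrative: versioned_name, parse_records and select_record are names I made
up, the record layout is the same assumed tab-separated one (original name,
original MD5, new name, new MD5), and the list of archive suffixes is just a
guess at what a transform program might handle.

```python
# Archive suffixes the hypothetical transform program knows how to split.
KNOWN_SUFFIXES = (".tar.gz", ".tar.bz2", ".tgz", ".zip")

def versioned_name(orig_name, version):
    """Turn package.tar.gz plus a version into package-version.tar.gz."""
    for suffix in KNOWN_SUFFIXES:
        if orig_name.endswith(suffix):
            return orig_name[:-len(suffix)] + "-" + version + suffix
    # Unknown suffix: fall back to appending the version at the end.
    return orig_name + "-" + version

def parse_records(lines):
    """Parse tab-separated database lines into 4-field tuples, skipping bad lines."""
    records = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 4:
            records.append(tuple(fields))
    return records

def select_record(records, orig_name, orig_md5):
    """Pick the record matching both the original name and the original MD5."""
    for record in records:
        if record[0] == orig_name and record[1] == orig_md5:
            return record
    return None
```

Because the lookup key is the (name, MD5) pair rather than the name alone, two
different upstream releases that share the name package.tar.gz can sit side by
side in the database without colliding.
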