Gentoo Archives: gentoo-dev

From: Marius Mauch <genone@g.o>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] Multi hash support in portage - status
Date: Thu, 24 Nov 2005 00:07:27
Message-Id: 20051124010432.33eecead@sven.genone.homeip.net

So, along with the gpg signing stuff, the question of having multiple
hash formats in digests and Manifests has come up again.

Current status is that portage only generates MD5 checksums and can
verify both MD5 and SHA1 checksums. Creation of SHA1 is also possible
but has so far been disabled as older portage versions would break if
they found a non-MD5 line in digest files (this was fixed somewhere
last year in the .51 series).

OK, I have three modifications that are pending to go into portage:
- The first simply enables creation of SHA1 checksums (and others, if
implemented as in the second mod). If you want to try it yourself,
see the attached patch.
- Another thing that has often been requested is support for even more
hashing functions. Earlier today I sent a patch to the
gentoo-portage-dev list that adds optional support for SHA256 and
RMD160 if dev-python/pycrypto is installed on the system.
- The last and most intrusive change is support for a new enhanced
Manifest format (called Manifest2 for now). Don't worry, there will be
a GLEP and more info before this gets added; I just list it here for
reference.
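
As a rough illustration of what "multi hash" means in practice, here is a
minimal sketch that computes several checksums of a file in one pass. It
is not the portage code or the attached patch: it uses the hashlib module
from current Python, and the function name multi_hash and its parameters
are made up for this example. RMD160 is left out because its
availability in hashlib depends on the underlying OpenSSL build.

```python
import hashlib


def multi_hash(path, names=("md5", "sha1", "sha256"), chunk_size=65536):
    """Return (size, {NAME: hexdigest}) for a file, reading it only once.

    Hypothetical helper for illustration; not portage's actual API.
    """
    # One hash object per requested algorithm, keyed by the upper-case
    # name as it would appear in a digest or Manifest entry.
    hashers = {name.upper(): hashlib.new(name) for name in names}
    size = 0
    with open(path, "rb") as f:
        # Feed every chunk to all hashers so the file is read only once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            size += len(chunk)
            for h in hashers.values():
                h.update(chunk)
    return size, {name: h.hexdigest() for name, h in hashers.items()}
```

Calling multi_hash("metadata.xml") would return the file size together
with a dict such as {"MD5": ..., "SHA1": ..., "SHA256": ...}.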

The first two changes are ready to be added and deployed (in .54), so
without looking at the third one it would be a no-brainer. But of
course there is a drawback: the current Manifest/digest format is quite
inefficient wrt storing multiple checksums, as it repeats the filename
and filesize for every checksum added. It looks like this:

MD5 82e806ed62f0596fb7bef493d225712f metadata.xml 269
RMD160 39d775de55f9963f8946feaf088aa0324770bacb metadata.xml 269
SHA1 4fd7b285049d0e587f89e86becf06c0fd77bae6d metadata.xml 269
SHA256 3787959f4a775b1e787b35ff8380949d8f68bd67b81c2cf5a748733c9740cb94 metadata.xml 269

The Manifest2 format solves this problem (and has some other benefits)
by listing all checksums on one line:

MISCFILE metadata.xml 269 RMD160 39d775de55f9963f8946feaf088aa0324770bacb SHA1 4fd7b285049d0e587f89e86becf06c0fd77bae6d SHA256 3787959f4a775b1e787b35ff8380949d8f68bd67b81c2cf5a748733c9740cb94 MD5 82e806ed62f0596fb7bef493d225712f
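
To make the difference between the two layouts shown above concrete,
here is a small sketch that writes the same checksums in both styles and
reads a Manifest2-style line back. The function names are hypothetical,
the field order is simply taken from the examples in this mail (the
final Manifest2 syntax is up to the GLEP), and filenames containing
whitespace are not handled.

```python
def old_format_lines(filename, size, hashes):
    """One line per checksum; filename and size are repeated every time."""
    return ["%s %s %s %d" % (name, digest, filename, size)
            for name, digest in hashes.items()]


def manifest2_line(filetype, filename, size, hashes):
    """A single line: type, filename, size, then NAME DIGEST pairs."""
    parts = [filetype, filename, str(size)]
    for name, digest in hashes.items():
        parts += [name, digest]
    return " ".join(parts)


def parse_manifest2_line(line):
    """Split a Manifest2-style line back into its components."""
    fields = line.split()
    filetype, filename, size = fields[0], fields[1], int(fields[2])
    rest = fields[3:]
    # Remaining fields alternate between hash name and hex digest.
    hashes = dict(zip(rest[0::2], rest[1::2]))
    return filetype, filename, size, hashes
```

With the metadata.xml checksums from above, old_format_lines() yields
four lines while manifest2_line() yields one, and parse_manifest2_line()
recovers the filename, size, and checksum dict from that single line.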

Not much of a difference, you might say, but this is just looking at a
pure Manifest2 entry. To keep compatibility with existing portage
versions we have to list both the old format and the new format in the
Manifest (digests are handled differently with Manifest2, but the
concept also applies to them), which can potentially increase the tree
size by ~10% (at a guess). I'm talking about actual data size here, not
required disk space.
And before you ask "why Manifest2 if it adds this overhead?",
the main point isn't the new format but a long-term reduction of the
tree size by removing the digest files (but wait for the GLEP to discuss
this).
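
For the single metadata.xml entry used as the example above, the
per-entry data size of each option can be counted with a few lines of
Python. This is only a sketch for one entry with four hashes; it says
nothing about the tree-wide ~10% figure, which remains a guess. The
variable names are made up for this example.

```python
FILENAME, SIZE = "metadata.xml", 269
HASHES = {
    "RMD160": "39d775de55f9963f8946feaf088aa0324770bacb",
    "SHA1": "4fd7b285049d0e587f89e86becf06c0fd77bae6d",
    "SHA256": "3787959f4a775b1e787b35ff8380949d8f68bd67b81c2cf5a748733c9740cb94",
    "MD5": "82e806ed62f0596fb7bef493d225712f",
}


def nbytes(lines):
    """Size of the lines in bytes, counting one newline per line."""
    return sum(len(line) + 1 for line in lines)


# Current format: one line per hash, filename and size repeated.
old_lines = ["%s %s %s %d" % (name, digest, FILENAME, SIZE)
             for name, digest in HASHES.items()]
# Manifest2-style: everything for the file on one line.
manifest2 = " ".join(["MISCFILE", FILENAME, str(SIZE)]
                     + [field for pair in HASHES.items() for field in pair])

md5_only_bytes = nbytes(["MD5 %s %s %d" % (HASHES["MD5"], FILENAME, SIZE)])
old_multi_bytes = nbytes(old_lines)
manifest2_bytes = nbytes([manifest2])
both_bytes = old_multi_bytes + manifest2_bytes  # transition period

print("MD5 only, current format:   ", md5_only_bytes)
print("four hashes, current format:", old_multi_bytes)
print("four hashes, Manifest2 only:", manifest2_bytes)
print("both formats side by side:  ", both_bytes)
```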

So much for background information, now to the actual question:
Would you rather have the ability now to create multi-hash digests and
Manifests, at the cost of a larger portage tree in the short and mid
term (in the long term the current format will hopefully be phased
out), or rather wait for Manifest2 support (which will definitely
include multi-hash support)?

Basically I'm just getting some feedback before adding it, rather than
getting the complaints about bloating the tree later ;)

Note that this is (technically) completely unrelated to gpg signing of
Manifests, so any gpg related bitching doesn't belong here.
Generally only reply here if you're replying to the question I posted
(implementation discussions belong on gentoo-portage-dev, and the
quota for whining/trolling/flaming for this month was already exceeded).

Marius

-- 
Public Key at http://www.genone.de/info/gpg-key.pub

In the beginning, there was nothing. And God said, 'Let there be
Light.' And there was still nothing, but you could see a bit better.

Attachments

File name MIME type
signature.asc application/pgp-signature