Gentoo Archives: gentoo-portage-dev

From: Brian Harring <ferringb@×××××.com>
To: gentoo-portage-dev@l.g.o
Subject: Re: [gentoo-portage-dev] Package compression header for binhosts
Date: Tue, 01 Jun 2010 21:23:22
Message-Id: AANLkTindg00VxUmyciXXkcADXCnwYiQuY9NX40l3uOmo@mail.gmail.com
In Reply to: Re: [gentoo-portage-dev] Package compression header for binhosts by Ned Ludd
1 On Tue, Jun 1, 2010 at 1:01 PM, Ned Ludd <solar@g.o> wrote:
2
3 > On Mon, 2010-05-31 at 22:16 -0700, Brian Harring wrote:
4 > > On Mon, May 31, 2010 at 08:32:34PM -0700, Zac Medico wrote:
5 > > > Hi,
6 > > >
7 > > > In order to support alternative compression types for binhost
8 > > > packages, I was thinking about adding support for a header field in
9 > > > the Packages index file. For example, a header line like
10 > > > "PACKAGE_EXTENSION: txz" could be used to indicate that clients
11 > > > should download files with txz extensions instead of tbz2
12 > > > extensions. I'm planning to add support for both tgz [1] and txz
13 > > > extensions.
14 > > >
15 > > > [1] http://bugs.gentoo.org/show_bug.cgi?id=142579
16 > >
17 > > 1) requires a version header bump
18 >
19 > Agreed. But there were some other pending changes for "VERSION: 1"
20 >
21 > Any planned changes to the format should be documented on
22 > https://bugs.gentoo.org/show_bug.cgi?id=263994
23 >
24 >
25 > > 2) a header alone isn't useful unless it's specifiable per cpv entry;
26 > > thus it must be inheritable
27 >
28 > Per CPV entries are going to bloat the format and make me carry around
29 > more data on a per pkg basis than I'd want to. How about we run with
30 > zac's idea but use tools to convert a full repo over to $EXTENSION?
31 > This should keep the portage code fast as well, as it checks for invalid
32 > binpkgs all the time. Having to have portage process a ton of ever
33 > growing extensions is just going to be slow.
34 >
35
36 Note I said 'inheritable'; one of the main flaws w/ version 0 is that it
37 requires quite a few entries per CPV, instead of setting a default in the
38 preamble and then overriding as needed at the CPV level.
39
40 What I'm suggesting is a COMPRESSOR in the preamble, with individual CPVs
41 overriding it only when they use a different compressor.
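
To make the inheritance idea concrete, here is a minimal sketch of how a client might resolve it. It assumes the index is a preamble block followed by per-CPV blocks of "KEY: value" lines separated by blank lines; COMPRESSOR is the field proposed above, not an existing one, and the function names and sample data are hypothetical.

```python
import re

# Fields a CPV entry inherits from the preamble unless it sets its own.
# COMPRESSOR is the proposed header; it is an assumption, not current format.
INHERITABLE = ("COMPRESSOR",)


def parse_packages_index(text):
    """Split an index into (preamble, entries), each a dict of fields."""
    blocks = []
    for chunk in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in chunk.splitlines():
            if not line.strip():
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if fields:
            blocks.append(fields)
    return blocks[0], blocks[1:]


def resolve(preamble, entry):
    """Return the entry with inheritable preamble defaults filled in."""
    resolved = {k: preamble[k] for k in INHERITABLE if k in preamble}
    resolved.update(entry)  # per-CPV values win over preamble defaults
    return resolved


SAMPLE = """\
VERSION: 1
COMPRESSOR: xz

CPV: app-misc/foo-1.0

CPV: app-misc/bar-2.0
COMPRESSOR: bzip2
"""

preamble, entries = parse_packages_index(SAMPLE)
resolved = [resolve(preamble, e) for e in entries]
```

Only the second entry carries its own COMPRESSOR line, so the per-CPV cost is paid just by the packages that differ from the default.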
42
43 As for Zac's tool to try and generate new views of a repository via
44 hardlinking/recreating the tree... frankly it's a bit of a hack. Via
45 DEFAULT_URI and relying on the hash, you can make a stable repository that
46 can be updated in place without corrupting ongoing downloads. Simply put,
47 new additions to the repo don't perturb current downloads since the md5 is
48 the same (hash collision chance is low enough that I don't care about it
49 here).
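
A rough sketch of the hash-keyed storage described above, assuming a flat store directory where each binpkg is saved under its md5 hex digest. The layout and the function names are illustrative assumptions, not anything portage or pkgcore ships.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the md5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_to_store(src, store_dir):
    """Copy src into store_dir under its md5 name; return the digest.

    A file that is already present is never rewritten, so clients that
    are downloading it are not disturbed by later additions.
    """
    digest = md5_of(src)
    dest = os.path.join(store_dir, digest)
    if not os.path.exists(dest):
        tmp = dest + ".tmp"
        shutil.copyfile(src, tmp)
        os.replace(tmp, dest)  # atomic rename into the final name
    return digest
```

Adding a package only ever creates a new name in the store; existing names keep pointing at the same bytes.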
50
51
52 > > 3) PACKAGE_EXTENSION is overly verbose and unclear it's specifying
53 > > the compressor too; its intention is for compression, state it as
54 > > such (I mention this in light of URI's existence where
55 > > PACKAGE_EXTENSION would only be a hint of compressor)
56 > >
57 > > Re: #1, there is a decent set of optimizations I'm kicking around in
58 > > pkgcore for the next version- a discussion should probably be started
59 > > there.
60 > >
61 > > Offhand, having a compression specific header (a simple enumeration
62 > > of known compressors) and a DEFAULT_URI that is python string
63 >
64 > No go bro. The 'Packages' format should be independent of python.
65 >
66
67 > > interpolation assembled (for example,
68 > > DEFAULT_URI="%(host)s/%(category)s/%(pf)s.txz") seems wiser. Via
69 > > doing what I'm suggesting, it would be possible to do binpkg
70 > > repository 'views' w/out having to map each binpkg into the url space
71 > > for it.
72 >
73
74 Then come up w/ an alternative w/ the same power as DEFAULT_URI that isn't
75 python specific. Think through the potential of it: I could very easily
76 centralize the binpkgs for an arch, use the hash as their lookup value,
77 then use the Packages cache as a 'view' into that binpkg repository.
78 Differing use flag combinations, differing license views, hell, differing
79 ACCEPT_KEYWORDS, all of that can have the raw pkgs stored centrally while
80 just providing differing views into it. DEFAULT_URI lays the groundwork for
81 it.
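
For illustration, a small sketch of how a client could expand DEFAULT_URI, assuming the python-style interpolation from the example quoted earlier. The per-entry URI override, the %(md5)s key, the host name, and the function name are assumptions made for this sketch only.

```python
def package_uri(preamble, entry, host):
    """Build the download URI for one CPV entry.

    An entry-level URI (assumed field) wins; otherwise the preamble's
    DEFAULT_URI template is expanded with values from the entry.
    """
    template = entry.get("URI") or preamble["DEFAULT_URI"]
    category, _, pf = entry["CPV"].partition("/")
    values = {
        "host": host,
        "category": category,
        "pf": pf,
        "md5": entry.get("MD5", ""),  # hypothetical key for hash lookup
    }
    return template % values


# Template taken from the example in the thread.
by_name = {"DEFAULT_URI": "%(host)s/%(category)s/%(pf)s.txz"}
# Hypothetical template for a centralized, hash-keyed store.
by_hash = {"DEFAULT_URI": "%(host)s/%(md5)s"}

HOST = "http://binhost.example"
entry = {"CPV": "app-misc/foo-1.0", "MD5": "0123abcd"}

name_uri = package_uri(by_name, entry, HOST)
hash_uri = package_uri(by_hash, entry, HOST)
```

With the hash-keyed template, two Packages files that list different subsets of packages can still point at the same central store, which is the 'view' idea described above.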
