Gentoo Archives: gentoo-dev

From: Philipp Riegger <lists@××××××××××××.de>
To: gentoo-dev@l.g.o
Subject: [gentoo-dev] better support for binary packages
Date: Mon, 25 May 2009 09:54:58
Message-Id: 1243245294.7098.63.camel@hspc31.informatik.uni-stuttgart.de
1 Good morning,
2
3 I want to talk about improved binary package support for Gentoo. About
4 1-2 months ago there already was a discussion about this on gentoo-soc@
5 and on bugzilla [1]. If I remember correctly, there were no devs
6 involved in the discussion, so I thought I'd post my thoughts here.
7
8 I know that Gentoo is a source-based distribution or meta-distribution,
9 and I don't want to make Gentoo another Fedora or Ubuntu, but I think
10 there are some things we can learn from them.
11
12 The current situation:
13
14 Binary packages are (usually) stored
15 in /usr/portage/packages/$category/$package-$version.tbz2. The package
16 consists of the "real binary package" and the metadata (combined using
17 xpak or whatever).
18
19 Problems I see with this:
20
21 1) If a binary package is built because it needs to be linked against a
22 new library, because the USE-flags change or because the ebuild changes
23 without a revision bump, the "old" binary package is overwritten. This
24 also means that there is no support to store multiple packages with
25 different USE-flags without, well, using different directories.
26 2) To find out which USE-flags a package is built with, one needs to
27 download the package and look at the metadata. Today I discovered a file
28 called "Packages" which looks like a metadata cache, but I did not find
29 more information about it (only tried "man portage").
30
31 So, how can we address this?
32
33 First we should do something about 2), I think: I want to propose the
34 following scheme:
35
36 Binary packages are stored in
37 $arch/$description/$category/$package/$package-$version-$ev-$use-$bv.tbz2.
38
39 $arch: This is x86, ppc or whatever you put into ACCEPT_KEYWORDS minus
40 the '~'. It does not make sense to make a distinction here.
41 $description: Something like pentium3, core2quad, G4, or whatever.
42 Pentium3-uclibc, Pentium3-solaris-prefix are also possible.
43 $category, $package and $version should be clear.
44 $ev: The "ebuild version". See below.
45 $bv: The "binary version". See below.
46 $use: The USE-flags. See below.
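As a rough illustration of the scheme above, here is a minimal Python sketch that assembles such a path. The function name and all sample values (apache, the "Bq0" USE string, the version numbers) are made up for the example; nothing like this exists in Portage as far as this proposal is concerned.

```python
from pathlib import PurePosixPath


def binpkg_path(arch, description, category, package, version, ev, use, bv):
    """Build the proposed repository-relative path for one binary package.

    ev  = ebuild version, use = encoded USE-flags, bv = binary version.
    All three are passed as ready-made strings here.
    """
    filename = f"{package}-{version}-{ev}-{use}-{bv}.tbz2"
    return PurePosixPath(arch, description, category, package, filename)


# Hypothetical sample values, only to show the resulting layout.
path = binpkg_path("x86", "pentium3", "www-servers", "apache",
                   "2.2.11", "3", "Bq0", "1")
print(path)
```

Two packages that differ only in their USE-flags end up as two different files in the same directory, which is the point of the scheme.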
47
48 About ebuild version, USE-flags and binary version:
49
50 I would like to encode the USE-flags into the filename. This enables us
51 to have binary packages of the same version built with different
52 USE-flags in the same repository. Some wanted to have this in the
53 directory, some say it is ok to have it in the xpak only and some prefer
54 the "Packages"-like file.
55
56 I think, USE-flags can be set per package and therefore should be stored
57 per package, not per $description or whatever. Having it only in the
58 xpak allows no distinction between multiple binary packages, same
59 version, different USE-flags and the same is true for the Packages file.
60 That file would also have to be regenerated and downloaded all the time. Therefore
61 I think the cleanest solution is having USE-flags in the filename.
62
63 There are different methods to store it there.
64
65 a) A checksum (of the USE-flags, the USE-flag string, the ebuild and the
66 USE-flag string, whatever).
67 b) List the enabled USE-flags in the filename, use a) if the string gets
68 too long.
69 c) Use a packed binary vector.
70
71 I don't like a), because it is not easily reversible. You could always
72 download the Packages file or the binary package and look into the xpak
73 metadata, but that's too much effort. b) also has the problems I
74 mentioned for a). Also, you'd need some system to distinguish ebuilds
75 with the same version but different USE-flags. You also need that for
76 c), so b) has no advantages over c) in my eyes.
77
78 For c) I think of the following: Sort the USE-flags in some defined way
79 (ASCII code, whatever) and make a vector with a 1 for every enabled
80 USE-flag and a 0 for every disabled USE-flag. Compress that vector: If
81 you use HEX code, you need 1 character for every 4 bits, but it should
82 be possible to find 64 different characters, then you need 1 character
83 for every 6 bits. PHP has 106 USE-flags, that would make a USE-string
84 with about 18-27 characters. Packages with lots of USE-expand stuff like
85 languages would need more, but not too much, I think.
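A minimal Python sketch of option c), assuming plain ASCII sorting of the flag names and a 64-character alphabet of letters, digits, "+" and "_" (the alphabet is my choice for the example; "-" is avoided because it already separates the filename fields). The flag names used below are hypothetical.

```python
import string

# 64 filename-safe characters, one per 6 bits. This alphabet is an assumption.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+_"


def encode_use(all_flags, enabled):
    """Pack the enabled/disabled state of every flag into a short string."""
    flags = sorted(all_flags)                      # defined order: ASCII sort
    bits = [1 if flag in enabled else 0 for flag in flags]
    bits += [0] * (-len(bits) % 6)                 # pad to a multiple of 6 bits
    chars = []
    for i in range(0, len(bits), 6):
        value = 0
        for bit in bits[i:i + 6]:
            value = (value << 1) | bit
        chars.append(ALPHABET[value])
    return "".join(chars)


def decode_use(all_flags, encoded):
    """Recover the set of enabled flags; needs the same flag list as encode_use."""
    flags = sorted(all_flags)
    bits = []
    for char in encoded:
        value = ALPHABET.index(char)
        bits.extend((value >> shift) & 1 for shift in range(5, -1, -1))
    return {flag for flag, bit in zip(flags, bits) if bit}


# Hypothetical flag list for one ebuild.
iuse = ["apache2", "debug", "doc", "ldap", "ssl", "static", "threads"]
use_string = encode_use(iuse, {"apache2", "ssl", "threads"})
assert decode_use(iuse, use_string) == {"apache2", "ssl", "threads"}

# 106 flags fit into ceil(106 / 6) = 18 characters.
print(len(encode_use([f"flag{i}" for i in range(106)], set())))
```

Note that decoding only works if both sides agree on the exact flag list, which is why any rename, addition or removal of a USE-flag changes the meaning of the string.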
86
87 Problems: The string might get long, you get big problems with USE-flag
88 renames, USE-flag additions or removals. That's where the ebuild version
89 is needed. Or not. We have 3 possibilities:
90
91 a) Change policy: USE-flag changes in an ebuild need a version bump.
92 b) Use a checksum of the ebuild.
93 c) Use the version given by the version control system.
94
95 The problem with a) is that it is a change in policy and probably hard to
96 do. Increasing the revision for a (trivial) change leads to a lot of
97 unnecessary rebuilds for users. It also means that USE-flag changes in
98 eclasses are difficult: the eclass should probably be copied over to a new
99 name with version and only ebuilds with a new version (revision) are
100 allowed to use it.
101
102 The problem with b) is that it is not ordered. You don't know which is
103 the newest version. If you have an ebuild version for which there is
104 no binary package, it gets difficult/ugly.
105
106 c) also has problems: When using cvs, there are versions easily
107 available. The same is true for svn, but lots of distributed version
108 control systems like git use checksums as versions. Welcome back to b).
109 Another thing is, how do we get to the versions? Will they be in the
110 header forever, since they make signing ebuilds or the manifest much
111 more complicated (multiple commits necessary)? But, well, since metadata
112 is generated and provided by "the tree", it should not be too hard to add
113 a unique ebuild version there (in the case of checksums, use an integer,
114 increase whenever the checksum changed or something). It just might make
115 using overlays a bit more difficult.
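The "use an integer, increase whenever the checksum changed" idea could look roughly like the following Python sketch. The class name, the in-memory dictionary and the choice of SHA-256 are assumptions for illustration; a real implementation would have to persist the counters somewhere in the tree's metadata.

```python
import hashlib


class EbuildVersionRegistry:
    """Hand out an ordered integer ebuild version per package version."""

    def __init__(self):
        # Maps "category/package-version" to (last seen checksum, ebuild version).
        self._state = {}

    def ebuild_version(self, cpv, ebuild_text):
        digest = hashlib.sha256(ebuild_text.encode("utf-8")).hexdigest()
        previous = self._state.get(cpv)
        if previous is None:
            ev = 1                      # first time this ebuild is seen
        elif previous[0] == digest:
            return previous[1]          # unchanged ebuild keeps its version
        else:
            ev = previous[1] + 1        # content changed: next integer
        self._state[cpv] = (digest, ev)
        return ev


registry = EbuildVersionRegistry()
first = registry.ebuild_version("dev-lang/php-5.2.9", 'IUSE="gd"')
same = registry.ebuild_version("dev-lang/php-5.2.9", 'IUSE="gd"')
changed = registry.ebuild_version("dev-lang/php-5.2.9", 'IUSE="gd xml"')
print(first, same, changed)
```

Unlike a bare checksum, the resulting numbers are ordered, so it is clear which binary package belongs to the newest state of the ebuild.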
116
117 The last thing to be described is the binary version. Lots of people
118 talk about dependencies to other binary packages when they talk about
119 binary packages for Gentoo, but that gets quite difficult (and, in my
120 opinion, ugly). We mostly need to provide a "consistent set" of
121 packages, which means, if A depends on B, B changes and therefore breaks
122 A, we need to provide an updated version of A. And we can do that with
123 simply increasing the binary version, since the package manager knows
124 then, that this package needs updating, too.
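A small Python sketch of that bump, under the assumption that the build server keeps a table of binary versions and a reverse-dependency map (both hypothetical data structures, as are the package names). It only bumps the direct dependents; deciding which of them are really broken is left out.

```python
def bump_dependents(binary_versions, reverse_deps, changed):
    """Increase the binary version of every package that depends on `changed`.

    binary_versions: dict mapping package name -> current binary version
    reverse_deps:    dict mapping package name -> packages depending on it
    Returns the sorted list of packages that need to be rebuilt.
    """
    to_rebuild = sorted(reverse_deps.get(changed, ()))
    for dependent in to_rebuild:
        binary_versions[dependent] = binary_versions.get(dependent, 0) + 1
    return to_rebuild


# Hypothetical example: A depends on B, and B changes.
binary_versions = {"A": 1, "B": 1, "C": 4}
reverse_deps = {"B": {"A"}}
rebuilt = bump_dependents(binary_versions, reverse_deps, "B")
print(rebuilt, binary_versions)
```

Because only the binary version in the filename changes, the package manager can see that A needs updating without any dependency information between binary packages.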
125
126 How to create binary packages?
127
128 Create some build server (or build server infrastructure). The most
129 important thing is a script or something that provides the
130 functionality. One enters a make.conf, /etc/portage dir, path to the
131 profile, description and whatever else is needed and the system starts
132 building. Then you can create a second set of data and start building
133 and the system puts the binary packages in the same directory and
134 discovers what needs to be built and what not (because apache needs to
135 be built only once if its USE-flags are the same for the different
136 configuration sets).
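The "build only once" part can be sketched in Python as follows. It assumes each configuration set has already been resolved to a mapping from package version to its enabled USE-flags, and that all sets share the same $arch and $description; the set names and package versions are invented for the example.

```python
def plan_builds(config_sets, already_built=()):
    """Return the builds that are still needed, skipping duplicates.

    config_sets:   dict mapping set name -> {package-version: set of USE-flags}
    already_built: iterable of (package-version, frozenset of USE-flags) keys
    """
    built = set(already_built)
    plan = []
    for set_name, packages in config_sets.items():
        for cpv, use in packages.items():
            key = (cpv, frozenset(use))
            if key not in built:        # same package, same USE-flags: reuse
                built.add(key)
                plan.append((set_name, cpv, sorted(use)))
    return plan


# Hypothetical configuration sets.
config_sets = {
    "desktop": {"www-servers/apache-2.2.11": {"ssl"},
                "dev-lang/php-5.2.9": {"gd"}},
    "server": {"www-servers/apache-2.2.11": {"ssl"},
               "dev-lang/php-5.2.9": {"cli"}},
}
for build in plan_builds(config_sets):
    print(build)
```

Here apache is built once because both sets use the same USE-flags, while php is built twice.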
137
138 But there are thousands of packages and millions of USE-flag
139 combinations!
140
141 Seriously, who cares? The goal of this project (as it exists in my head)
142 is not to provide everything. It is to provide the most used packages.
143 If you need parrot, compile it yourself. If you need netbeans, compile
144 it yourself. We have @system, gnome, kde and another handful of
145 packages, which will change over time. I'm really looking forward to
146 the data collected by the statistics project (GSoC).
147
148 The same is true for USE-flags: We might provide gnome, kde, both, a
149 server profile and whatever we decide to provide, but not everything.
150 Again, statistics will help.
151
152 Same with CFLAGS. Probably no -O3, no -ffast-math, no -break-my-code or
153 whatever. Probably x86 with 32 and 64 bit for the beginning, later maybe
154 more.
155
156 So, the really really cool thing is, that if you are some company,
157 university, institution or freak, with lots of (similar) Gentoo boxes,
158 you can set up a build server and even share the binary packages, if you
159 want. Same level of security as non-official overlays, but if the
160 University of FooBar in Jamaica uses it, there should not be too many
161 security problems.
162
163 Thanks for reading, please discuss, I probably forgot lots of stuff, but
164 I can tell it later in the discussion.
165
166 Philipp
167
168
169
170
171 [1] https://bugs.gentoo.org/show_bug.cgi?id=150031
