Good morning,

I want to talk about improved binary package support for Gentoo. About
1-2 months ago there was already a discussion about this on gentoo-soc@
and on Bugzilla [1]. If I remember correctly, no devs were involved in
that discussion, so I thought I would post my thoughts here.

I know that Gentoo is a source-based distribution or meta-distribution,
and I don't want to turn Gentoo into another Fedora or Ubuntu, but I
think there are some things we can learn from them.

The current situation:

Binary packages are (usually) stored in
/usr/portage/packages/$category/$package-$version.tbz2. The package
consists of the "real binary package" and the metadata (combined using
xpak).

Problems I see with this:

1) If a binary package is rebuilt because it needs to be linked against
a new library, because the USE-flags changed or because the ebuild
changed without a revision bump, the "old" binary package is
overwritten. This also means that there is no way to store multiple
packages with different USE-flags without, well, using different
directories.
2) To find out which USE-flags a package was built with, one needs to
download the package and look at the metadata. Today I discovered a file
called "Packages" which looks like a metadata cache, but I did not find
more information about it (I only tried "man portage").

So, how can we address this?

First we should do something about 2), I think. I want to propose the
following scheme:

Binary packages are stored in
$arch/$description/$category/$package/$package-$version-$ev-$use-$bv.tbz2.

$arch: This is x86, ppc or whatever you put into ACCEPT_KEYWORDS, minus
the '~'. It does not make sense to distinguish between x86 and ~x86
here.
$description: Something like pentium3, core2quad, G4, or whatever.
pentium3-uclibc or pentium3-solaris-prefix are also possible.
$category, $package and $version should be clear.
$ev: The "ebuild version". See below.
$bv: The "binary version". See below.
$use: The USE-flags. See below.
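To make the naming scheme a bit more concrete, here is a minimal Python sketch that builds and parses such a path. The names BinPkgId and parse_path are made up for illustration and are not part of Portage. The sketch assumes that $ev, $use and $bv never contain a '-', so that the filename can be split from the right even when $package or $version contain one.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class BinPkgId:
    """Hypothetical container for the fields of the proposed scheme."""
    arch: str
    description: str
    category: str
    package: str
    version: str
    ev: str   # ebuild version
    use: str  # encoded USE-flags
    bv: str   # binary version

    def path(self) -> PurePosixPath:
        name = f"{self.package}-{self.version}-{self.ev}-{self.use}-{self.bv}.tbz2"
        return PurePosixPath(self.arch, self.description, self.category,
                             self.package, name)


def parse_path(path: str) -> BinPkgId:
    """Invert BinPkgId.path(); assumes ev, use and bv contain no '-'."""
    arch, description, category, package, filename = PurePosixPath(path).parts
    if not filename.endswith(".tbz2") or not filename.startswith(package + "-"):
        raise ValueError(f"not a binary package path: {path}")
    stem = filename[: -len(".tbz2")]
    # The package name is known from the directory, so strip it as a prefix;
    # the version may itself contain '-' (e.g. "-r2"), so split from the right.
    rest = stem[len(package) + 1:]
    version, ev, use, bv = rest.rsplit("-", 3)
    return BinPkgId(arch, description, category, package, version, ev, use, bv)
```

For example, BinPkgId("x86", "pentium3", "dev-lang", "php", "5.2.9-r2", "3", "Ab_", "1").path() gives x86/pentium3/dev-lang/php/php-5.2.9-r2-3-Ab_-1.tbz2, and parse_path() turns that string back into the same fields.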

About ebuild version, USE-flags and binary version:

I would like to encode the USE-flags into the filename. This enables us
to have binary packages of the same version built with different
USE-flags in the same repository. Some wanted to have this in the
directory, some say it is OK to have it in the xpak only, and some
prefer the "Packages"-like file.

I think USE-flags can be set per package and therefore should be stored
per package, not per $description or whatever. Having them only in the
xpak allows no distinction between multiple binary packages of the same
version with different USE-flags, and the same is true for the Packages
file. That file would also have to be created, downloaded all the time
and so on. Therefore I think the cleanest solution is having the
USE-flags in the filename.

There are different methods to store them there:

a) A checksum (of the USE-flags, the USE-flag string, the ebuild and the
USE-flag string, whatever).
b) List the enabled USE-flags in the filename, and use a) if the string
gets too long.
c) Use a packed binary vector.

I don't like a), because it is not easily reversible. You could always
download the Packages file or the binary package and look into the xpak
metadata, but that's too much effort. b) also has the problems I
mentioned for a). Also, you'd need some system to distinguish ebuilds
with the same version but a different set of USE-flags. You also need
that for c), so b) has no advantages over c) in my eyes.

For c) I think of the following: Sort the USE-flags in some defined way
(ASCII code, whatever) and make a vector with a 1 for every enabled
USE-flag and a 0 for every disabled USE-flag. Then compress that vector:
if you use hex code, you need 1 character for every 4 bits, but it
should be possible to find 64 different characters, in which case you
need 1 character for every 6 bits. PHP has 106 USE-flags, which would
make a USE-string of about 18-27 characters (18 with 6 bits per
character, 27 with hex). Packages with lots of USE-expand stuff like
languages would need more, but not too much, I think.
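The packed vector could look roughly like the following Python sketch. The 64-character alphabet (letters, digits, '_' and '+') is just one possible choice I picked for illustration, and encode_use/decode_use are made-up names, not existing Portage functions.

```python
import string

# One possible 64-character alphabet that is safe in filenames
# (an assumption for this sketch; any fixed set of 64 characters works).
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "_+"


def encode_use(all_flags, enabled):
    """Pack the on/off state of every USE-flag into 6 bits per character."""
    flags = sorted(all_flags)  # defined order: sorted by character code
    bits = [1 if flag in enabled else 0 for flag in flags]
    bits += [0] * (-len(bits) % 6)  # pad with zeros to a multiple of 6
    chars = []
    for i in range(0, len(bits), 6):
        value = 0
        for bit in bits[i:i + 6]:
            value = (value << 1) | bit
        chars.append(ALPHABET[value])
    return "".join(chars)


def decode_use(all_flags, encoded):
    """Recover the set of enabled USE-flags from an encoded string."""
    flags = sorted(all_flags)
    bits = []
    for ch in encoded:
        value = ALPHABET.index(ch)
        bits.extend((value >> shift) & 1 for shift in range(5, -1, -1))
    # zip() stops at the end of the flag list, which drops the padding bits
    return {flag for flag, bit in zip(flags, bits) if bit}
```

With 106 flags this gives 18 characters. Note that decoding only works if you know exactly the flag list that was used for encoding.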

Problems: The string might get long, and you get big problems with
USE-flag renames, additions or removals. That's where the ebuild version
is needed. Or not. We have 3 possibilities:

a) Change policy: USE-flag changes in an ebuild need a version bump.
b) Use a checksum of the ebuild.
c) Use the version given by the version control system.

The problem with a) is that it is a change in policy and probably hard
to do. Increasing the revision for a (trivial) change leads to a lot of
unnecessary rebuilds for users. It also means that USE-flag changes in
eclasses are difficult: the eclass should probably be copied over to a
new name with a version, and only ebuilds with a new version (revision)
are allowed to use it.

The problem with b) is that it is not ordered. You don't know which is
the newest version. If you have an ebuild version for which there is no
binary package, it gets difficult/ugly.

c) also has problems: When using CVS, versions are easily available. The
same is true for svn, but lots of distributed version control systems
like git use checksums as versions. Welcome back to b). Another question
is how we get to the versions. Will they be in the header forever, given
that they make signing ebuilds or the manifest much more complicated
(multiple commits necessary)? But, well, since metadata is generated and
provided by "the tree", it should not be too hard to add a unique ebuild
version there (in the case of checksums, use an integer and increase it
whenever the checksum changes, or something). It just might make using
overlays a bit more difficult.
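The "integer that increases whenever the checksum changes" idea could be sketched like this in Python. EbuildVersionRegistry is a hypothetical name, SHA-256 is an arbitrary choice of checksum, and in practice the state would have to be stored wherever the tree's metadata is generated.

```python
import hashlib


class EbuildVersionRegistry:
    """Hypothetical helper: maps each ebuild to an increasing integer version."""

    def __init__(self):
        # ebuild key (e.g. "dev-lang/php-5.2.9") -> (last checksum, version)
        self._state = {}

    def version_for(self, key: str, ebuild_text: str) -> int:
        checksum = hashlib.sha256(ebuild_text.encode("utf-8")).hexdigest()
        last = self._state.get(key)
        if last is not None and last[0] == checksum:
            return last[1]  # unchanged ebuild keeps its version
        # first sighting starts at 1, every change increases by 1
        version = 1 if last is None else last[1] + 1
        self._state[key] = (checksum, version)
        return version
```

Unlike a bare checksum, the resulting number is ordered, so it is clear which ebuild version is the newest. Since only the last checksum is remembered, reverting an ebuild to an older content still gets a new, higher number.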

The last thing to be described is the binary version. Lots of people
talk about dependencies on other binary packages when they talk about
binary packages for Gentoo, but that gets quite difficult (and, in my
opinion, ugly). We mostly need to provide a "consistent set" of
packages, which means that if A depends on B, and B changes and thereby
breaks A, we need to provide an updated version of A. And we can do that
by simply increasing the binary version, since the package manager then
knows that this package needs updating, too.
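As a rough Python sketch of that idea: when B changes, every package that directly depends on B gets its binary version increased. The function name and the data layout are made up for illustration. The sketch also assumes, conservatively, that every direct reverse dependency is broken by the change; a real build server would want a better check.

```python
def bump_reverse_deps(depends_on, binary_versions, changed):
    """Return new binary versions after `changed` was rebuilt incompatibly.

    depends_on:      package -> set of packages it depends on
    binary_versions: package -> current binary version (integer)
    changed:         the package that changed (B in the text above)
    """
    bumped = dict(binary_versions)
    for package, deps in depends_on.items():
        if changed in deps:
            # A depends on B, so A is rebuilt and gets a higher binary version
            bumped[package] = bumped.get(package, 0) + 1
    return bumped
```

With depends_on = {"A": {"B"}, "C": {"D"}} and all binary versions at 1, a change in B results in A having binary version 2 while B and C stay at 1.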

How to create binary packages?

Create some build server (or build server infrastructure). The most
important thing is a script or something that provides the
functionality. One enters a make.conf, an /etc/portage dir, the path to
the profile, a description and whatever else is needed, and the system
starts building. Then you can create a second set of data and start
building again; the system puts the binary packages in the same
directory and discovers what needs to be built and what does not
(because apache needs to be built only once if its USE-flags are the
same for the different configuration sets).
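The "discover what needs to be built" part could work roughly like this Python sketch. plan_builds and the layout of config_sets are made up for illustration, and the sketch assumes that all configuration sets share the same $arch and $description, so that a package is identified by its name, version and USE-flags alone.

```python
def plan_builds(config_sets, already_built=frozenset()):
    """Work out which (package, USE-flags) combinations still need building.

    config_sets:   config name -> {package atom: set of enabled USE-flags}
    already_built: set of (package atom, frozenset of USE-flags) that exist
    Returns a dict mapping each needed build to the configs that want it.
    """
    jobs = {}
    for config_name, packages in config_sets.items():
        for atom, use in packages.items():
            key = (atom, frozenset(use))
            if key in already_built:
                continue  # a matching binary package is already there
            # identical package + USE-flags from another config: build once
            jobs.setdefault(key, []).append(config_name)
    return jobs
```

If a "desktop" and a "server" configuration both want apache with the same USE-flags but php with different ones, this yields three build jobs instead of four.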

But there are thousands of packages and millions of USE-flag
combinations!

Seriously, who cares? The goal of this project (as it exists in my head)
is not to provide everything. It is to provide the most used packages.
If you need parrot, compile it yourself. If you need netbeans, compile
it yourself. We have @system, gnome, kde and another handful of
packages, which will change over time. I'm really looking forward to
the data collected by the statistics project (GSoC).

The same is true for USE-flags: We might provide gnome, kde, both, a
server profile and whatever we decide to provide, but not everything.
Again, statistics will help.

Same with CFLAGS. Probably no -O3, no -ffast-math, no -break-my-code or
whatever. Probably x86 with 32 and 64 bit for the beginning, later maybe
more.

So, the really really cool thing is that if you are some company,
university, institution or freak with lots of (similar) Gentoo boxes,
you can set up a build server and even share the binary packages, if you
want. It has the same level of security as non-official overlays, but if
the University of FooBar in Jamaica uses it, there should not be too
many security problems.

Thanks for reading, please discuss. I probably forgot lots of stuff, but
I can add it later in the discussion.

Philipp

[1] https://bugs.gentoo.org/show_bug.cgi?id=150031