Gentoo Archives: gentoo-soc

From: Zac Medico <zmedico@g.o>
To: gentoo-soc@l.g.o
Subject: Re: [gentoo-soc] GSoC - cache sync/self-contained ebuilds
Date: Sun, 27 Mar 2011 19:40:07
Message-Id: 4D8F927B.6060605@gentoo.org
In Reply to: Re: [gentoo-soc] GSoC - cache sync/self-contained ebuilds by Michael Seifert
1 On 03/27/2011 07:28 AM, Michael Seifert wrote:
5 > On 24.03.2011 17:58, Zac Medico wrote:
6 >> Well, it would be inefficient to open separate TCP connections for
7 >> individual metadata files since there are so many of them and they are
8 >> so small. This is why package managers typically download the metadata
9 >> for all packages as a single bundle. For example, see the type of
10 >> metadata bundle that is used to implement PORTAGE_BINHOST support:
11 >>
12 >> http://tinderbox.dev.gentoo.org/default-linux/x86/Packages
13 >>
14 >
15 > Is there a specific reason why the PORTAGE_BINHOST metadata is different
16 > from the metadata/cache format?
17
18 They have minor differences because they were designed for slightly
19 different use cases:
20
21 The metadata/cache format was designed to be distributed together with
22 the ebuilds that it was generated from. Its main drawback is that it can
23 be slow to read many small files since it may require lots of disk
24 seeks. We could pack them all into a single file, similar to the one that
25 PORTAGE_BINHOST uses. That would help tools like eix, since it's
26 faster to read one big file than many small files. However, if it were
27 fetched earlier and separately from the ebuilds, it wouldn't be very
28 practical for dependency calculations unless you provided a way to fetch
29 exactly the same revisions of ebuilds (and inherited eclasses, which can
30 modify dependencies) that the earlier fetched metadata corresponds to.
31 For example, the cache could refer to a UUID that would be used to
32 generate a URI in order to fetch the particular revision of the ebuild/eclass
33 bundle that exactly corresponds to the cache entry.
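
As a rough illustration of the two ideas above (one packed cache file, plus an identifier that leads to the matching ebuild/eclass bundle), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the JSON layout, the `pack_cache` and `bundle_uri` helpers, the `BUNDLE_BASE_URI` host and the `.tar.xz` suffix are hypothetical and are not part of Portage or of any existing Gentoo service.

```python
import uuid
from pathlib import Path
from typing import Optional

# Hypothetical base URL, used only for this sketch.
BUNDLE_BASE_URI = "https://example.org/ebuild-bundles"


def pack_cache(cache_dir: Path, bundle_id: Optional[str] = None) -> dict:
    """Combine every small per-package cache file into one mapping."""
    if bundle_id is None:
        # Assumption: a random UUID identifies the ebuild/eclass revision.
        bundle_id = str(uuid.uuid4())
    entries = {}
    for path in sorted(cache_dir.rglob("*")):
        if path.is_file():
            # Key is the path relative to the cache root,
            # for example "app-misc/foo-1.0".
            key = path.relative_to(cache_dir).as_posix()
            entries[key] = path.read_text()
    return {"bundle_id": bundle_id, "entries": entries}


def bundle_uri(packed: dict) -> str:
    """Derive the URI of the bundle that the packed cache was generated from."""
    return f"{BUNDLE_BASE_URI}/{packed['bundle_id']}.tar.xz"
```

A client that fetched the packed cache earlier would call `bundle_uri()` on it later, so that it gets the ebuilds and eclasses matching those cache entries rather than whatever the tree contains at that moment.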
34
35 The PORTAGE_BINHOST cache format is better than the metadata/cache
36 format for the use case that it's designed for; however, the current
37 design has a race condition which has been experienced by chromium-os
38 developers:
39
40 http://code.google.com/p/chromium-os/issues/detail?id=3225
41
42 > I like the BINHOST metadata better, even if it is split up into several
43 > files, because it would already contain the ebuild version it was
44 > generated for. Probably it would be a good idea to merge the information
45 > of both metadata into a single unified format?
46 > This would also solve the problem with the missing version control
47 > (described below) as well as simplifying the way portage handles
48 > metadata. On the other hand it would be an even more substantial change,
49 > which is not necessarily a bad thing. Portage is supposed to work the
50 > same way as before – just faster.
51
52 Well, a new cache format is only part of the solution. In order to
53 provide the revision control that's necessary for practical dependency
54 calculations when the cache is fetched earlier than the
55 ebuilds/eclasses, you're also going to need to create individually
56 fetchable revisioned ebuild/eclass bundles that the cache will refer to
57 (without any race conditions).
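
One way to get revisioned bundles without a race is to never overwrite a published bundle and to publish the cache only after the bundle it refers to is in place. The sketch below assumes that approach. It names each bundle by its SHA-256 hash instead of a UUID (a swapped-in choice for this example), and the `publish` and `fetch_bundle` helpers and the `cache.json` and `bundles/` layout are hypothetical, not an existing Portage interface.

```python
import hashlib
import json
import os
from pathlib import Path


def publish(repo: Path, bundle_bytes: bytes, cache_entries: dict) -> str:
    """Publish a bundle, then the cache that refers to it (in that order)."""
    revision = hashlib.sha256(bundle_bytes).hexdigest()
    bundles = repo / "bundles"
    bundles.mkdir(parents=True, exist_ok=True)

    # 1. Store the bundle under a name that is never reused or overwritten.
    bundle_path = bundles / f"{revision}.bundle"
    if not bundle_path.exists():
        tmp = bundle_path.with_suffix(".tmp")
        tmp.write_bytes(bundle_bytes)
        os.replace(tmp, bundle_path)  # atomic rename within one filesystem

    # 2. Only then swap in the new cache index, also atomically.
    index_tmp = repo / "cache.json.tmp"
    index_tmp.write_text(
        json.dumps({"revision": revision, "entries": cache_entries})
    )
    os.replace(index_tmp, repo / "cache.json")
    return revision


def fetch_bundle(repo: Path, index: dict) -> bytes:
    """Fetch the bundle that matches a previously fetched cache index."""
    data = (repo / "bundles" / f"{index['revision']}.bundle").read_bytes()
    if hashlib.sha256(data).hexdigest() != index["revision"]:
        raise ValueError("bundle does not match the cache revision")
    return data
```

Because old bundles stay available, a client holding an older cache can still fetch exactly the revision that cache describes, even after newer revisions have been published.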
58
59 >> It's conceivable that you could simply use rsync to sync the
60 >> metadata/cache/ subdirectory from
61 >> rsync://rsync.gentoo.org/gentoo-portage/. However, since the rsync tree
62 >> constantly mutates and doesn't provide any kind of version control, it
63 >> would not be very practical to use it in this way. If you fetch the
64 >> metadata and the ebuilds separately, you need a way to guarantee that
65 >> you can fetch exactly the same revisions of ebuilds that the earlier
66 >> fetched metadata corresponds to.
76
77
78 --
79 Thanks,
80 Zac

Replies

Subject Author
Re: [gentoo-soc] GSoC - cache sync/self-contained ebuilds Michael Seifert <michael.seifert@×××.net>