Gentoo Archives: gentoo-dev

From:	Ed Grimm <paranoid@××××××××××××××××××××××.org>
To:
Cc:	gentoo-dev@l.g.o
Subject:	Re: [gentoo-dev] A few modest suggestions regarding tree size
Date:	Fri, 15 Oct 2004 00:42:19
Message-Id:	`Pine.LNX.4.58.0410141939150.21079@ybec.rq.iarg`
In Reply to:	Re: [gentoo-dev] A few modest suggestions regarding tree size by Luke-Jr

1	On Thu, 14 Oct 2004, Luke-Jr wrote:
2	> On Thursday 14 October 2004 4:35 pm, Roman Gaufman wrote:
3	>> On Thu, 14 Oct 2004 16:30:29 +0000, Luke-Jr <luke-jr@×××××××.org> wrote:
4	>>> On Thursday 14 October 2004 2:49 pm, Ciaran McCreesh wrote:
5	>>>> On Thu, 14 Oct 2004 07:43:11 -0700 Mark Dierolf <mark@×××.com> wrote:
6	>>>>| I've been watching this discussion as far as tree size, and i'm
7	>>>>| suprised nobody has brought the idea of on-demand downloading yet.
8
9	You should've watched more closely. I did not mention true on-demand
10	downloading because I had seen in the archive the last time it was
11	discussed and dismissed. I think that what I proposed would probably be
12	quicker than downloading the metadata; it doesn't deviate from the
13	concepts that are already widely implemented in code, and it hasn't
14	been rejected four times yet.
15
16	To reiterate, my idea was that you're probably most interested in the
17	packages you've already installed, so there should be an option to sync
18	only particular files, to complement the option of not syncing
19	particular files.
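
A minimal sketch of the selective-sync idea above, assuming the installed packages are already known as "category/package" strings. The function name rsync_filter_args is hypothetical and this is not an existing Portage feature; a real sync would also need shared directories such as eclasses and profiles. It only builds rsync --include/--exclude rules:

```python
def rsync_filter_args(installed):
    """Build rsync filter arguments that sync only the given packages.

    `installed` is an iterable of "category/package" strings (an assumed
    input format for this sketch).
    """
    args = []
    seen_categories = set()
    for atom in sorted(installed):
        category, _, package = atom.partition("/")
        # rsync only descends into a directory if the directory itself
        # is included, so each category needs its own rule.
        if category not in seen_categories:
            args.append(f"--include=/{category}/")
            seen_categories.add(category)
        # Include the package directory and everything below it.
        args.append(f"--include=/{category}/{package}/")
        args.append(f"--include=/{category}/{package}/**")
    # Everything not explicitly included above is skipped.
    args.append("--exclude=*")
    return args


if __name__ == "__main__":
    for arg in rsync_filter_args(["sys-apps/portage", "app-editors/vim"]):
        print(arg)
```

The resulting list could be passed to rsync alongside the usual source and destination arguments.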
20
21	>> Huh? -- name 1 binary distribution that does that? -- all of the ones
22	>> I tried fetch a list of available packages -- which is exactly what
23	>> the portage tree provides.
24	>
25	> Why would they need a list of available packages? Such a list is
26	> useful only to the user. apt-get, ipkg, and urpmi are going to know
27	> the package name beforehand.
28
29	How do these programs accomplish that? They request a list of available
30	packages.
31
32	>>> On Thursday 14 October 2004 3:14 pm, Patrick Lauer wrote:
33	>>>> So you only have to rsync the dependency info. You save maybe 50%
34	>>>> traffic, but need some ebuild servers that will be hit by millions
35	>>>> of small requests for single ebuilds. No thanks.
36	>>>
37	>>> Actually, you don't even need to sync that. Simply download the
38	>>> primary ebuild, read the dep info, download the next one, etc. Most
39	>>> modern versions of file transfer protocols (HTTP and FTP, at least;
40	>>> don't know about rsync) support multiple transfers in a single
41	>>> connection.
42	>>
43	>> How would it know what ebuild to fetch exactly? --- just think about
44	>> that for a second.
45
46	The metadata files list dependencies, keywords, and a description. It
47	would be technically feasible to do the dependency evaluation and ebuild
48	selection for the entire ebuild session just using metadata, and have a
49	single medium-sized rsync connection per emerge run. However, I couldn't
50	code it in Python, and I can't really explain it in English.
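
As a rough illustration of resolving everything from metadata first, here is a sketch that computes the full set of ebuilds an emerge run would need before any fetch happens. The metadata layout (a dict mapping package name to a dict with a "depend" list) and the function name ebuilds_to_fetch are assumptions for this example, not Portage's actual cache format; version selection, keywords, and USE flags are ignored:

```python
from collections import deque


def ebuilds_to_fetch(requested, metadata):
    """Return every package reachable from `requested` via dependencies.

    `metadata` maps a package name to {"depend": [names...]}; this
    structure is hypothetical and stands in for the real cache files.
    """
    needed = []
    seen = set()
    queue = deque(requested)
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue
        seen.add(pkg)
        needed.append(pkg)
        # Dependencies come from the metadata only; no ebuild is read.
        queue.extend(metadata.get(pkg, {}).get("depend", []))
    return needed


if __name__ == "__main__":
    cache = {
        "app/a": {"depend": ["lib/b", "lib/c"]},
        "lib/b": {"depend": ["lib/c"]},
        "lib/c": {"depend": []},
    }
    print(ebuilds_to_fetch(["app/a"], cache))
```

The returned list is what a single connection would then be asked to transfer.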
51
52	> ebuild doesn't deal with dependencys anyway, AFAIK. emerge would need
53	> the fetching functionality and could figure out the name based on
54	> (originally) the user's specification and (for deps) the DEPEND
55	> contents themselves. Portage already needs to know what the name of
56	> the package is anyway.
57
58	ebuild files are the ultimate source of the dependency information. The
59	point in your favor is that they're not the sole repository of it;
60	someone saw fit to export that data into cache files, so one could use
61	those cache files for your goal.
62
63	> On Thursday 14 October 2004 4:41 pm, Georgi Georgiev wrote:
64	>> the part where the http and ftp internals get handled by portage
65	>> internally, instead of handling them to an external program like
66	>> wget, are the reason why the idea was dismissed as unworkable several
67	>> times before.
68	>
69	> Not really a good excuse. HTTP isn't an overly complicated protocol.
70	> Including the fetching functionality also has other advantages, such
71	> as one less program to depend on (and thus one fewer that can be
72	> broken and screw up Portage).
73
74	I think the part that probably intimidates them is where we're
75	processing a particular list of stuff, and then we decide we want to get
76	more stuff. This basically requires explicit threading to pull it off
77	properly; it also requires a mindset that can deal with threading. As
78	someone with such a mindset, I can confidently say, no one writes that
79	kind of code without good cause. As an example, email servers could
80	definitely use this type of code, but most of them, including sendmail,
81	do not use it.
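
To make the "process a list, then decide to get more" pattern concrete, here is a small sketch using Python's standard threading and queue modules. The single worker thread stands in for one connection to a mirror; the fetch callable and the name fetch_all are hypothetical, and no real network code is shown:

```python
import queue
import threading


def fetch_all(requested, fetch):
    """Fetch `requested` plus anything those fetches turn out to need.

    `fetch(name)` is a caller-supplied function (an assumption of this
    sketch) that retrieves one item and returns the names it depends on.
    Returns (fetched_names, errors).
    """
    work = queue.Queue()
    seen = set(requested)   # only touched by the worker after start
    fetched, errors = [], []

    def worker():
        while True:
            name = work.get()
            if name is None:            # sentinel: nothing left to do
                work.task_done()
                return
            try:
                deps = fetch(name)
                fetched.append(name)
            except Exception as exc:    # record and keep going
                errors.append((name, exc))
                deps = []
            for dep in deps:
                if dep not in seen:
                    seen.add(dep)
                    work.put(dep)       # more work discovered mid-run
            work.task_done()

    for name in requested:
        work.put(name)
    thread = threading.Thread(target=worker)
    thread.start()
    work.join()         # returns once every queued item is processed
    work.put(None)
    thread.join()
    return fetched, errors
```

New items are queued before the current one is marked done, so work.join() cannot return while fetches are still outstanding.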
82
83
84	Luke, do you have the coding ability to write the changes that would be
85	required to get something like this to work? I ask because I think
86	what you would need to convince anyone is a proof of
87	concept that makes at most one connection to a mirror. Until you have
88	such a thing, the prior ideas that have been discussed (which I did not
89	find, despite having found the previous discussion, as that was just
90	another "this has been discussed before" dismissal) are much firmer in
91	their minds than anything you are presenting, and I don't think you're
92	going to overcome that.
93
94	In any event, I think that you and I, and anyone else interested in
95	having this happen should get together off the list, outside of gentoo
96	discussion space. The idea is only partially formed, and none of the
97	devs are going to be convinced by anything less than a full plan that
98	addresses all of their concerns, although I think a working prototype
99	would be better. (You may think your idea is complete, but it could not
100	be coded simply on the ideas that have been discussed on this list over
101	the past couple of weeks. What we need is something thorough enough to
102	both build the code from and demonstrate to all that it won't make the
103	infrastructure hurt. By the way, the only way to do that is to prove
104	that it will actually reduce infrastructure load.)
105
106	Ed
107
108	--
109	gentoo-dev@g.o mailing list
