Gentoo Archives: gentoo-soc

From:	Jeremy Olexa <darkside@g.o>
To:	gentoo-soc@l.g.o
Subject:	Re: [gentoo-soc] [GSoC-status] Tree-wide collision checking and files database
Date:	Fri, 12 Jun 2009 15:05:44
Message-Id:	`90b936c0906120805m18cc4f55j3b33d0d17d855970@mail.gmail.com`
In Reply to:	[gentoo-soc] [GSoC-status] Tree-wide collision checking and files database by Stanislav Ochotnicky

1	On Fri, Jun 12, 2009 at 8:32 AM, Stanislav
2	Ochotnicky<sochotnicky@×××××.com> wrote:
3	> Hi everyone,
4	>
5	> some of you already know that work on the GSoC project "Tree-wide collision
6	> checking and provided files database" started a few weeks ago.
7	> For the rest, here is a short introduction to this project (collagen)
8	> and its goals.
9	>
10	> Collagen aims to improve the quality of ebuilds in the portage tree. It does
11	> this by compiling as many ebuilds as possible. It specifically takes
12	> into account the various atoms in the DEPEND variable. For example, if a package
13	> ebuild states that it needs =dev-libs/glib-2*, that package should be
14	> compilable with every version of glib-2* in portage (taking into account
15	> keywords). Therefore collagen will install one version of glib-2*, then the
16	> ebuild in question, collect information, and uninstall the ebuild and that
17	> glib version. It repeats this process for every glib-2* in the tree.
18
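The install, build, collect, uninstall cycle described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `build_against_each_dep_version`, `install` and `uninstall` are hypothetical names, not part of collagen or Portage, and the callables stand in for whatever mechanism a tinderbox would actually use to merge and unmerge packages.

```python
from typing import Callable, Dict, Iterable


def build_against_each_dep_version(
    package: str,
    dep_atoms: Iterable[str],
    install: Callable[[str], bool],
    uninstall: Callable[[str], None],
) -> Dict[str, bool]:
    """Try to build `package` once per dependency atom and record the outcome.

    `install` and `uninstall` are hypothetical callables supplied by the caller;
    `install` returns True when the merge succeeded.
    """
    results: Dict[str, bool] = {}
    for dep in dep_atoms:
        if not install(dep):
            # The dependency itself could not be installed, so the package
            # cannot be tested against it.
            results[dep] = False
            continue
        try:
            built = install(package)
            results[dep] = built
            if built:
                uninstall(package)
        finally:
            # Always remove this dependency version before trying the next one.
            uninstall(dep)
    return results
```

The cost grows with the number of dependency versions times the number of packages, which is the efficiency concern raised in the reply.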
19	Testing against every version of the deps as required seems like it is
20	diverging from the original "Tree-wide collision checking and provided
21	files database" - Would you say that the goal of this project is
22	becoming more QA orientated? Something like: "Matchbox: A tinderboxen
23	master server to provide QA for ebuilds"
24
25	If you were strictly collision checking, then you wouldn't care about
26	every version of glib-2*; you would only care about the package in question
27	and what installed files it provides. However, for the provided files
28	database, you do care about every version of glib-2*: not for the sake of
29	the other package, but to list the installed files of each glib-2* version.
30
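To make the distinction concrete, strict collision checking only needs a mapping from each package to the files it installs. A minimal sketch, assuming the file lists have already been collected somehow (the function name `find_collisions` and the input shape are illustrative assumptions, not the project's data model):

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Set


def find_collisions(
    files_by_package: Mapping[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return every path that is installed by more than one package."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for package, paths in files_by_package.items():
        for path in paths:
            owners[path].add(package)
    # Keep only the paths with two or more distinct owners.
    return {
        path: sorted(packages)
        for path, packages in owners.items()
        if len(packages) > 1
    }
```

The same mapping, kept per package version, would double as the provided files database.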
31	After writing that down, I can see why you want to compile, check,
32	uninstall, re-compile, repeat...but I worry about how efficient it is
33	and wonder how it could be improved.
34
35	>
36	> Original idea was to have two sides:
37	> * master server (matchbox)
38	> * slaves compiling packages (tinderboxes)
39	>
40	> Master server decides what needs to be compiled (automatically or
41	> semi-automatically). Tinderbox asks for job, master provides package
42	> name (and optionally version). Tinderbox then goes and tries to compile
43	> package with different sets of dependencies reporting results to
44	> Matchbox.
45	>
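The job hand-out between Matchbox and a tinderbox might look something like the following in-process sketch. Everything here is an assumption made for illustration: the class and method names are hypothetical, and a real deployment would need a network protocol between the two sides, which is left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Job:
    package: str                   # e.g. "dev-libs/glib"
    version: Optional[str] = None  # optional, as in the description above


class Matchbox:
    """Hypothetical master: hands out jobs and collects results."""

    def __init__(self, jobs: Iterable[Job]) -> None:
        self._pending: Deque[Job] = deque(jobs)
        self.results: Dict[Job, List[Tuple[str, bool]]] = {}

    def next_job(self) -> Optional[Job]:
        """Called by a tinderbox asking for work; None means nothing is left."""
        return self._pending.popleft() if self._pending else None

    def report(self, job: Job, dep_set: str, success: bool) -> None:
        """Called by a tinderbox after one build attempt with one dependency set."""
        self.results.setdefault(job, []).append((dep_set, success))
```

A tinderbox would call `next_job()` in a loop, build the package against each dependency set, and call `report()` once per attempt.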
46	> It seems that the whole process could be sped up by hosting binary
47	> packages on one central server (Binary host). Obviously various versions
48	> of the same package would be created, and therefore unique names could be
49	> created by using some metadata to create the hash part of the filename. At
50	> first thought, I would use USE flags and DEPEND as the metadata to hash.
51
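One way the hashed filename idea could work is sketched below. The naming scheme, the choice of SHA-1, the digest length and the function name `binpkg_name` are all assumptions for illustration; this is not how Portage names binary packages.

```python
import hashlib
from typing import Iterable


def binpkg_name(
    pf: str,
    use_flags: Iterable[str],
    depend: Iterable[str],
    digest_len: int = 12,
) -> str:
    """Derive a filename for one build variant of a package.

    `pf` is the package name with version and revision, e.g. "glib-2.18.4-r2".
    USE flags and DEPEND atoms are sorted and de-duplicated first so that the
    same configuration always hashes to the same name.
    """
    canonical = (
        "USE=" + " ".join(sorted(set(use_flags)))
        + "\nDEPEND=" + " ".join(sorted(set(depend)))
    )
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:digest_len]
    return f"{pf}-{digest}.tbz2"
```

Two builds with the same USE flags and dependencies map to the same file, so a tinderbox could check the binary host for an existing package before compiling.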
52	This is a cool aspect of the project, I hope you can work with solar
53	and zmedico to improve binpkgs. USE flags seem to be the trouble spot
54	of binpkgs.
55
56	>
57	> So far two other projects came to light as possible source of
58	> inspiration and/or collaboration:
59	> * catalyst (mainly tinderbox generating part)
60	> * AutotuA (automatic generic job framework)
61	>
62	> Especially AutotuA seems like good candidate for merging.
63	>
64	> It doesn't seem possible to compile every project with every version of
65	> every dependency, therefore I'd like to ask for your opinion especially
66	> about this part. One idea I had was to restrict testing to the highest
67	> revision of a given version. For example, we have
68	> glib-2.18.4-r1 and glib-2.18.4-r2, therefore we will only test against
69	> glib-2.18.4-r2 and will assume that r1 would be OK too (or users would
70	> upgrade since it's a bugfix release)
71
72	IMO, you have two choices. Latest stable or latest ~arch. Stable users
73	will not upgrade from glib-2.18.4-r1 to -r2 until -r2 is stable so
74	that argument is out.
75
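The "highest revision per version" rule from the quoted proposal can be sketched in a few lines. This is a simplification under stated assumptions: the regular expression only recognises a trailing `-rN` suffix and does not implement full Gentoo version comparison, the name `highest_revisions` is hypothetical, and filtering by keyword (latest stable versus latest ~arch, as argued above) would have to happen before this step.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Splits "glib-2.18.4-r2" into base "glib-2.18.4" and revision "2".
_REVISION = re.compile(r"^(?P<base>.+?)(?:-r(?P<rev>\d+))?$")


def highest_revisions(pvrs: Iterable[str]) -> List[str]:
    """Keep only the highest -rN revision for each package version."""
    best: Dict[str, Tuple[int, str]] = {}
    for pvr in pvrs:
        match = _REVISION.match(pvr)
        if match is None:
            raise ValueError(f"cannot parse {pvr!r}")
        base = match.group("base")
        rev = int(match.group("rev") or 0)  # no suffix means revision 0
        if base not in best or rev > best[base][0]:
            best[base] = (rev, pvr)
    return [pvr for _, pvr in best.values()]
```

For the example in the quoted text, `glib-2.18.4-r1` and `glib-2.18.4-r2` collapse to `glib-2.18.4-r2` only.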
76	>
77	> Another approach to optimizing use of resources would be to have a
78	> priority list of packages that need most testing. I imagine this could
79	> be created by analyzing logs from gentoo mirrors, and figuring out which
80	> packages are downloaded most frequently.
81
82	Mirror log analysis is a fundamentally hard thing to do given the vast
83	network of mirrors that we have.
84
85	>
86	> We would probably need at least one tinderbox per glibc version if I am
87	> not mistaken since this cannot be freely up/downgraded.
88
89	It's free to upgrade ;) Can't downgrade. Given how large the glibc
90	tracker bugs get, I don't think this project should use the latest
91	glibc available. Unless you are trying to hunt down bugs, but I think
92	you will get buried with compile failures. If the goal of this project
93	is to data mine information about installed packages, that is not
94	dependent on a glibc version. Please think about this some more before
95	going down that road, I want this project to be successful ;)
96
97	-Jeremy
98
99
100	> This email was meant just as a teaser, more information (data model, UML
101	> diagrams) is available on project website (look for Documents):
102	> http://soc.gentooexperimental.org/projects/show/collision-database
103	>
104	> I'd love to hear some suggestions, opinions and criticism. You can
105	> use this thread, or even various options on gentooexperimental.org.
106	>
107	> --
108	> Stanislav Ochotnicky
109	> Working for Gentoo Linux http://www.gentoo.org
110	> Implementing Tree-wide collision checking and provided files database
111	> http://soc.gentooexperimental.org/projects/show/collision-database
112	> Blog: http://inputvalidation.blogspot.com/search/label/gsoc
113	>
114	>
115	> jabber: sochotnicky@×××××.com
116	> icq: 74274152
117	> PGP: https://dl.getdropbox.com/u/165616/sochotnicky-key.asc
118	>

Replies

Subject	Author
Re: [gentoo-soc] [GSoC-status] Tree-wide collision checking and files database	Eitan Mosenkis <eitan@××××××××.net>
Re: [gentoo-soc] [GSoC-status] Tree-wide collision checking and files database	Stanislav Ochotnicky <sochotnicky@×××××.com>
Re: [gentoo-soc] [GSoC-status] Tree-wide collision checking and files database	Arne Babenhauserheide <arne_bab@×××.de>
