Gentoo Archives: gentoo-soc

From:	Nirbheek Chauhan <nirbheek.chauhan@×××××.com>
To:	gentoo-soc@l.g.o
Subject:	[gentoo-soc] Progress report - AutotuA (formerly "Automate it All")
Date:	Mon, 16 Jun 2008 06:58:43
Message-Id:	`8b4c83ad0806152358ydc58abwe57a9e87bb342bc3@mail.gmail.com`

1	It's only been a little more than a week since I started working on
2	the project (due to personal reasons), and my work-done-to-time-spent ratio
3	is extremely bad, so I'm sorry but the progress isn't as much as I had
4	hoped.
5
6	The idea has undergone significant changes in the time passed, and
7	thanks to Patrick's guidance (and constant cluebats), I now have a far
8	more clear-cut idea of how the whole thing will come together. I
9	wonder whether I should describe the project blueprint that we've come
10	up with, the path that led to it, or the code I have written so far. I
11	suppose the progress of the code written cannot be judged unless one
12	knows the whole plan, and the path taken to come up with the plan is
13	largely irrelevant :)
14
15	The general idea has changed somewhat from the abstract:
16
17	As before, there is a master server which acts as a storage area, manages
18	all the slaves and does various bookkeeping. This part will be written
19	in Django.
20
21	The concept of the slave has changed radically to allow for a less
22	steep learning curve. The project described "jobs" which consist of
23	executables stored on the master-server which could be fetched and run
24	by the slaves. We thought of ways in which we could describe
25	dependencies between the slaves, and the most obvious answer to me was
26	an XML format (much to the disgust of Patrick).
27
28	However, there were numerous problems with such an approach (the least of
29	which was the overhead involved with parsing XML and the jing deps on
30	the Django side for it). The most serious of these was the fact that
31	learning a new XML format and writing custom executables (scripts or
32	otherwise) which communicate with the server via the Slave's bindings
33	has an extremely steep learning curve, and will cause chaos. The
34	project is useless if no one ends up using it, or it gets too
35	complicated to use.
36
37	The solution came to me in the form of a "Doh!" moment as I was
38	cycling back to my room. The answer was -- "jobuilds". Bash scripts
39	are easily adaptable, easy to understand and use (for Gentoo devs),
40	and their parsing is well-understood. For the second time in my life,
41	I appreciated the ingenuity of the inventors of the ebuild format.
42
43
44	Jobuilds:
45	----------
46	A jobuild is the smallest possible "quantum" of work. A job consists
47	of a root jobuild which has dependencies on other jobuilds, and all
48	these taken together form a job. The format of a jobuild is:
49	http://pastebin.osuosl.org/8355
50
51	- The four phases are to be run (by default) in the chroot where the
52	job will take place.
53	- SRC_URI are programs: test suites etc which are required by the
54	jobuild (does not include the deps which will be pulled in by emerge
55	in the chroot).
56	- PORTCONF_URI are tarballs which will contain portage config files
57	(/etc/portage/*, /etc/make.conf, etc.)
58	- DEPEND are other jobuilds on which this jobuild _hard_ depends, i.e.
59	they must be completed in the same chroot (example: Test Amarok
60	depends on Build KDE which depends on Build X)
61	- SIDEPEND are SuperImpose Depends, all we need to know is that those
62	jobuilds completed successfully somewhere so that further
63	distribution of work is possible. (example: testing if all the
64	packages that import gnome2.eclass still work after some changes to
65	it)
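
As a rough illustration of the fields listed above, the Slave could hold a jobuild's metadata in a small Python structure like the one below. This is only a sketch, not the actual jobuild format (that is in the pastebin link): the class name `JobuildMeta`, the helper `same_chroot_closure`, and the example jobuild names are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class JobuildMeta:
    """Hypothetical in-memory view of one jobuild's metadata."""
    name: str
    src_uri: list[str] = field(default_factory=list)       # SRC_URI: test suites etc.
    portconf_uri: list[str] = field(default_factory=list)  # PORTCONF_URI: portage config tarballs
    depend: list[str] = field(default_factory=list)        # DEPEND: must complete in the same chroot
    sidepend: list[str] = field(default_factory=list)      # SIDEPEND: must have succeeded somewhere


def same_chroot_closure(root: str, tree: dict[str, JobuildMeta]) -> set[str]:
    """Collect every jobuild reachable from `root` through DEPEND only."""
    seen: set[str] = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(tree[name].depend)
    return seen


# Made-up names mirroring "Test Amarok depends on Build KDE which depends on Build X".
tree = {
    "build-x": JobuildMeta("build-x"),
    "build-kde": JobuildMeta("build-kde", depend=["build-x"]),
    "test-amarok": JobuildMeta(
        "test-amarok", depend=["build-kde"], sidepend=["test-gnome2-eclass"]
    ),
}
closure = same_chroot_closure("test-amarok", tree)
```

Only DEPEND is followed here, since SIDEPEND jobuilds may run elsewhere and so do not have to share the chroot.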
66
67	SRC_URI will be downloaded before entering the chroot, stored in a
68	tarballs folder, and hardlinked (if on the same device), or bind
69	mounted inside the chroot.
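
The "hardlink if on the same device, otherwise bind mount" decision could be sketched as follows. The function names and the example file name are hypothetical, and the bind mount itself is deliberately left to the caller because it needs root privileges.

```python
import os
import tempfile


def same_device(path_a: str, path_b: str) -> bool:
    """True when both paths live on the same filesystem device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def expose_in_chroot(tarball: str, chroot_dir: str) -> str:
    """Make `tarball` visible under `chroot_dir`.

    Returns "hardlink" when a hard link was created, or "bind-mount" when the
    caller has to bind mount instead (not performed here).
    """
    target = os.path.join(chroot_dir, os.path.basename(tarball))
    if same_device(tarball, chroot_dir):
        os.link(tarball, target)  # cheap: no data is copied
        return "hardlink"
    return "bind-mount"


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as work:
    tarballs = os.path.join(work, "tarballs")
    chroot = os.path.join(work, "chroot")
    os.makedirs(tarballs)
    os.makedirs(chroot)
    src = os.path.join(tarballs, "testsuite.tar.bz2")
    with open(src, "wb") as fh:
        fh.write(b"dummy")
    method = expose_in_chroot(src, chroot)
    linked_ok = method == "hardlink" and os.path.samefile(
        src, os.path.join(chroot, "testsuite.tar.bz2")
    )
```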
70
71	To counter the problem of recursive QA checking, the jobuild format
72	will be extremely simple. That means no EAPI, no eclasses, no SLOTS,
73	minimal versioning (xxx.yyy), no fancy depends (except perhaps ||).
74	Built-in functions such as unpack() etc will be provided of course.
75
76	The loss of utility from there not being eclasses will be offset
77	through the concept of "Template jobuilds" (similar in concept to how
78	Django handles Templates[1]). However, I am open to including eclasses
79	in the design (who doesn't love them? :) if enough reasons can be
80	given.
81
82	NOTE: It will be highly recommended that the autotua work folder be on
83	the same device. I've assumed this to be true to allow a number of
84	optimisations, but I will keep (slower) fallbacks in case that is not
85	true.
86
87
88	The Tree:
89	-----------
90	Obviously the jobuilds will be stored in a structured format similar
91	to the portage tree :^)
92	And following the tradition of being completely unimaginative, it
93	shall be called the "Jobtage tree".
94	The structure is as follows:
95
96	${user}/
97	${user}.asc
98	${jobuild_name}/
99	${jobuild_name}-${ver}.jobuild
100	Manifest
101
102	- The tree will be stored in bzr, with an overlays/ directory in .bzrignore.
103	- Jobuilds will not be Manifested, and will only be signed with the
104	maintainer's GPG key.
105	- SRC_URI and PORTCONF_URI will be Manifested (probably the same way as in portage).
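
Given the layout above, a path inside the tree could be split into its parts roughly like this. Two things are assumptions rather than part of the plan: that a jobuild file sits at `${user}/${jobuild_name}/${jobuild_name}-${ver}.jobuild`, and that "xxx.yyy" means two dot-separated integers. The names `JobuildPath` and `parse_jobuild_path` are hypothetical.

```python
import re
from pathlib import PurePosixPath
from typing import NamedTuple

# Assumption: "xxx.yyy" means two dot-separated integers.
_FILE_RE = re.compile(r"^(?P<name>.+)-(?P<ver>\d+\.\d+)\.jobuild$")


class JobuildPath(NamedTuple):
    user: str
    name: str
    version: tuple[int, int]


def parse_jobuild_path(path: str) -> JobuildPath:
    """Split '<user>/<jobuild_name>/<jobuild_name>-<ver>.jobuild' into parts."""
    parts = PurePosixPath(path).parts
    if len(parts) != 3:
        raise ValueError(f"expected user/name/file, got {path!r}")
    user, directory, filename = parts
    match = _FILE_RE.match(filename)
    # The file name must repeat the directory name, as in the portage tree.
    if match is None or match["name"] != directory:
        raise ValueError(f"malformed jobuild file name: {filename!r}")
    major, minor = match["ver"].split(".")
    return JobuildPath(user, directory, (int(major), int(minor)))


parsed = parse_jobuild_path("someuser/build-kde/build-kde-4.1.jobuild")
```

Keeping the version as a tuple of integers makes comparisons numeric, so 4.10 sorts after 4.9.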
106
107	To further offset the problem of QA in this tree (mentioned in
108	"Jobuilds" above), when Jobs are created/committed/uploaded on the
109	server (the details of that are in the next section), the whole
110	depgraph is validated, details about that stored as metadata, and the
111	Job itself is attached to that specific revision of the Jobtage
112	tree. This prevents breakages due to future changes made to the
113	jobuilds it depends on. If the maintainer wishes to update the
114	attached revision (say, for a bugfix in a jobuild it depends on), he can
115	force a re-validation at any time before a Job is accepted by a Slave.
116	Whenever a Slave accepts a Job, it syncs with the revision of the tree
117	it's attached to.
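
One way the "validate the depgraph, then attach the Job to a tree revision" step might look is sketched below. It is a plain depth-first walk that rejects missing jobuilds and dependency cycles. The function names, the dictionary layout, and the revision number 42 are all hypothetical, not the server's actual design.

```python
def validate_depgraph(root: str, depends: dict[str, list[str]]) -> list[str]:
    """Return jobuilds in a valid run order, or raise ValueError.

    `depends` maps a jobuild name to the names it depends on.
    """
    order: list[str] = []
    state: dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str, trail: tuple[str, ...]) -> None:
        if name not in depends:
            raise ValueError(f"missing jobuild: {name}")
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("dependency cycle: " + " -> ".join(trail + (name,)))
        state[name] = "visiting"
        for dep in depends[name]:
            visit(dep, trail + (name,))
        state[name] = "done"
        order.append(name)  # dependencies are appended before their dependents

    visit(root, ())
    return order


def pin_job(root: str, depends: dict[str, list[str]], tree_revision: int) -> dict:
    """Validate once, then record the tree revision the Job is attached to."""
    return {
        "root": root,
        "revision": tree_revision,
        "order": validate_depgraph(root, depends),
    }


example_depends = {
    "test-amarok": ["build-kde"],
    "build-kde": ["build-x"],
    "build-x": [],
}
job = pin_job("test-amarok", example_depends, tree_revision=42)
```

Because the result is stored together with the revision, later commits to the tree do not change what the Job will run.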
118
119	The other solution to this problem could've been to trigger a reverse
120	depgraph validation whenever a commit was made to the tree. The
121	problems with that approach are:
122	- Load on the server increases exponentially with the number of jobuilds
123	- Raises the question of what the next action should be -- revert the
124	(potentially critical) commit or mark (potentially hundreds of)
125	jobuilds as broken?
126	- Makes Jobs fragile -- a job might be fine when you upload it, but
127	horribly broken 4 hours later.
128
129
130	Slaves:
131	--------
132	The slave can pull a list of Jobs that it can do from the master
133	server. A Job will consist of metadata about it:
134	http://pastebin.osuosl.org/8358 . The actual data is then gathered
135	from the jobuild(s), the chroot is prepared, etc etc and work begins.
136	Slave reports back to the master server after every jobuild is
137	complete with data and receives updates (if any) about the Job
138	(updates might consist of changing depends due to SIDEPENDs).
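
The "run a jobuild, report, receive updates" cycle described above might be sketched like this. `FakeMaster` is a stand-in that records reports in memory instead of talking to a server. Returning extra jobuild names from `report()` is an assumed, simplified form of the "updates", not the real protocol.

```python
from typing import Callable


class FakeMaster:
    """Stand-in for the master server; records reports instead of using a network."""

    def __init__(self) -> None:
        self.reports: list[tuple[str, bool]] = []

    def report(self, jobuild: str, ok: bool) -> list[str]:
        """Store the result and return extra jobuilds to run (none in this stub)."""
        self.reports.append((jobuild, ok))
        return []


def run_job(order: list[str], run_one: Callable[[str], bool], master: FakeMaster) -> bool:
    """Run jobuilds in order, reporting to the master after each one."""
    pending = list(order)
    while pending:
        jobuild = pending.pop(0)
        ok = run_one(jobuild)
        extra = master.report(jobuild, ok)
        if not ok:
            return False  # stop at the first failed jobuild
        pending.extend(extra)  # hypothetical: updates may add further work
    return True


master = FakeMaster()
finished = run_job(["build-x", "build-kde"], lambda name: True, master)
```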
139
140	Obviously the Slave has to parse jobuilds. And so the concepts should
141	be similar to Portage. However, I am drawing inspiration from the
142	pkgcore[2] codebase, simplifying the extremely versatile code to suit
143	my needs (which is another reason for my slow progress -- it's not
144	easy to understand a work of art ;)
145
146
147	Actual Progress aka "No more hand-waving":
148	------------------------------------------------------
149	Now follows my real progress w.r.t the code.
150
151	I'm currently working on the slave, and am concentrating on the things
152	that don't depend on parsing the jobuilds (I have a general
153	idea of how it's done, but haven't fleshed out the details). Currently I've
154	implemented an OO interface (in Python of course) to a Job() object
155	accessed via Jobs(), a Syncer() object (jobtage), a Fetchable object
156	and a Fetcher (stage3 etc). Total code comes out to 167+70+38+30 =
157	~300 lines ;p
158
159	This week I'll start on chroot preparation and iron out the kinks in
160	that, followed by the Jobuild() object, the jobuild parser
161	(jobuild.sh), and the bridge connecting them. The #pkgcore guys are
162	really helpful and nice so I'll have good help for this part :)
163
164	Next week (end of the month) will (hopefully) see a working slave
165	which accepts Jobs from some magical source and runs them.
166
167	I'll begin work on the Master server the week after that, specifically
168	the backend work and the details of the communication between the
169	Master and Slaves. Frontend prettification will take place towards the
170	end.
171
172
173	1. http://www.djangobook.com/en/1.0/chapter04/ -- not the exact
174	format, only the idea of "Reverse Inheritance"
175	2. http://www.pkgcore.org/
176
177	PS: Another reason why progress is slow is because the Slave portion
178	has become much more sophisticated than what I had originally
179	intended. The original idea had (maintainer-made) executables doing
180	all the work (causing a steep learning curve) with the Slave just
181	being an API wrapper to talk to the master server. All of that work is
182	now shifted into the Slave and abstracted for the maintainer to use in
183	a familiar way.
184
185	--
186	~Nirbheek Chauhan
187	--
188	gentoo-soc@l.g.o mailing list
