It's only been a little more than a week since I started working on
the project (due to personal reasons), and my ratio of time spent to
work done is extremely bad, so I'm sorry, but the progress isn't as
much as I had hoped.

The idea has undergone significant changes in the time that has
passed, and thanks to Patrick's guidance (and constant cluebats), I
now have a far more clear-cut idea of how the whole thing will come
together. I wonder whether I should describe the project blueprint
that we've come up with, the path that led to it, or all the code I
have written. I suppose the progress of the code cannot be judged
unless one knows the whole plan, and the path taken to come up with
the plan is largely irrelevant :)

The general idea has changed somewhat from the abstract:

As before, there is a master server which acts as a storage area,
manages all the slaves and does various bookkeeping. This part will
be written in Django.

The concept of the slave has changed radically to allow for a less
steep learning curve. The original proposal described "jobs" which
consisted of executables stored on the master server that could be
fetched and run by the slaves. We thought of ways in which we could
describe dependencies between the slaves, and the most obvious answer
to me was an XML format (much to the disgust of Patrick).

However, there were numerous problems with such an approach (the
least of which was the overhead involved in parsing XML and the jing
deps on the Django side for it). The most serious of these was that
learning a new XML format and writing custom executables (scripts or
otherwise) which communicate with the server via the Slave's bindings
involves an *extremely* steep learning curve, and would cause chaos.
The project is useless if no one ends up using it, or if it gets too
complicated to use.

The solution came to me in the form of a "Doh!" moment as I was
cycling back to my room. The answer was -- "jobuilds". Bash scripts
are easily adaptable, easy to understand and use (for Gentoo devs),
and their parsing is well-understood. For the second time in my life,
I appreciated the ingenuity of the inventors of the ebuild format.


Jobuilds:
----------
A jobuild is the smallest possible "quantum" of work. A job consists
of a root jobuild which has dependencies on other jobuilds; all of
these taken together form the job. The format of a jobuild is:
http://pastebin.osuosl.org/8355

- The four phases are to be run (by default) in the chroot where the
  job will take place.
- SRC_URI lists programs (test suites etc.) which are required by the
  jobuild (this does not include the deps which will be pulled in by
  emerge inside the chroot).
- PORTCONF_URI lists tarballs which contain portage config files
  (/etc/portage/*, /etc/make.conf, etc.)
- DEPEND lists other jobuilds on which this jobuild _hard_ depends,
  i.e. they must be completed in the same chroot (example: Test
  Amarok depends on Build KDE, which depends on Build X).
- SIDEPEND lists SuperImpose Depends; all we need to know is that
  those jobuilds completed successfully *somewhere*, so that further
  distribution of work is possible (example: testing whether all the
  packages that import gnome2.eclass still work after some changes to
  it).
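
To make that a little more concrete, here is a rough sketch of what a
jobuild might look like. The real format is in the pastebin link
above; the phase names and values below are only illustrative:

    # test-amarok-1.0.jobuild -- illustrative sketch, not the final format
    SRC_URI="http://example.org/distfiles/amarok-testsuite-1.0.tar.bz2"
    PORTCONF_URI="http://example.org/portconf/kde-testing-portconf.tar.bz2"
    DEPEND="build-kde"
    SIDEPEND=""

    # the four phases (names are placeholders), run inside the chroot
    jobuild_unpack() { unpack ${A}; }
    jobuild_setup()  { :; }    # e.g. tweak the testsuite's config
    jobuild_run()    { cd amarok-testsuite-1.0 && ./run-tests.sh; }
    jobuild_report() { :; }    # collect logs for the master server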

SRC_URI will be downloaded before entering the chroot, stored in a
tarballs folder, and either hardlinked (if on the same device) or
bind mounted inside the chroot.
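
Something along these lines is what I have in mind for getting those
tarballs into the chroot (just a sketch; the function and path names
are made up):

    import errno, os, subprocess

    def expose_tarball(tarball, chroot):
        # Make a downloaded SRC_URI tarball visible inside the chroot:
        # hardlink when the tarballs folder and the chroot live on the
        # same device, fall back to a bind mount otherwise.
        dest = os.path.join(chroot, "tmp", "tarballs",
                            os.path.basename(tarball))
        if not os.path.isdir(os.path.dirname(dest)):
            os.makedirs(os.path.dirname(dest))
        try:
            os.link(tarball, dest)               # fast path: same device
        except OSError as e:
            if e.errno != errno.EXDEV:           # not a cross-device error
                raise
            open(dest, "w").close()              # bind mount needs a target file
            subprocess.check_call(["mount", "--bind", tarball, dest])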

To counter the problem of recursive QA checking, the jobuild format
will be *extremely* simple. That means no EAPI, no eclasses, no
SLOTs, minimal versioning (xxx.yyy), and no fancy depends (except
perhaps ||). Built-in functions such as unpack() etc. will of course
be provided.

The loss of utility from not having eclasses will be offset through
the concept of "Template jobuilds" (similar in concept to how Django
handles Templates[1]). However, I am open to including eclasses in
the design (who doesn't love them? :) if enough reasons can be given.
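
Nothing about this is designed yet, but the rough idea (in completely
hypothetical syntax) is Django-style "reverse inheritance": the
template provides the skeleton, and a jobuild only fills in the
blocks it cares about:

    # kde-test template (hypothetical)
    jobuild_run() {
        prepare_kde_session
        block run_tests      # overridden by the jobuild using this template
        collect_kde_logs
    }

    # in a jobuild using the template (hypothetical)
    TEMPLATE="kde-test"
    block_run_tests() { ./amarok-testsuite/run.sh; }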

NOTE: It will be highly recommended that the autotua work folder be
on the same device. I've assumed this to be true to allow a number of
optimisations, but I will keep (slower) fallbacks in case that is not
true.


The Tree:
-----------
Obviously the jobuilds will be stored in a structured format similar
to the portage tree :^)
And following the tradition of being completely unimaginative, it
shall be called the "Jobtage tree".
The structure is as follows:

${user}/
    ${user}.asc
    ${jobuild_name}/
        ${jobuild_name}-${ver}.jobuild
        Manifest
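
So a tree containing a single maintainer with a single jobuild might
look like this (names made up):

    nirbheek/
        nirbheek.asc
        test-amarok/
            test-amarok-1.0.jobuild
            Manifest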

The tree will be stored in bzr, with an overlays/ directory in
.bzrignore. Jobuilds will not be manifested, and will only be signed
with the maintainer's gpg key. SRC_URI and PORTCONF_URI will be
Manifested (probably the same way as in portage).

To further offset the problem of QA in this tree (mentioned in
"Jobuilds" above), when Jobs are created/committed/uploaded on the
server (the details of that are in the next section), the whole
depgraph is validated, details about that are stored as metadata, and
the Job itself is attached to **that specific revision** of the
Jobtage tree. This prevents breakages due to future changes made to
the jobuilds it depends on. If the maintainer wishes to update the
attached revision (for, say, a bugfix in a jobuild it depends on), he
can force a re-validation at any time before a Job is accepted by a
Slave. Whenever a Slave accepts a Job, it syncs with the revision of
the tree it's attached to.
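
On the Django side, I imagine the Job <-> tree-revision attachment
looking roughly like this (model and field names are only a sketch,
nothing here is final):

    from django.db import models

    class Job(models.Model):
        maintainer = models.CharField(max_length=64)
        root_jobuild = models.CharField(max_length=128)
        # the Jobtage tree revision this Job's depgraph was validated against
        jobtage_revision = models.CharField(max_length=64)
        validated_on = models.DateTimeField(null=True)

        def revalidate(self, revision):
            # re-run the depgraph validation against a newer tree revision;
            # only allowed while no Slave has accepted this Job yet
            raise NotImplementedError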

The other solution to this problem could've been to trigger a reverse
depgraph validation whenever a commit was made to the tree. The
problems with that approach are:
- Load on the server increases exponentially with the number of
  jobuilds.
- It raises the question of what the next action should be -- revert
  the (potentially critical) commit, or mark (potentially hundreds
  of) jobuilds as broken?
- It makes Jobs fragile -- a Job might be fine when you upload it,
  but horribly broken 4 hours later.


Slaves:
--------
The slave can pull a list of Jobs that it can do from the master
server. A Job will consist of metadata about it:
http://pastebin.osuosl.org/8358 . The actual data is then gathered
from the jobuild(s), the chroot is prepared, etc. etc., and work
begins. The Slave reports back to the master server after every
jobuild is completed, with data, and receives updates (if any) about
the Job (updates might consist of changed depends due to SIDEPENDs).
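
In other words, the Slave's main loop will go something like this
(every name below is a placeholder, not the real API):

    def prepare_chroot(job):
        "Unpack a stage3, apply the PORTCONF_URI configs, expose SRC_URI."

    def run_jobuild(chroot, jobuild):
        "Run the jobuild's phases inside the chroot and collect results."

    def slave_loop(master):
        for job in master.available_jobs():     # Jobs this Slave can do
            job.accept()                        # syncs the attached jobtage revision
            chroot = prepare_chroot(job)
            for jobuild in job.jobuild_queue(): # depgraph order
                result = run_jobuild(chroot, jobuild)
                updates = master.report(job, jobuild, result)
                job.apply_updates(updates)      # e.g. changed deps due to SIDEPENDs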

Obviously the Slave has to parse jobuilds, and so the concepts should
be similar to Portage's. However, I am drawing inspiration from the
pkgcore[2] codebase, simplifying its extremely versatile code to suit
my needs (which is another reason for my slow progress -- it's not
easy to understand a work of art ;)


Actual Progress aka "No more hand-waving":
------------------------------------------
Now follows my *real* progress w.r.t. the code.

I'm currently working on the slave, and am concentrating on the
things that don't depend on parsing the jobuilds (I have a general
idea of how that will be done, but haven't fleshed out the details).
Currently I've implemented an OO interface (in Python, of course): a
Job() object accessed via Jobs(), a Syncer() object (jobtage), a
Fetchable object and a Fetcher (stage3 etc.). Total code comes out to
167+70+38+30 = ~300 lines ;p
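
To give an idea of the shape of that interface (heavily simplified,
not the actual code):

    class Fetchable(object):
        "Something that can be downloaded: a stage3, a SRC_URI tarball, ..."
        def __init__(self, uri):
            self.uri = uri

    class Fetcher(object):
        "Downloads Fetchables into the local storage area."
        def fetch(self, fetchable):
            pass

    class Syncer(object):
        "Syncs the jobtage tree checkout (bzr)."
        def sync(self, revision=None):
            pass

    class Jobs(object):
        "The list of Jobs offered to this slave; yields Job objects."

    class Job(object):
        "One Job: its metadata, its jobtage revision, its chroot, its jobuilds."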

This week I'll start on chroot preparation and iron out the kinks in
that, followed by the Jobuild() object, the jobuild parser
(jobuild.sh), and the bridge connecting them. The #pkgcore guys are
really helpful and nice, so I'll have good help for this part :)
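
The idea for the bridge is the same basic trick portage and pkgcore
use for ebuilds: source the jobuild in bash and have it print
whatever Python asks for. A minimal sketch (nowhere near the real
implementation):

    import subprocess

    def jobuild_vars(path, keys=("DEPEND", "SIDEPEND", "SRC_URI", "PORTCONF_URI")):
        # Source the jobuild and echo each requested variable on its own
        # line (assumes single-line values, which is fine for a sketch).
        script = 'source "$1" || exit 1; shift; for k in "$@"; do echo "${!k}"; done'
        out = subprocess.check_output(
            ["bash", "-c", script, "jobuild.sh", path] + list(keys))
        return dict(zip(keys, out.decode().splitlines()))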

Next week (end of the month) will (hopefully) see a working slave
which accepts Jobs from some magical source and runs them.

I'll begin work on the Master server the week after that,
specifically the backend work and the details of the communication
between the Master and the Slaves. Frontend prettification will take
place towards the end.


1. http://www.djangobook.com/en/1.0/chapter04/ -- not the exact
   format, only the idea of "Reverse Inheritance"
2. http://www.pkgcore.org/

PS: Another reason why progress is slow is that the Slave portion has
become much more sophisticated than what I had originally intended.
The original idea had (maintainer-made) executables doing all the
work (causing a steep learning curve), with the Slave just being an
API wrapper to talk to the master server. All of that work has now
been shifted into the Slave and abstracted for the maintainer to use
in a familiar way.

--
~Nirbheek Chauhan