Hi guys, I've been browsing the gentoo-dev archives and I don't think this
idea has been suggested before, but apologies if I missed it.
Any feedback, or flames, appreciated :)
Basically, a modified distcc package is installed by users wishing to
participate in an internet-wide distcc network. Users run the modified
distccd, which contacts a central server every 10 minutes while running,
reporting its GCC version, glibc version and architecture.
Other information, e.g. geographic information based on timezone, could
be sent, so that geographically closer users are favoured when selecting
hosts. (This would improve build times, instead of giving a British user
10 hosts on 56k modems in Australia.)
The central server maintains a list of clients that are ready to
accept work. If no message is received from a client for 10 minutes, it
is assumed the client is no longer ready to accept work and it is removed
from the pool. The server software could be purpose-made, or a PHP/Perl
script and a MySQL database.
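
A rough sketch of the pool bookkeeping described above, in Python rather
than PHP/Perl. This is only an illustration under assumptions: the names
Pool, heartbeat, expire and ready are made up here, and an in-memory dict
stands in for the database.

```python
import time

HEARTBEAT_TIMEOUT = 10 * 60  # seconds; the 10-minute window from the proposal


class Pool:
    """In-memory stand-in for the central server's list of ready clients."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT, clock=time.time):
        self.timeout = timeout
        self.clock = clock  # injectable so the expiry logic can be tested
        self.clients = {}   # address -> (last_seen, gcc, glibc, arch)

    def heartbeat(self, address, gcc, glibc, arch):
        """Record (or refresh) a client's check-in."""
        self.clients[address] = (self.clock(), gcc, glibc, arch)

    def expire(self):
        """Drop every client that has not checked in within the timeout."""
        cutoff = self.clock() - self.timeout
        self.clients = {a: c for a, c in self.clients.items() if c[0] >= cutoff}

    def ready(self):
        """Return the addresses of clients still considered ready."""
        self.expire()
        return sorted(self.clients)
```

A real server would run the same expiry check against its database
before answering each request for hosts.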
When a user wishes to compile something from portage using the
distributed network, they use their normal emerge command, prefixed
with `dportage`, e.g.:
$ `dportage` emerge gnome
The dportage command contacts the central server and fetches a list
of 10 compatible machines that are ready to accept work, then prints a
line containing the environment variables needed to use the network,
similar to how ssh-agent works. Example output (this could be configurable):
$ dportage
DISTCC_HOSTS="123.123.123.123 123.123.123...." \
FEATURES="distcc" MAKEOPTS="-j12"
If the user isn't listed in the pool of users ready to accept work,
they are not permitted to use the network, and dportage prints
nothing. This prevents leeching.
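
The lookup and the leech check could be sketched like this in Python.
It is a guess at the logic, not a spec: select_hosts and env_line are
hypothetical names, "compatible" is assumed to mean an identical GCC
version, glibc version and architecture, and the -j value is passed in
by the caller rather than derived from the host count.

```python
def select_hosts(pool, requester, limit=10):
    """Return up to `limit` compatible hosts, or [] if the requester is not a donor.

    `pool` maps address -> (gcc, glibc, arch) for clients currently ready.
    """
    if requester not in pool:
        return []  # not contributing, so not allowed to use the network
    wanted = pool[requester]
    return [addr for addr, spec in pool.items()
            if addr != requester and spec == wanted][:limit]


def env_line(hosts, jobs):
    """Format the line dportage would print; empty string when there are no hosts."""
    if not hosts:
        return ""
    return 'DISTCC_HOSTS="{}" FEATURES="distcc" MAKEOPTS="-j{}"'.format(
        " ".join(hosts), jobs)
```

A user who is not in the pool gets an empty host list, so env_line
returns an empty string and nothing is printed.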
The obvious issue is security: what stops a client returning work
that is trojaned, and might allow compromising the host?
The modified distcc client sends the work to two random DISTCC_HOSTS,
and the returned work must match for it to be accepted. If it doesn't
match, the distcc client can resend the work to different hosts
and possibly submit the mismatching hosts to a blacklist on the
central server, which will remove them from the pool of clients.
That could be easy to abuse, so a client must have been reported
to the blacklist by two different users for it to be removed.
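
A minimal Python sketch of the matching and blacklist rules above.
The names results_match and Blacklist are invented for illustration,
and comparing digests assumes that two honest hosts with the same
compiler produce byte-identical output for the same input.

```python
import hashlib


def results_match(object_a, object_b):
    """Compare two returned object files (as bytes) by SHA-256 digest."""
    return hashlib.sha256(object_a).digest() == hashlib.sha256(object_b).digest()


class Blacklist:
    """Remove a host only once enough *different* users have reported it."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.reports = {}  # host -> set of users who reported it

    def report(self, host, reporter):
        # A set ignores repeat reports from the same user.
        self.reports.setdefault(host, set()).add(reporter)

    def is_banned(self, host):
        return len(self.reports.get(host, ())) >= self.threshold
```

Storing the reporters in a set is what keeps a single user from banning
a host by reporting it twice.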
Benefits
* Users who don't have a LAN get to benefit from distcc.
* Faster compiles for all users.
* Users can run the daemon niced, so as not to interfere with their work.
* Easy to prevent leeching.
* Fewer complaints from users on slow machines who want binary
packages.
I've already run some tests using distcc over the internet. It is
obviously slower than running it over a nice LAN, but it is still fast,
and I'm guessing that a 10-host DISTCC_HOSTS would make up for this. The
client can obviously be tweaked to time out quicker, and maybe change the
order of DISTCC_HOSTS so as to favour the faster-responding
clients, etc.
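
The reordering tweak might look something like this in Python. It is
only a sketch: order_hosts is a hypothetical helper, the 2-second
timeout is an arbitrary placeholder, and how the response times get
measured is left open.

```python
def order_hosts(latencies, timeout=2.0):
    """Return hosts sorted fastest first, dropping any slower than `timeout`.

    `latencies` maps host -> measured response time in seconds, or None
    if the host never answered.
    """
    alive = {h: t for h, t in latencies.items() if t is not None and t <= timeout}
    return sorted(alive, key=alive.get)
```

The result can be joined with spaces to build the DISTCC_HOSTS value.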
In case you aren't familiar with it: http://distcc.samba.org/
--
-------------------------------------
taviso@... | finger me for my gpg key.
-------------------------------------------------------
--
gentoo-dev@g.o mailing list