Gentoo Archives: gentoo-dev

From: taviso <taviso@××××××××××××.org>
To: gentoo-dev@g.o
Subject: [gentoo-dev] opt-in distributed portage network
Date: Mon, 14 Apr 2003 10:47:20
Hi guys, I've been browsing the gentoo-dev archives and I don't think this
idea has been suggested before, but apologies if I missed it.

Any feedback, or flames, appreciated :)

Basically, a modified distcc package is installed by users wishing to
participate in an internet-wide distcc network. Users run the modified
distccd, which contacts a central server every 10 minutes while running,
reporting its GCC version, glibc version and architecture.
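
As a rough sketch of what that periodic message might contain: the JSON
format, the field names and the build_heartbeat helper below are all my
own assumptions, nothing here is an existing distcc feature.

```python
import json
import platform

HEARTBEAT_INTERVAL = 10 * 60  # seconds, i.e. the "every 10 minutes" above


def build_heartbeat(gcc_version, glibc_version, arch=None):
    """Return the (hypothetical) JSON message a volunteer sends to the server."""
    return json.dumps(
        {
            "gcc": gcc_version,
            "glibc": glibc_version,
            # Fall back to whatever the local machine reports.
            "arch": arch or platform.machine(),
        },
        sort_keys=True,
    )
```

The daemon would send build_heartbeat(...) once per HEARTBEAT_INTERVAL for
as long as it is willing to accept work.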

Other information, e.g. a rough geographic location based on timezone, could
be sent, so that geographically closer users are favoured when selecting
hosts. (This would improve build times, instead of giving a British user
10 hosts on a 56k modem in Australia.)
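
One way the timezone idea could work, as a sketch only: treat the
difference in UTC offset as a crude distance and sort on it. The function
names and the (address, utc_offset) tuples are made up for illustration,
and a timezone says nothing about north/south distance or link speed.

```python
def tz_distance(a, b):
    """Hours between two UTC offsets, wrapping around the date line."""
    d = abs(a - b) % 24
    return min(d, 24 - d)


def closest_hosts(my_offset, hosts, limit=10):
    """hosts is a list of (address, utc_offset_hours); nearest offsets first."""
    ranked = sorted(hosts, key=lambda h: tz_distance(my_offset, h[1]))
    return [addr for addr, _ in ranked[:limit]]
```

For a user at UTC+0, a host at UTC+1 would be preferred over one at UTC+10.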

The central server maintains a list of clients that are ready to
accept work. If no message is received from a client for 10 minutes, it is
assumed to be no longer ready to accept work and is removed from the
pool. The server software could be purpose-made, or a PHP/Perl script
and a MySQL database.
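
The pool bookkeeping could be as simple as the following sketch. It keeps
everything in memory rather than in a database, and the Pool class and its
method names are hypothetical.

```python
import time

TIMEOUT = 10 * 60  # seconds without a message before a host is dropped


class Pool:
    def __init__(self, now=time.time):
        self._now = now        # injectable clock, handy for testing
        self._last_seen = {}   # host -> (timestamp, info)

    def heartbeat(self, host, info):
        """Record that `host` has just said it is ready to accept work."""
        self._last_seen[host] = (self._now(), info)

    def ready_hosts(self):
        """Drop hosts that have been silent too long, return the rest."""
        cutoff = self._now() - TIMEOUT
        self._last_seen = {
            h: v for h, v in self._last_seen.items() if v[0] >= cutoff
        }
        return sorted(self._last_seen)
```

A host that stops sending messages simply ages out the next time the list
is requested, so no explicit "goodbye" message is needed.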

When a user wishes to compile something from portage using the
distributed network, they use their normal emerge command, prefixed
with `dportage`, e.g.:

	$ `dportage` emerge gnome

The dportage command contacts the remote server and fetches a list
of 10 compatible machines ready to accept work, then prints a line
containing the environment variables needed to use the network, similar
to how ssh-agent works. Example output (this could be configurable):

	$ dportage
	DISTCC_HOSTS=" 123.123.123...." \
	FEATURES="distcc" MAKEOPTS="-j12"

If the user isn't listed in the pool of users ready to accept work,
they are not permitted to use the network, and dportage prints
nothing. This prevents leeching.
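
The server-side logic behind dportage might look roughly like this sketch.
It assumes "compatible" means an exact match on (gcc, glibc, arch), which
is my guess at a safe rule, and dportage_line and the jobs parameter are
hypothetical names.

```python
def dportage_line(requester, pool, limit=10, jobs=12):
    """pool maps address -> (gcc, glibc, arch).

    Returns the environment line, or "" if the requester is not volunteering.
    """
    if requester not in pool:
        return ""  # not in the pool, so no hosts are handed out
    mine = pool[requester]
    hosts = [
        addr
        for addr, info in sorted(pool.items())
        if addr != requester and info == mine  # assumed compatibility rule
    ][:limit]
    return 'DISTCC_HOSTS="%s" FEATURES="distcc" MAKEOPTS="-j%d"' % (
        " ".join(hosts),
        jobs,
    )
```

Printing an empty string for non-members keeps the wrapper harmless: the
emerge simply runs locally with no extra variables set.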

The obvious issue is security: what stops a client from returning work
that has been trojaned, which might allow the host to be compromised?

The modified distcc client sends the work to two random hosts from
DISTCC_HOSTS, and the returned work must match for it to be accepted. If
it doesn't match, the distcc client can resend the work to different hosts
and possibly submit the mismatching hosts to a blacklist on the
central server, which will remove them from the pool of clients.
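
A sketch of that cross-check follows. compile_on(host, source) is a
hypothetical stand-in for the real distcc round trip, and verified_compile
is an invented name. Note that a mismatch alone does not say which of the
two hosts is at fault, so both end up as suspects here.

```python
import random


def verified_compile(source, hosts, compile_on, rng=random):
    """Send `source` to two random hosts and accept only matching results.

    Returns (object_bytes, suspects), where suspects is the set of hosts
    that took part in a mismatching pair.
    """
    suspects = set()
    while True:
        candidates = [h for h in hosts if h not in suspects]
        if len(candidates) < 2:
            raise RuntimeError("not enough untried hosts left to cross-check")
        a, b = rng.sample(candidates, 2)
        out_a, out_b = compile_on(a, source), compile_on(b, source)
        if out_a == out_b:
            return out_a, suspects
        suspects.update((a, b))  # retry with different hosts
```

This only helps if the two hosts are not cooperating, and it relies on
identical compilers producing byte-identical output for the same input.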

That could be easy to abuse, so a client must have been reported
to the blacklist by two different users before it is removed.
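
The two-reporter rule is simple to sketch. The Blacklist class below is
hypothetical, and it assumes the server can tell reporters apart, which
the proposal does not specify.

```python
from collections import defaultdict

REPORTS_NEEDED = 2  # distinct users who must report a host


class Blacklist:
    def __init__(self):
        self._reports = defaultdict(set)  # host -> users who reported it

    def report(self, reporter, host):
        """Record a report; return True once the host should be removed."""
        self._reports[host].add(reporter)  # a set ignores repeat reports
        return len(self._reports[host]) >= REPORTS_NEEDED
```

Because reporters are kept in a set, one user reporting the same host over
and over never counts as more than one report.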

Advantages:

* Users who don't have a LAN get to benefit from distcc
* Faster compiles for all users
* Users can run the daemon nice, so as not to interfere with their work
* Easy to prevent leeching
* Fewer complaints from users on slow machines who want binary packages

I've already run some tests using distcc over the internet. It is
obviously slower than running it over a nice LAN, but it is still fast,
and I'm guessing that a 10-host DISTCC_HOSTS would make up for the
difference. The client can obviously be tweaked to time out quicker, and
maybe change the order of DISTCC_HOSTS so as to favour the
faster-responding hosts.
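
Reordering by response time could be sketched like this. The latency
figures would have to come from some probe that is not shown, and
order_by_latency and the 2-second cutoff are assumptions of mine.

```python
def order_by_latency(latencies, timeout=2.0):
    """latencies maps host -> measured seconds, or None if it never answered.

    Returns a DISTCC_HOSTS-style string, fastest host first, with
    unresponsive and too-slow hosts left out.
    """
    alive = {
        h: t for h, t in latencies.items() if t is not None and t <= timeout
    }
    return " ".join(sorted(alive, key=alive.get))
```

Putting the fastest hosts first should mean the slow or dead ones are only
tried when nothing better is available.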

In case you aren't familiar with it,

taviso@××××××××××××.org | finger me for my gpg key.

gentoo-dev@g.o mailing list

