List Archive: gentoo-dev
To: Luke-Jr <luke-jr@...>
From: Ed Grimm <paranoid@...>
Subject: Re: A few modest suggestions regarding tree size
Date: Tue, 12 Oct 2004 22:17:10 -0500 (EST)
On Tue, 12 Oct 2004, Luke-Jr wrote:

> On Tuesday 12 October 2004 9:37 pm, Ciaran McCreesh wrote:
>> It has come to my attention that, during recent weeks, a small number of
>> users have been complaining recently about the size of the rsync tree.
>> My august colleagues have proposed many ingenious solutions, but
>> misfortunately they are all complicated and involve a lot of manual
>> work. I believe the following small changes (which can mostly be
>> automated) would prove of much larger benefit to the community for a
>> vastly reduced cost.
>
> The tree size is only a variable for users because they are required to have
> the entire tree on their computer. If Portage fetched these on-demand, it
> wouldn't be a problem. But that's already been discussed...

I'm naive.  Where/when?  I saw a bit of conversation last week.

My thoughts on this would be to have a RSYNC_INCLUDEFROM, which is used
for every sync except once every RSYNC_EVERYTHING_INTERVAL syncs, when
the whole tree is synced.  This would only be used if configured.  If
RSYNC_AUTOINCLUDE is set, every new package installed, whether by
explicit merge or a dependency merge, is also added to
RSYNC_INCLUDEFROM, and any package removed by unmerge or depclean is
removed from RSYNC_INCLUDEFROM.
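
A rough sketch of how the above might hang together, in Python since
that is what Portage is written in.  Everything here is an assumption
for illustration: the function names, the one-pattern-per-line format of
the RSYNC_INCLUDEFROM file, the sync counter, and the example source URL
are hypothetical, and none of it is existing Portage code.  It relies on
rsync's "dir/***" include pattern, which older rsync versions lack, and a
real include list would also need entries for shared directories such as
eclass/ and profiles/.

```python
import os

def should_sync_everything(sync_count, everything_interval):
    """True when this sync (1-based count) should fetch the whole tree."""
    return everything_interval > 0 and sync_count % everything_interval == 0

def read_includes(path):
    """Read the include file into a set of rsync patterns."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

def write_includes(path, patterns):
    """Write the patterns back, one per line, in a stable order."""
    with open(path, "w") as f:
        for pattern in sorted(patterns):
            f.write(pattern + "\n")

def auto_include(path, atom):
    """RSYNC_AUTOINCLUDE on merge: add 'category/package' to the list."""
    category = atom.split("/", 1)[0]
    patterns = read_includes(path)
    # The category directory itself must be included, or rsync never
    # descends far enough to see the package directory.
    patterns.add("/%s/" % category)
    patterns.add("/%s/***" % atom)
    write_includes(path, patterns)

def auto_remove(path, atom):
    """RSYNC_AUTOINCLUDE on unmerge/depclean: drop the package again."""
    category = atom.split("/", 1)[0]
    prefix = "/%s/" % category
    patterns = read_includes(path)
    patterns.discard("/%s/***" % atom)
    # Drop the category entry once no package in it remains.
    if not any(p != prefix and p.startswith(prefix) for p in patterns):
        patterns.discard(prefix)
    write_includes(path, patterns)

def rsync_command(source, dest, include_from, sync_count, everything_interval):
    """Build the rsync argument list for this sync."""
    cmd = ["rsync", "--recursive", "--links", "--times"]
    partial = include_from and not should_sync_everything(
        sync_count, everything_interval)
    if partial:
        # Filter rules apply in order: listed packages first, then
        # exclude everything else.
        cmd += ["--include-from=" + include_from, "--exclude=*"]
    return cmd + [source, dest]
```

With an interval of 7 and one sync a day, rsync_command() would produce
a filtered sync six days out of seven and an unfiltered one on the
seventh.  Passing None for include_from corresponds to the feature not
being configured, in which case every sync is a full one.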

Note that I realize this is not full on-demand fetching; if an
application is not listed, it's not fetched, most of the time.  But it
does fetch only those applications demanded.  I also know that changing
dependencies can cause it major problems.  However, since my scheme does
have it doing occasional full syncs (for example, once a week, Saturday
evening), this problem should be reasonably mitigated - if ever a
program gains a dependency that has not yet existed for a week, you
don't want it running on a production server anyway.

This should satisfy those people inclined to run underpowered[1] production
systems with automated syncs - after all, on a production system, you
had better know all the packages you care about.  It also doesn't kill
the sync server with N connections.  I personally would code this feature
so that it would enforce a minimum of 24 hours between syncs (by
activating a separate feature, so that others could use it also); a real
enterprise server, where stability trumps latest-and-greatest, wouldn't
be updating itself that often[2].
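
The 24-hour minimum could be a small stand-alone check along these
lines.  Again a sketch only: the timestamp-file approach, the function
names and the MIN_SYNC_INTERVAL constant are my assumptions, not
anything Portage provides.

```python
import os

MIN_SYNC_INTERVAL = 24 * 60 * 60  # seconds; hypothetical default

def seconds_until_sync_allowed(last_sync, now, min_interval=MIN_SYNC_INTERVAL):
    """How long to wait before the next sync is permitted (0 = go ahead)."""
    if last_sync is None:
        # Never synced before, so nothing to wait for.
        return 0
    return max(0, int(last_sync + min_interval - now))

def read_last_sync(stamp_path):
    """Time of the last recorded sync, or None if there is no record."""
    try:
        return os.path.getmtime(stamp_path)
    except OSError:
        return None

def record_sync(stamp_path):
    """Touch the stamp file after a successful sync."""
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, None)
```

A caller would refuse to sync while
seconds_until_sync_allowed(read_last_sync(path), time.time()) is
non-zero, and call record_sync(path) once a sync completes.  Keeping it
separate from the include-list logic means anyone could switch it on,
whether or not they use partial syncs.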


While I am not a Python programmer, I could try my hand at a
proof-of-concept if the rest of the underpowered coalition is
interested but no interested Python programmers turn up.  Chances are
good anything I'd code would work, but would be rejected by any Python
coders as looking like some mad monkey took bits and pieces of code from
elsewhere in the program, put them together in a manner that somehow
worked, and then tried to make it purty; this is not due to my not being
a decent programmer, but rather to my only knowing that Python uses
indentation instead of curlies, and that it somehow lives without line
end characters.  (Well, ok, I know a bit more, but everything else I know
about it comes from looking at portage code.)

Ed

[1] Note to anyone who feels annoyed that I'm calling their production
systems underpowered: If your system has so few spare resources that an
emerge sync during an idle time takes 5 times longer than it would with
the server tasks turned off, it doesn't have enough spare resources to
handle a real peak either.  I understand you may not be able to afford
better.  But one should be realistic about what one is running.  Of
course, if your production system does an emerge sync in 15 seconds
during peak load, well, I wonder why you're reading this, as your
system's apparently ludicrously overpowered.  And I want some of your
bandwidth.

[2] I generally find security by design greatly reduces the
vulnerability to threats.  Not to mention, my inclination would be to
load the update on a test system, regression test it and fix test it if
possible, and then force install the binary package created when loading
it on the test box to all production machines.  If you have more than a
handful of machines, I'd think some software aid for this would be in
order.  Shouldn't be too hard; I can think of several models that should
work for low or medium security environments.  If you have a high
security environment, don't even try to tell me why you're running
automatic updates without testing from the Internet.  (After all, you
may trust gentoo, but they don't own all of their mirrors.)

--
gentoo-dev@g.o mailing list

References:
A few modest suggestions regarding tree size
-- Ciaran McCreesh
Re: A few modest suggestions regarding tree size
-- Luke-Jr