Gentoo Archives: gentoo-dev

From: Ed Grimm <paranoid@××××××××××××××××××××××.org>
To: Luke-Jr <luke-jr@×××××××.org>
Cc: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] A few modest suggestions regarding tree size
Date: Wed, 13 Oct 2004 03:19:24
Message-Id: Pine.LNX.4.58.0410122115240.21079@ybec.rq.iarg
In Reply to: Re: [gentoo-dev] A few modest suggestions regarding tree size by Luke-Jr
1 On Tue, 12 Oct 2004, Luke-Jr wrote:
2
3 > On Tuesday 12 October 2004 9:37 pm, Ciaran McCreesh wrote:
4 >> It has come to my attention that, during recent weeks, a small number of
5 >> users have been complaining recently about the size of the rsync tree.
6 >> My august colleagues have proposed many ingenious solutions, but
7 >> misfortunately they are all complicated and involve a lot of manual
8 >> work. I believe the following small changes (which can mostly be
9 >> automated) would prove of much larger benefit to the community for a
10 >> vastly reduced cost.
11 >
12 > The tree size is only a variable for users because they are required to have
13 > the entire tree on their computer. If Portage fetched these on-demand, it
14 > wouldn't be a problem. But that's already been discussed...
15
16 I'm naive. Where/when? I see a bit of conversation last week.
17
18 My thoughts on this would be to have a RSYNC_INCLUDEFROM, which is used
19 except for once every RSYNC_EVERYTHING_INTERVAL syncs, when the whole
20 tree is synced. This would only be used if configured. If
21 RSYNC_AUTOINCLUDE is set, every new package installed, whether by
22 explicit merge or a dependency merge, is also added to
23 RSYNC_INCLUDEFROM, and any package removed by unmerge or depclean is
24 removed from RSYNC_INCLUDEFROM.
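
A minimal sketch of how the scheme above might look, not actual Portage code. The RSYNC_* names come from the proposal; every function name here is hypothetical, and the include file is assumed to hold rsync filter patterns anchored at the tree root. A real partial sync would presumably also need shared directories such as profiles and eclasses, which this sketch ignores.

```python
from typing import Iterable, List, Optional


def is_full_sync(sync_count: int, everything_interval: int) -> bool:
    """True on every Nth sync (RSYNC_EVERYTHING_INTERVAL); N <= 1 means always."""
    return everything_interval <= 1 or sync_count % everything_interval == 0


def include_patterns(package: str) -> List[str]:
    """rsync filter patterns for one "category/name" package.

    The parent directories are listed too, because a trailing
    --exclude=* would otherwise stop rsync from descending into them.
    """
    category, name = package.split("/", 1)
    return [f"/{category}/", f"/{category}/{name}/", f"/{category}/{name}/**"]


def include_file_lines(packages: Iterable[str]) -> List[str]:
    """Lines to write to the RSYNC_INCLUDEFROM file, without duplicates."""
    lines: List[str] = []
    for package in sorted(packages):
        for pattern in include_patterns(package):
            if pattern not in lines:
                lines.append(pattern)
    return lines


def rsync_filter_args(sync_count: int, everything_interval: int,
                      include_from: Optional[str]) -> List[str]:
    """Extra rsync arguments: none for a full sync, include/exclude otherwise."""
    if include_from is None or is_full_sync(sync_count, everything_interval):
        return []
    return [f"--include-from={include_from}", "--exclude=*"]
```

With RSYNC_AUTOINCLUDE set, a merge would add the package to the set passed to include_file_lines() and an unmerge or depclean would drop it, after which the include file is rewritten.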
25
26 Note that I realize this is not full on-demand fetching; if an
27 application is not listed, it's not fetched, most of the time. But it
28 does fetch only those applications demanded. I also know that changing
29 dependencies can cause it major problems. However, since my scheme does
30 have it doing occasional full syncs (for example, once a week, Saturday
31 evening), this problem should be reasonably mitigated - if ever a
32 program gains a dependency that has not yet existed for a week, you
33 don't want it running on a production server anyway.
34
35 This should satisfy those people inclined to run underpowered[1] production
36 systems with automated syncs - after all, on a production system, you
37 had better know all the packages you care about. It also doesn't kill the
38 sync server with N connections. I personally would code this feature so
39 that it would enforce a minimum of 24 hours between syncs (by activating
40 a separate feature, so that others could use it also); a real enterprise
41 server, where stability trumps latest-and-greatest, wouldn't be updating
42 itself that often[2].
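
The 24-hour minimum could be a small standalone check, so that other features can reuse it. This is only a sketch: the function name and the idea of keeping the last sync time as a Unix timestamp are assumptions, not anything Portage currently provides.

```python
import time
from typing import Optional

MIN_SYNC_INTERVAL = 24 * 60 * 60  # seconds; the 24-hour minimum suggested above


def sync_allowed(last_sync: Optional[float],
                 now: Optional[float] = None,
                 min_interval: float = MIN_SYNC_INTERVAL) -> bool:
    """Return True if enough time has passed since the last sync.

    last_sync is a Unix timestamp, or None if no sync has been recorded.
    """
    if last_sync is None:
        return True
    if now is None:
        now = time.time()
    return now - last_sync >= min_interval
```

A sync wrapper would call sync_allowed() first and refuse to contact the rsync server when it returns False.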
43
44
45 While I am not a Python programmer, I could try my hand at a
46 proof-of-concept if the rest of the underpowered coalition is
47 interested and no interested Python programmers turn up. Chances are
48 good anything I'd code would work, but would be rejected by any Python
49 coders as looking like some mad monkey took bits and pieces of code from
50 elsewhere in the program, put them together in a manner that somehow
51 worked, and then tried to make it purty; this is not due to my not being
52 a decent programmer, but rather to my knowing only that Python uses
53 indentation instead of curlies, and somehow lives without line-end characters.
54 (Well, ok, I know a bit more, but everything else I know about it comes
55 from looking at portage code.)
56
57 Ed
58
59 [1] Note to anyone who feels annoyed that I'm calling their production
60 systems underpowered: If your system has so few spare resources
61 that an emerge sync during an idle time takes 5 times
62 longer than it would with the server tasks turned off, it doesn't have enough
63 spare resources to handle a real peak either. I understand you may not
64 be able to afford better. But one should be realistic about what one is
65 running. Of course, if your production system does an emerge sync in 15
66 seconds during peak load, well, I wonder why you're reading this, as
67 your system's apparently ludicrously overpowered. And I want some of
68 your bandwidth.
69
70 [2] I generally find security by design greatly reduces the
71 vulnerability to threats. Not to mention, my inclination would be to
72 load the update on a test system, regression test it and fix test it if
73 possible, and then force install the binary package created when loading
74 it on the test box to all production machines. If you have more than a
75 handful of machines, I'd think some software aid for this would be in
76 order. Shouldn't be too hard; I can think of several models that should
77 work for low or medium security environments. If you have a high
78 security environment, don't even try to tell me why you're running
79 untested automatic updates from the Internet. (After all, you
80 may trust Gentoo, but they don't own all of their mirrors.)
81
82 --
83 gentoo-dev@g.o mailing list