Gentoo Archives: gentoo-dev

From: Luke-Jr <luke-jr@×××××××.org>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] A few modest suggestions regarding tree size
Date: Tue, 12 Oct 2004 22:11:55
Message-Id: 200410122211.54834.luke-jr@utopios.org
In Reply to: [gentoo-dev] A few modest suggestions regarding tree size by Ciaran McCreesh
1 On Tuesday 12 October 2004 9:37 pm, Ciaran McCreesh wrote:
2 > It has come to my attention that, during recent weeks, a small number of
3 > users have been complaining recently about the size of the rsync tree.
4 > My august colleagues have proposed many ingenious solutions, but
5 > misfortunately they are all complicated and involve a lot of manual
6 > work. I believe the following small changes (which can mostly be
7 > automated) would prove of much larger benefit to the community for a
8 > vastly reduced cost.
9
10 The tree size is only a variable for users because they are required to have
11 the entire tree on their computer. If Portage fetched these on-demand, it
12 wouldn't be a problem. But that's already been discussed...
13
14 >
15 > To begin with, I'd like to draw your attention to comments in ebuilds.
16 > It is an oft-forgotten fact that these items provide absolutely no
17 > benefit to the end user. "Surely", I hear you say, "it is not worth
18 > getting hung up over such an insignificant triviality! What harm do a
19 > few trifling little remarks do?". Yet, when actually measured, these
20 > 'innocent minutiae' (as you might call them had you a penchant for
21 > obsolete vocabulary or a predilection for pomposity) account for
22 > approximately 20% of the total ebuild content in the tree. It is obvious
23 > that an immediate ban upon these silly things, alongside a small script
24 > to remove them from the tree, would provide a very large gain for our
25 > users without having to remove any existing code. Adding in a repoman
26 > check to error out if such lines were present would clearly be a good
27 > start.
28
29 Even if they do take up a lot of space, they are often important so other
30 devs/users can know why something was done. Perhaps the copyright comments
31 should be removed, though, as copyrights exist with or without declaration of
32 them.
33
34 >
35 > Next up are blank lines, which, as all the world knows are of no use at
36 > all to anyone. These account for a staggering 150KBytes of data in the
37 > main tree, which, over a 9600 dialup line, would save us over two
38 > minutes on an emerge sync. Again, removing these pointless wastes of
39 > space via a bash script is trivial.
40
41 Blank lines are often nice for readability. 150K isn't much; if you're on
42 dialup, two minutes is nothing, not to mention the fact that nobody should be
43 on 9600 let alone 28.8... Also, that 2 minutes is assuming you're just
44 downloading the tree. Rsync isn't a simple file transfer protocol.
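
As a rough check of the two-minute figure, here is a minimal sketch. It assumes that "9600" means 9600 bits per second, that 1 KByte is 1024 bytes, and that the data goes over the wire as-is, with no compression, protocol overhead or rsync delta transfer. None of these assumptions is stated in the thread, and `transfer_seconds` is a hypothetical helper name.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Time to push size_bytes over a link, ignoring overhead and compression."""
    return size_bytes * 8 / bits_per_second


# "150KBytes" of blank lines, assuming 1 KByte = 1024 bytes.
blank_line_bytes = 150 * 1024

for rate in (9600, 28800):
    secs = transfer_seconds(blank_line_bytes, rate)
    print(f"{rate} bit/s: {secs:.0f} s ({secs / 60:.1f} min)")
```

Under these assumptions the 9600 bit/s case comes to 128 seconds, which matches "over two minutes". Since rsync only sends differences, a real sync would likely transfer less than this.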
45
46 >
47 > Staying with the blank spaces thing, leading whitespaces (which serve no
48 > practical purpose and are only used to make the code "look pretty" --
49 > although how a bash script could ever be considered "pretty" is beyond
50 > my limited mind) account for nearly half a megabyte of data. Clearly
51 > these should immediately be removed and any developer using them in the
52 > future should have their cvs access suspended pending a review of their
53 > status within the project -- as devrel and our managers will tell you,
54 > being nice to the users is our number one priority.
55
56 Yet another readability issue. Half a meg isn't much.
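
The byte counts quoted above (comments, blank lines, leading whitespace) could be checked with something like the following sketch. It works on the text of a single ebuild and treats only full-line `#` comments as comments. The function name `count_overhead` and the sample ebuild are illustrative assumptions, not anything from Portage or the thread.

```python
def count_overhead(text: str) -> dict:
    """Count bytes spent on blank lines, leading whitespace and full-line comments."""
    totals = {"total": 0, "blank": 0, "leading_ws": 0, "comment": 0}
    for line in text.splitlines(keepends=True):
        size = len(line.encode("utf-8"))
        totals["total"] += size
        stripped = line.lstrip(" \t")
        if stripped.strip() == "":
            # Whitespace-only line: the whole line counts as blank.
            totals["blank"] += size
        elif stripped.startswith("#"):
            # Full-line comment, including any indentation before the '#'.
            totals["comment"] += size
        else:
            # Code line: count only the indentation (spaces and tabs are 1 byte each).
            totals["leading_ws"] += len(line) - len(stripped)
    return totals


# Hypothetical miniature ebuild, used only to exercise the function.
sample = "# Copyright header\n\nsrc_compile() {\n\temake || die\n}\n"
print(count_overhead(sample))
```

Summing the result over every ebuild in a checkout would give tree-wide totals to compare with the figures in the quoted message.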
57
58 >
59 > There are other trivial ways to save space too. The commonly used helper
60 > function "emake", for example, is a shocking five bytes in length.
61 > Replacing this with a much more helpfully named "e", and likewise
62 > replacing "econf" with "c", would gain something like 50KBytes. If we
63 > also replace src_unpack, src_compile and src_install with more
64 > appropriate alternatives we could shave off a further 300KBytes. I have
65 > no doubt that the reader could extend this logic to the other portage
66 > internals and common function names, bringing the total up to half a
67 > megabyte or more.
68
69 Developers are volunteers... are you seriously suggesting killing all
70 readability just to make syncs a bit shorter? I wonder how many current
71 volunteers would tolerate this.
72
73 >
74 > This can be extended to other functions, of course. In particular I'd
75 > like to draw your attention to the absurdly named "flag-o-matic.eclass".
76 > Merely inheriting this eclass adds at least thirteen bytes (that's over
77 > a hundred bits!) of bloat to an ebuild, and that's before we start on
78 > the ridiculously verbose function names. What's all this "replace-flags"
79 > nonsense I ask you? Any educated programmer can see that "rf" is a far
80 > more useful name. Even those who are not convinced that space needs to
81 > be saved must surely notice how much developer time would be saved
82 > through reduced typing.
83
84 And annoyed by having to memorize what meaningless abbreviations mean...
85
86 >
87 > It remains a mystery to me how anyone could possibly have overlooked the
88 > following suggestion. Currently, we install 'dependency information'
89 > inside ebuilds. This is blatantly pointless -- as RedHat have so ably
90 > demonstrated with their 'rpm' installer (and, albeit in a non-Linux
91 > environment, I am assured that Microsoft are in the same boat), there is
92 > no need for automatic dependency tracking and resolution. Our users are
93 > more than capable of working this out for themselves. Similarly, the
94 > HOMEPAGE variable is entirely pointless and has been superseded by Google
95 > [1].
96
97 Ok, suggesting removal of dependency info has me convinced this is a bad
98 joke...
99
100 >
101 > Oh, and then we come to metadata.xml. As all the world knows, xml is a
102 > massive waste of space, and (as a data interchange format not a data
103 > storage format) utterly unsuited for configuration files. A typical
104 > metadata.xml file is 95%+ noise. By replacing these with flat text files
105 > listing the maintainers, we could save somewhere in the region of one
106 > and a half megabytes.
107
108 And I'm sure rsync can probably filter out *.xml client-side...
109
110 >
111 > Also, no-one has yet considered all the useless fluff in the tree that
112 > nobody actually uses. By removing all ebuilds and eclasses related to
113 > emacs, kde, gnome, php, gaim or java from the tree, as well as
114 > <snip>
115
116 No more reading. It's a joke.
117 --
118 Luke-Jr
119 Developer, Utopios
120 http://utopios.org/
121
122 --
123 gentoo-dev@g.o mailing list
