On Tuesday 12 October 2004 9:37 pm, Ciaran McCreesh wrote:
> It has come to my attention that, during recent weeks, a small number of
> users have been complaining recently about the size of the rsync tree.
> My august colleagues have proposed many ingenious solutions, but
> misfortunately they are all complicated and involve a lot of manual
> work. I believe the following small changes (which can mostly be
> automated) would prove of much larger benefit to the community for a
> vastly reduced cost.

The tree size only matters to users because they are required to have
the entire tree on their computer. If Portage fetched ebuilds on demand, it
wouldn't be a problem. But that's already been discussed...

>
> To begin with, I'd like to draw your attention to comments in ebuilds.
> It is an oft-forgotten fact that these items provide absolutely no
> benefit to the end user. "Surely", I hear you say, "it is not worth
> getting hung up over such an insignificant triviality! What harm do a
> few trifling little remarks do?". Yet, when actually measured, these
> 'innocent minutiae' (as you might call them had you a penchant for
> obsolete vocabulary or a predilection for pomposity) account for
> approximately 20% of the total ebuild content in the tree. It is obvious
> that an immediate ban upon these silly things, alongside a small script
> to remove them from the tree, would provide a very large gain for our
> users without having to remove any existing code. Adding in a repoman
> check to error out if such lines were present would clearly be a good
> start.

Even if they do take up a lot of space, they are often important so other
devs/users can know why something was done. Perhaps the copyright comments
should be removed, though, as copyrights exist with or without declaration of
them.

>
> Next up are blank lines, which, as all the world knows are of no use at
> all to anyone. These account for a staggering 150KBytes of data in the
> main tree, which, over a 9600 dialup line, would save us over two
> minutes on an emerge sync. Again, removing these pointless wastes of
> space via a bash script is trivial.

Blank lines are often nice for readability. 150K isn't much; if you're on
dialup, two minutes is nothing, not to mention the fact that nobody should be
on 9600 let alone 28.8... Also, that 2 minutes is assuming you're just
downloading the tree. Rsync isn't a simple file transfer protocol.
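
For anyone who wants to check the "two minutes" figure, here is a rough
sketch of the arithmetic. It assumes 1 KByte = 1024 bytes and a plain
byte-for-byte download with no compression; the 10 bits/byte variant is my
own assumption for start/stop framing on a serial modem link. The function
name is made up for illustration. It does not model rsync at all, which
only transfers what changed.

```python
def transfer_seconds(size_bytes: int, line_bps: int, bits_per_byte: int = 8) -> float:
    """Time to push size_bytes over a line of line_bps bits per second."""
    return size_bytes * bits_per_byte / line_bps


# 150 KBytes of blank lines, the figure quoted above (assuming 1 KByte = 1024 bytes).
BLANK_LINE_BYTES = 150 * 1024

for bps in (9600, 28800):
    raw = transfer_seconds(BLANK_LINE_BYTES, bps)
    # Assumption: 10 bits on the wire per byte (start and stop bits included).
    framed = transfer_seconds(BLANK_LINE_BYTES, bps, bits_per_byte=10)
    print(f"{bps} bps: {raw:.0f} s raw, {framed:.0f} s at 10 bits/byte")
```

At 9600 bps the raw figure works out to 128 seconds, so "over two minutes"
holds only for a full, uncompressed download of those bytes.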

>
> Staying with the blank spaces thing, leading whitespaces (which serve no
> practical purpose and are only used to make the code "look pretty" --
> although how a bash script could ever be considered "pretty" is beyond
> my limited mind) account for nearly half a megabyte of data. Clearly
> these should immediately be removed and any developer using them in the
> future should have their cvs access suspended pending a review of their
> status within the project -- as devrel and our managers will tell you,
> being nice to the users is our number one priority.

Yet another readability issue. Half a meg isn't much.

>
> There are other trivial ways to save space too. The commonly used helper
> function "emake", for example, is a shocking five bytes in length.
> Replacing this with a much more helpfully named "e", and likewise
> replacing "econf" with "c", would gain something like 50KBytes. If we
> also replace src_unpack, src_compile and src_install with more
> appropriate alternatives we could shave off a further 300KBytes. I have
> no doubt that the reader could extend this logic to the other portage
> internals and common function names, bringing the total up to half a
> megabyte or more.

Developers are volunteers... are you seriously suggesting killing all
readability just to make syncs a bit shorter? I wonder how many current
volunteers would tolerate this.

>
> This can be extended to other functions, of course. In particular I'd
> like to draw your attention to the absurdly named "flag-o-matic.eclass".
> Merely inheriting this eclass adds at least thirteen bytes (that's over
> a hundred bits!) of bloat to an ebuild, and that's before we start on
> the ridiculously verbose function names. What's all this "replace-flags"
> nonsense I ask you? Any educated programmer can see that "rf" is a far
> more useful name. Even those who are not convinced that space needs to
> be saved must surely notice how much developer time would be saved
> through reduced typing.

And annoyed by having to memorize what meaningless abbreviations mean...

>
> It remains a mystery to me how anyone could possibly have overlooked the
> following suggestion. Currently, we install 'dependency information'
> inside ebuilds. This is blatantly pointless -- as RedHat have so ably
> demonstrated with their 'rpm' installer (and, albeit in a non-Linux
> environment, I am assured that Microsoft are in the same boat), there is
> no need for automatic dependency tracking and resolution. Our users are
> more than capable of working this out for themselves. Similarly, the
> HOMEPAGE variable is entirely pointless and has been superseded by Google
> [1].

Ok, suggesting removal of dependency info has me convinced this is a bad
joke...

>
> Oh, and then we come to metadata.xml. As all the world knows, xml is a
> massive waste of space, and (as a data interchange format not a data
> storage format) utterly unsuited for configuration files. A typical
> metadata.xml file is 95%+ noise. By replacing these with flat text files
> listing the maintainers, we could save somewhere in the region of one
> and a half megabytes.

And I'm sure rsync can probably filter out *.xml client-side...

>
> Also, no-one has yet considered all the useless fluff in the tree that
> nobody actually uses. By removing all ebuilds and eclasses related to
> emacs, kde, gnome, php, gaim or java related from the tree, as well as
> <snip>

No more reading. It's a joke.
--
Luke-Jr
Developer, Utopios
http://utopios.org/

--
gentoo-dev@g.o mailing list