I think it's wrong for a maintainer to troll like this on the dev
mailing list -- but then again, it's ciaranm - he's a joker :)

But seriously, there were some nice ideas in the last few messages,
and they weren't 50KBytes improvements but a major drop in file count
and size, something like 25%, which is pretty major. They also don't
affect developers much in terms of productivity.

That includes my idea of syncing based on a changelog, which would
reduce sync time to literally less than a second without affecting
ebuild maintainers at all.
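For what it's worth, here is a rough sketch of what I mean by syncing based on a changelog. Everything in it is made up for illustration (the Change record, the serial numbers, the pending_changes function); it is not how portage or rsync work today. The assumption is that the server keeps a numbered list of changed paths, and the client remembers the last number it saw and only asks for what was touched since then:

```python
from typing import Iterable, List, NamedTuple, Tuple


class Change(NamedTuple):
    """One hypothetical changelog entry published by the server."""
    serial: int   # increasing change number (assumed, not a real portage field)
    action: str   # "update" or "delete"
    path: str     # path relative to the tree root


def pending_changes(changelog: Iterable[Change],
                    last_synced: int) -> Tuple[List[str], List[str]]:
    """Return (paths to fetch, paths to delete) for entries newer than last_synced."""
    latest = {}
    # Walk entries in serial order so the most recent action on a path wins.
    for change in sorted(changelog, key=lambda c: c.serial):
        if change.serial > last_synced:
            latest[change.path] = change.action
    to_fetch = sorted(p for p, a in latest.items() if a == "update")
    to_delete = sorted(p for p, a in latest.items() if a == "delete")
    return to_fetch, to_delete


if __name__ == "__main__":
    log = [
        Change(1, "update", "app-editors/vim/vim-6.3.ebuild"),
        Change(2, "update", "sys-apps/portage/ChangeLog"),
        Change(3, "delete", "app-editors/vim/vim-6.3.ebuild"),
    ]
    print(pending_changes(log, last_synced=1))
```

A client that is already up to date gets two empty lists back and has nothing to transfer, which is where the time saving would come from.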

On Tue, 12 Oct 2004 22:37:25 +0100, Ciaran McCreesh <ciaranm@g.o> wrote:
> It has come to my attention that, during recent weeks, a small number of
> users have been complaining recently about the size of the rsync tree.
> My august colleagues have proposed many ingenious solutions, but
> misfortunately they are all complicated and involve a lot of manual
> work. I believe the following small changes (which can mostly be
> automated) would prove of much larger benefit to the community for a
> vastly reduced cost.
>
> To begin with, I'd like to draw your attention to comments in ebuilds.
> It is an oft-forgotten fact that these items provide absolutely no
> benefit to the end user. "Surely", I hear you say, "it is not worth
> getting hung up over such an insignificant triviality! What harm do a
> few trifling little remarks do?". Yet, when actually measured, these
> 'innocent minutiae' (as you might call them had you a penchant for
> obsolete vocabulary or a predilection for pomposity) account for
> approximately 20% of the total ebuild content in the tree. It is obvious
> that an immediate ban upon these silly things, alongside a small script
> to remove them from the tree, would provide a very large gain for our
> users without having to remove any existing code. Adding in a repoman
> check to error out if such lines were present would clearly be a good
> start.
>
> Next up are blank lines, which, as all the world knows are of no use at
> all to anyone. These account for a staggering 150KBytes of data in the
> main tree, which, over a 9600 dialup line, would save us over two
> minutes on an emerge sync. Again, removing these pointless wastes of
> space via a bash script is trivial.
>
> Staying with the blank spaces thing, leading whitespaces (which serve no
> practical purpose and are only used to make the code "look pretty" --
> although how a bash script could ever be considered "pretty" is beyond
> my limited mind) account for nearly half a megabyte of data. Clearly
> these should immediately be removed and any developer using them in the
> future should have their cvs access suspended pending a review of their
> status within the project -- as devrel and our managers will tell you,
> being nice to the users is our number one priority.
>
> There are other trivial ways to save space too. The commonly used helper
> function "emake", for example, is a shocking five bytes in length.
> Replacing this with a much more helpfully named "e", and likewise
> replacing "econf" with "c", would gain something like 50KBytes. If we
> also replace src_unpack, src_compile and src_install with more
> appropriate alternatives we could shave off a further 300KBytes. I have
> no doubt that the reader could extend this logic to the other portage
> internals and common function names, bring the total up to half a
> megabyte or more.
>
> This can be extended to other functions, of course. In particular I'd
> like to draw your attention to the absurdly named "flag-o-matic.eclass".
> Merely inheriting this eclass adds at least thirteen bytes (that's over
> a hundred bits!) of bloat to an ebuild, and that's before we start on
> the ridiculously verbose function names. What's all this "replace-flags"
> nonsense I ask you? Any educated programmer can see that "rf" is a far
> more useful name. Even those who are not convinced that space needs to
> be saved must surely notice how much developer time would be saved
> through reduced typing.
>
> It remains a mystery to me how anyone could possibly have overlooked the
> following suggestion. Currently, we install 'dependency information'
> inside ebuilds. This is blatantly pointless -- as RedHat have so ably
> demonstrated with their 'rpm' installer (and, albeit in a non-Linux
> environment, I am assured that Microsoft are in the same boat), there is
> no need for automatic dependency tracking and resolution. Our users are
> more than capable of working this out for themselves. Similarly, the
> HOMEPAGE variable is entirely pointless and has been superseded by Google
> [1].
>
> Oh, and then we come to metadata.xml. As all the world knows, xml is a
> massive waste of space, and (as a data interchange format not a data
> storage format) utterly unsuited for configuration files. A typical
> metadata.xml file is 95%+ noise. By replacing these with flat text files
> listing the maintainers, we could save somewhere in the region of one
> and a half megabytes.
>
> Also, no-one has yet considered all the useless fluff in the tree that
> nobody actually uses. By removing all ebuilds and eclasses related to
> emacs, kde, gnome, php, gaim or java related from the tree, as well as
> anything which is only supplied as a binary we could save... Well, I'll
> let you do the calculations yourselves. Although mathematics is not the
> main focus of my degree, I believe I understand enough to know that the
> result is a very big number.
>
> Similarly, all those "compile fix" patches we supply are obviously
> worthless. If anyone has any doubt, I suggest they just look at how
> many users are using broken CFLAGS and compilers -- clearly, working
> code is not a major concern. We should of course leave in security
> patches, since security is our number one priority.
>
> ChangeLogs are the next thing to fall under my scrutiny. Clearly these
> are entirely worthless, since anyone who cares can just read the cvs
> logs and use diff. Kiss goodbye to 14MBytes of junk. Hang on? Did I just
> say 14MBytes? Yes. Fourteen Megabytes. That's a one, then a four, then
> six zeros. That's fourteen million bytes, or over one hundred and ten
> million bits. When syncing my GPRS phone whilst sitting inside a large
> metal cage in north Yorkshire, that could save me over TWELVE HOURS on
> sync time.
>
> I understand that my previous point may cause a small amount of disquiet
> amongst a small proportion of our userbase. After all, how are they
> supposed to decide whether to update if they do not know what an update
> will change? To them, I must point out that whilst such an attitude is
> appropriate for a small hobbyist distribution aimed at skilled users, it
> is utterly at odds with what enterprise users require. For them, it is
> important that they can perform updates without having to know what they
> are doing -- remember that in a corporate environment, any information
> is too much information, and time spent reading ChangeLogs is time not
> spent doing useful work. Please do not forget that better enterprise
> support is our number one priority.
>
> Finally, I must draw KEYWORDS to your scrutiny, and in particular the
> misguided choice of ~ to indicate unstable. In ASCII, the tilde
> character is represented by the octet 0x7E (hexadecimal), or, in binary,
> 01111110. A cursory glance at this will show that it contains
> significantly more 1 bits than 0 bits. As anyone who has had a basic
> schooling in the field of compression can tell you, 1 bits do not
> compress as well as 0 bits (they don't have as much empty space in the
> middle), so clearly we would be better off picking something else. I
> propose the ( character, which has only one 1 bit for every four 0 bits.
> Also, I suggest we drop the amd64 keyword and just use x86 to save
> space, since we all know fine well that amd64 is just like x86 with a
> few extra bits stuck onto the end. Or rather, the start, since x86 gets
> its bytes backwards...
>
> Gentlemen, ladies, jforman, I believe those remedies outlined herein are
> a far more sensible solution than any other current proposal. I eagerly
> await the implementation.
>
> [1]: http://www.google.ca/
>
> --
> Ciaran McCreesh : Gentoo Developer (Vim, Fluxbox, Sparc, Mips)
> Mail : ciaranm at gentoo.org
> Web : http://dev.gentoo.org/~ciaranm
>

--
gentoo-dev@g.o mailing list