On Tuesday 12 October 2004 9:37 pm, Ciaran McCreesh wrote:
> It has come to my attention that, during recent weeks, a small number of
> users have been complaining recently about the size of the rsync tree.
> My august colleagues have proposed many ingenious solutions, but
> misfortunately they are all complicated and involve a lot of manual
> work. I believe the following small changes (which can mostly be
> automated) would prove of much larger benefit to the community for a
> vastly reduced cost.
Tree size only matters to users because they are required to keep the
entire tree on their computer. If Portage fetched ebuilds on demand, it
wouldn't be a problem. But that's already been discussed...
> To begin with, I'd like to draw your attention to comments in ebuilds.
> It is an oft-forgotten fact that these items provide absolutely no
> benefit to the end user. "Surely", I hear you say, "it is not worth
> getting hung up over such an insignificant triviality! What harm do a
> few trifling little remarks do?". Yet, when actually measured, these
> 'innocent minutiae' (as you might call them had you a penchant for
> obsolete vocabulary or a predilection for pomposity) account for
> approximately 20% of the total ebuild content in the tree. It is obvious
> that an immediate ban upon these silly things, alongside a small script
> to remove them from the tree, would provide a very large gain for our
> users without having to remove any existing code. Adding in a repoman
> check to error out if such lines were present would clearly be a good
Even if they do take up a lot of space, they are often important so other
devs/users can know why something was done. Perhaps the copyright comments
should be removed, though, as copyrights exist with or without declaration of
> Next up are blank lines, which, as all the world knows are of no use at
> all to anyone. These account for a staggering 150KBytes of data in the
> main tree, which, over a 9600 dialup line, would save us over two
> minutes on an emerge sync. Again, removing these pointless wastes of
> space via a bash script is trivial.
Blank lines are often nice for readability. 150K isn't much; even on
dialup, two minutes is nothing, not to mention the fact that nobody should be
on 28.8, let alone 9600... Also, that two minutes assumes you're downloading
the whole tree from scratch. Rsync isn't a simple file transfer protocol; it
only sends the parts of files that have changed.
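For what it's worth, the "over two minutes" figure can be checked with a quick back-of-the-envelope calculation. The sketch below is only a rough estimate: it assumes 1 KByte = 1024 bytes, 8 bits per byte on the wire, and no compression or protocol overhead. The function name and the list of line speeds are mine, not anything from Portage or rsync.

```python
def transfer_seconds(size_bytes: int, line_bps: int, bits_per_byte: int = 8) -> float:
    """Naive time to push size_bytes over a line of line_bps bits per second."""
    return size_bytes * bits_per_byte / line_bps


# 150 KBytes of blank lines, as claimed in the quoted mail
# (assumption: 1 KByte = 1024 bytes).
blank_line_bytes = 150 * 1024

for bps in (9600, 28800, 56000):
    seconds = transfer_seconds(blank_line_bytes, bps)
    print(f"{bps:>6} bps: {seconds:6.1f} s")
```

This is the cost of a full download of those bytes. A normal sync sends only what changed, so the real cost per sync should be lower.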
> Staying with the blank spaces thing, leading whitespaces (which serve no
> practical purpose and are only used to make the code "look pretty" --
> although how a bash script could ever be considered "pretty" is beyond
> my limited mind) account for nearly half a megabyte of data. Clearly
> these should immediately be removed and any developer using them in the
> future should have their cvs access suspended pending a review of their
> status within the project -- as devrel and our managers will tell you,
> being nice to the users is our number one priority.
Yet another readability issue. Half a meg isn't much.
> There are other trivial ways to save space too. The commonly used helper
> function "emake", for example, is a shocking five bytes in length.
> Replacing this with a much more helpfully named "e", and likewise
> replacing "econf" with "c", would gain something like 50KBytes. If we
> also replace src_unpack, src_compile and src_install with more
> appropriate alternatives we could shave off a further 300KBytes. I have
> no doubt that the reader could extend this logic to the other portage
> internals and common function names, bringing the total up to half a
> megabyte or more.
Developers are volunteers... are you seriously suggesting killing all
readability just to make syncs a bit shorter? I wonder how many current
volunteers would tolerate this.
> This can be extended to other functions, of course. In particular I'd
> like to draw your attention to the absurdly named "flag-o-matic.eclass".
> Merely inheriting this eclass adds at least thirteen bytes (that's over
> a hundred bits!) of bloat to an ebuild, and that's before we start on
> the ridiculously verbose function names. What's all this "replace-flags"
> nonsense I ask you? Any educated programmer can see that "rf" is a far
> more useful name. Even those who are not convinced that space needs to
> be saved must surely notice how much developer time would be saved
> through reduced typing.
And annoyed by having to memorize what meaningless abbreviations mean...
> It remains a mystery to me how anyone could possibly have overlooked the
> following suggestion. Currently, we install 'dependency information'
> inside ebuilds. This is blatantly pointless -- as RedHat have so ably
> demonstrated with their 'rpm' installer (and, albeit in a non-Linux
> environment, I am assured that Microsoft are in the same boat), there is
> no need for automatic dependency tracking and resolution. Our users are
> more than capable of working this out for themselves. Similarly, the
> HOMEPAGE variable is entirely pointless and has been superseded by Google
Ok, suggesting removal of dependency info has me convinced this is a bad
> Oh, and then we come to metadata.xml. As all the world knows, xml is a
> massive waste of space, and (as a data interchange format not a data
> storage format) utterly unsuited for configuration files. A typical
> metadata.xml file is 95%+ noise. By replacing these with flat text files
> listing the maintainers, we could save somewhere in the region of one
> and a half megabytes.
And I'm sure rsync can probably filter out *.xml client-side...
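To illustrate what I mean by client-side filtering: rsync accepts --exclude=PATTERN, so a sync that skips XML files could be put together roughly as below. This is a sketch only. The helper name and the source/destination arguments are placeholders of mine, the command is built but not run, and how to hook this into emerge sync is not covered here.

```python
import shlex


def build_sync_command(source: str, dest: str, patterns=("*.xml",)) -> list:
    """Build an rsync argv that skips files matching the given patterns."""
    excludes = [f"--exclude={pattern}" for pattern in patterns]
    return ["rsync", "-a", *excludes, source, dest]


# Placeholder paths; substitute a real mirror and local tree location.
command = build_sync_command("SOURCE/", "DEST/")
print(shlex.join(command))
```

Excluded files are simply never requested, so the saving is on the client and costs nothing in the tree itself.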
> Also, no-one has yet considered all the useless fluff in the tree that
> nobody actually uses. By removing all ebuilds and eclasses related to
> emacs, kde, gnome, php, gaim or java from the tree, as well as
No more reading. It's a joke.