List Archive: gentoo-dev
Seriously, what do we gain from crawlers accessing sources.gentoo.org? I can't
really remember seeing it even once in a Google query result...
Possibly it would not even be required to deny all requests, but just deny
everything related to ancient history...
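
To make the "deny only ancient history" idea concrete, here is a minimal
sketch of such a filter. Everything in it is an assumption for illustration:
the crawler markers, the query parameter names, and the example paths are
hypothetical and are not taken from the actual sources.gentoo.org or viewvc
setup.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed substrings that mark a crawler's User-Agent (illustrative only).
CRAWLER_MARKERS = ("bot", "crawler", "spider")
# Assumed query values and parameters that request revision history
# (hypothetical; adjust to whatever the repository browser really uses).
HISTORY_VIEWS = {"log", "annotate", "diff"}
HISTORY_PARAMS = {"r1", "r2", "annotate"}


def is_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent contains one of the assumed markers."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)


def wants_history(url: str) -> bool:
    """Return True if the URL's query string asks for a history view."""
    query = parse_qs(urlsplit(url).query)
    if HISTORY_PARAMS.intersection(query):
        return True
    return any(value in HISTORY_VIEWS for value in query.get("view", []))


def should_deny(url: str, user_agent: str) -> bool:
    """Deny only crawler requests for history; everything else is served."""
    return is_crawler(user_agent) and wants_history(url)
```

With a check like this, a crawler asking for the log of package.mask would be
turned away, while the same crawler fetching the current file, or a human
browsing the log, would still be served.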
> For a while sources.gentoo.org has been puttering along and its health
> has slowly declined. We migrated it to some newer shiny hardware in an
> attempt to mitigate the problem but that did not pan out. 90% (or
> more) of sources.gentoo.org traffic is crawler bots and not actual
> humans. That being said, if we cannot serve requests to the bots
> within our timeouts we serve 500s instead, which is never really what
> we want (particularly when we spent 20s of CPU to calculate 80% of the
> response only to see the client time out :/).
> The majority of the expensive requests are related to package.mask and
> use.local.desc queries by crawlers, such as crawling the entire
> 13000-revision history for package.mask (or similar).
> While it is likely we will monkey patch viewvc to be less wasteful, in
> the meantime I have removed use.local.desc from sources.gentoo.org
> (and also anoncvs, because they share the same repo). I hope this is a
> short-term (order of weeks) hack.
Andreas K. Huettel
Gentoo Linux developer
kde, sci, arm, tex, printing