Gentoo Archives: gentoo-dev

From: "Andreas K. Huettel" <dilfridge@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] sources.gentoo.org instability
Date: Mon, 05 Dec 2011 11:49:47
Message-Id: 2345971.y9mealLpxW@grenadine
In Reply to: [gentoo-dev] sources.gentoo.org instability by Alec Warner
Seriously, what do we gain from crawlers accessing sources.gentoo.org? I can't
really remember seeing it even once in a Google query result...

Possibly it would not even be necessary to deny all requests, but just to deny
everything related to ancient history...

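A minimal sketch of how such a partial deny could look, assuming robots.txt is the mechanism (the mail does not say which one is meant, and robots.txt only restrains well-behaved crawlers). The host, the /viewvc/ paths, and the crawler_allowed helper are hypothetical, chosen for illustration only; the rules are checked with Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: these paths are assumptions for illustration,
# not the actual URL layout of sources.gentoo.org.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewvc/profiles/package.mask
Disallow: /viewvc/profiles/use.local.desc
"""


def crawler_allowed(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules above permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Expensive history view of package.mask: matched by a Disallow prefix.
    print(crawler_allowed(
        "http://sources.example.org/viewvc/profiles/package.mask?view=log"))
    # Any other file: no rule matches, so it stays crawlable.
    print(crawler_allowed(
        "http://sources.example.org/viewvc/eclass/eutils.eclass"))
```

Only the listed prefixes are blocked, so everything else stays crawlable; denying "ancient history" specifically would need rules keyed on whatever URL pattern the revision views actually use.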
> Hello,
>
> For a while sources.gentoo.org has been puttering along and its health
> has slowly declined. We migrated it to some newer shiny hardware in an
> attempt to mitigate the problem but that did not pan out. 90% (or
> more) of sources.gentoo.org traffic is crawler bots and not actual
> humans. That being said, if we cannot serve requests to the bots
> within our timeouts we serve 500s instead, which is never really what
> we want (particularly when we spent 20s of CPU to calculate 80% of the
> response only to see the client time out :/).
>
> The majority of the expensive requests are related to package.mask and
> use.local.desc queries by crawlers, such as crawling the entire
> 13000-revision history for package.mask (or similar).
>
> While it is likely we will monkey patch viewvc to be less wasteful, in
> the meantime I have removed use.local.desc from sources.gentoo.org
> (and also anoncvs, because they share the same repo). I hope this is a
> short-term (order of weeks) hack.
>
> -A

--
Andreas K. Huettel
Gentoo Linux developer
kde, sci, arm, tex, printing

Replies

Subject Author
Re: [gentoo-dev] sources.gentoo.org instability Alec Warner <antarus@g.o>