Gentoo Archives: gentoo-dev

From: Alec Warner <antarus@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] sources.gentoo.org instability
Date: Mon, 05 Dec 2011 17:08:37
Message-Id: CAAr7Pr_-Jsv=axz+u0+T5KORqbeqYLsov-Ans9P8E_Sq1mP_ng@mail.gmail.com
In Reply to: Re: [gentoo-dev] sources.gentoo.org instability by "Andreas K. Huettel"
1 On Mon, Dec 5, 2011 at 3:48 AM, Andreas K. Huettel <dilfridge@g.o> wrote:
2 >
3 > Seriously, what do we gain from crawlers accessing sources.gentoo.org?  I can't
4 > really remember seeing it even once in a Google query result...
5
6 We want the site searchable.
7
8 >
9 > Possibly it would not even be required to deny all requests, but just deny
10 > everything related to ancient history...
11 >
12 >> Hello,
13 >>
14 >> For a while sources.gentoo.org has been puttering along and its health
15 >> has slowly declined. We migrated it to some newer shiny hardware in an
16 >> attempt to mitigate the problem but that did not pan out. 90% (or
17 >> more) of sources.gentoo.org traffic is crawler bots and not actual
18 >> humans. That being said, if we cannot serve requests to the bots
19 >> within our timeouts we serve 500s instead, which is never really what
20 >> we want (particularly when we spent 20s of CPU to calculate 80% of the
21 >> response only to see the client time out :/).
22 >>
23 >> The majority of the expensive requests are related to package.mask and
24 >> use.local.desc queries by crawlers, such as crawling the entire
25 >> 13000-revision history of package.mask (or similar).
26 >>
27 >> While it is likely we will monkey-patch viewvc to be less wasteful, in
28 >> the meantime I have removed use.local.desc from sources.gentoo.org
29 >> (and also from anoncvs, because they share the same repo). I hope this is a
30 >> short-term (order of weeks) hack.
31 >>
32 >> -A
33 >
34 > --
35 > Andreas K. Huettel
36 > Gentoo Linux developer
37 > kde, sci, arm, tex, printing
38 >
39 >
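
For illustration only: the suggestion quoted above (deny crawlers just the expensive history pages rather than the whole site) could be sketched as a pair of robots.txt rules, checked here with Python's standard urllib.robotparser. The URL paths and the "ExampleBot" agent name are assumptions made up for this sketch, not the real layout of sources.gentoo.org, and nothing in the thread says this is what was deployed. Note that robots.txt is advisory, so it only helps with crawlers that honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the paths below are assumed for illustration and are
# not taken from the real sources.gentoo.org configuration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewvc/gentoo-x86/profiles/package.mask
Disallow: /viewvc/gentoo-x86/profiles/use.local.desc
"""


def crawler_may_fetch(path, agent="ExampleBot"):
    """Return True if a well-behaved crawler may request `path`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network fetch is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in (
        "/viewvc/gentoo-x86/profiles/package.mask",
        "/viewvc/gentoo-x86/sys-apps/portage/",
    ):
        print(path, crawler_may_fetch(path))
```

Because the standard-library parser matches rules by path prefix, these rules block every view of the two files while leaving the rest of the tree crawlable, which keeps the site searchable.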

Replies

Subject Author
Re: [gentoo-dev] sources.gentoo.org instability "Chí-Thanh Christopher Nguyễn" <chithanh@g.o>