Gentoo Archives: gentoo-server

From: dormando <dormando@×××××.net>
To: gentoo-server@l.g.o
Subject: Re: [gentoo-server] Re: [OT] Mirroring/backing-up a large
Date: Sat, 29 Apr 2006 00:17:22
Message-Id: 4452AFC0.3040408@rydia.net
In Reply to: Re: [gentoo-server] Re: [OT] Mirroring/backing-up a large by kashani
kashani wrote:
> A. Khattri wrote:
>> On Wed, 19 Apr 2006, kashani wrote:
>>
>>> I'm not sold on the Google approach.
>>>
>>> Assuming someone was to build nine data servers we're talking
>>> roughly $3k per server (dual CPU, 4GB ram, raid 5 sata) or $30k
>>> with shipping and tax.
>>
>> Actually Google use the cheapest hardware they can find, buy in bulk, and
>> they assume stuff will fail so they plan accordingly. I very much doubt
>> they spend $3K per server...
>>
>
> Anyone trying to build the same with a purchase of less than 100 servers
> is not going to spend much less.
>
> 2 x 2GB RAM = $1000
> 2 x CPU = $500 or so
> Raid Card = $300
> Drives = $100 each
> 1u chassis/MB/etc = $500
>
> IIRC they drop the drives and shove everything into RAM. Which is fine
> when you have a limited data set and enough machines to shove it into
> RAM. Originally Google did have a limited data set. It was only after
> the infrastructure reached a critical size that they began Google mail
> and other large storage things. And had four years to work out the
> operational kinks.
>
> I think it unlikely that someone with a single storage set has enough
> money or time to pay for a few PhDs to write a custom filesystem, a few
> hundred servers (10TB/4GB), and the datacenter monkey necessary to
> replace gear constantly... oddly Google has both of these in spades. And
> if you read the whitepaper the smallest Google data cluster is nineteen
> servers, aka $40-60k for schleps like us, aka the cost of a SAN that
> uses less power and burns fewer switch ports.
>
> This infatuation with the Google stuff that very few people (i.e. none of
> us) have the in-house infrastructure to handle or the available cash is
> useless. Unless someone has actually built their own mini Google and
> wants to tell us all about it with nice numbers like total cost, source
> code, transactions per second, cost per GB of user data, throughput, and
> other data points.
>
> kashani

Sorry, I'll have to pipe up about that... There's at least one guy here
using MogileFS, which is basically a Google approach :) I'm using it as
well.

There are a few nice things about distributing your files around:

- You don't necessarily need to dedicate machines to it. I have a
hundred and change diskless-boot webservers. I add two hard drives to a
box, fire up the mogstored daemon, and put it back into the webserver pool.
- You're overspeccing. Why would each box need 4GB of RAM? Whatever
NAS/SAN you buy will *not* have nearly that much cache. If you're going
to live without, live without :) A single dual-core chip or less could
work too.
- Something like MogileFS uses node-level redundancy (NAID? Someone
bothered giving it a name...), so there's no point in buying RAID cards.
Use onboard SATA/PATA or the cheapest cards you can buy that will give
decent throughput ($50 or less instead of $300).
- If it floats your boat, go ahead and get those 3U Supermicro cases
with 16 drive bays. Just use a couple of cheap 4+ port hot-swap-capable
SATA controllers instead of 2x $500+ battery-backed RAID controllers.
Get a bunch of them as slimmed down as possible. Hell, just fuse some
steel together, bolt a motherboard and PSU to it, and stack a ton of
drives in front of some huge fans. Save two-thirds of the case cost.

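To make the node-level redundancy point concrete, here's a rough Python sketch of the idea: every file gets copies on N different hosts, so losing a whole box costs you nothing and a RAID card adds little. This is not MogileFS's placement logic or API (MogileFS records locations through its trackers and the central DB). The host names, `place_replicas`, `surviving_copies` and the rendezvous-style hash ranking are all made up to keep the example self-contained.

```python
import hashlib


def place_replicas(file_key, hosts, copies=2):
    """Pick `copies` distinct hosts for one file (hypothetical helper).

    Hosts are ranked by a hash of (file_key, host), so placement is
    deterministic and spreads files roughly evenly across the pool.
    """
    if copies > len(hosts):
        raise ValueError("not enough hosts for the requested copy count")
    ranked = sorted(
        hosts,
        key=lambda h: hashlib.sha1(f"{file_key}:{h}".encode()).hexdigest(),
    )
    return ranked[:copies]


def surviving_copies(placement, dead_host):
    """Hosts that still hold the file after one whole box dies."""
    return [h for h in placement if h != dead_host]


if __name__ == "__main__":
    # Made-up names standing in for webservers that also run a storage daemon.
    hosts = [f"web{n:03d}" for n in range(1, 7)]
    placement = place_replicas("images/logo.png", hosts, copies=2)
    print("stored on:", placement)
    print("after losing", placement[0], "->",
          surviving_copies(placement, placement[0]))
```

Because the copies always land on different hosts, a dead disk, controller or motherboard all look the same: one copy gone, one still readable, re-replicate at your leisure.
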
My case might not be enough like yours, but I can easily add terabytes
of space for *just* the cost of the drives (plus the relatively small
extra power draw of each box I add drives to). There are a few other
services to run, though: MogileFS needs a central DB (which you'd have
two of, right?) and tracker services. Again, cheap is fine. Maybe you
have a DB with some free resources... I just can't imagine someone
selling me a NAS/SAN that's this scalable and this cheap.

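For a feel of the numbers, here's a back-of-the-envelope Python sketch using only the prices quoted earlier in this thread ($1000 RAM, $500 CPUs, $300 RAID card, $500 chassis/MB, $100 per drive). The figure of 4 drives per dedicated 1U server is my assumption, not something stated above, and the sketch ignores power, the DB and the trackers.

```python
import math

# Prices quoted earlier in this thread (USD).
DRIVE = 100
DEDICATED_BASE = 1000 + 500 + 300 + 500  # RAM + CPUs + RAID card + 1U chassis/MB


def dedicated_cost(drives, drives_per_server=4):
    """Cost of housing `drives` in newly bought dedicated servers.

    drives_per_server is an assumption, not a figure from the thread.
    """
    servers = math.ceil(drives / drives_per_server)
    return servers * DEDICATED_BASE + drives * DRIVE


def piggyback_cost(drives):
    """Cost of dropping the same drives into boxes you already run."""
    return drives * DRIVE


if __name__ == "__main__":
    for n in (8, 40, 200):
        print(f"{n:4d} drives: dedicated ${dedicated_cost(n):,} "
              f"vs existing boxes ${piggyback_cost(n):,}")
```

With those inputs, 8 drives come to $5,400 in dedicated servers versus $800 dropped into existing boxes. Change `drives_per_server` to match your own chassis before trusting the gap.
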
have fun,
-Dormando
--
gentoo-server@g.o mailing list
