Mikey wrote:
> The second step is to ditch storing everything on a single 9TB system that
> cannot be backed up efficiently. Distribute the storage of the images on
> clusters or whatever. For example peel off 1TB of images onto a single
> server, then update the database (or apache/squid mapping) to point to the
> new location. 9 1TB boxes would be far less prone to catastrophic failure
> and much easier to replicate/mirror/backup than a single 9TB box. This is
> what I call the "google approach" ;) Use cheap commodity hardware and
> smart implementation to distribute/scale the load.
>
> Of course the ultimate solution would be some sort of cluster or san
> approach...

I'm not sold on the Google approach.

Assuming someone were to build nine data servers, we're talking roughly
$3k per server (dual CPU, 4GB RAM, RAID 5 SATA) or $30k with shipping
and tax. On top of that I now have to manage nine boxes and manage my
data in nine different places. These nine servers are going to pull 18A of
power and use 18U of rack space. Whereas $35k gets me an NFS/iSCSI/CIFS
head (of admittedly third-tier storage) and two 16 x 500GB shelves, or
12TB usable if I split each shelf into two RAID 6 partitions. This setup
pulls 14A, uses 8U, has volume management and snapshots, can expand easily,
and can eventually cluster the heads if I'm willing to buy the license
later. The 9 x 1TB setup might be worth the pain if you had the application
written to deal with that and needed more of your data in RAM. For a
community photo site I'm not sure you do. Additionally, I don't think it
helps solve the original problem of backing data up somewhere, but maybe
I'm missing something.

In any case I'm not saying you need to spend $30k to fix the problem,
but if you plan to drop some money on it, really sit down and figure
initial cost, cost to expand, rack space, power, cooling, maintenance
costs, administration costs, etc., and relate it all back to a $/GB
figure so you can compare apples to apples.
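
If it helps, a throwaway script like the one below (Python here, but
anything works) is enough to get that apples-to-apples number. The
figures in it are just the rough ones from this mail, not real quotes,
so plug in your own and add lines for power, cooling, and admin time if
you want those amortized in as well.

#!/usr/bin/env python
# Back-of-the-envelope $/GB comparison using the rough numbers from
# this mail; adjust for your own quotes, power, cooling, and admin time.

# Two shelves of 16 x 500GB, each shelf split into two 8-disk RAID 6
# groups: (8 - 2) * 500GB * 2 groups * 2 shelves = 12TB usable.
nas_usable_gb = (8 - 2) * 500 * 2 * 2

options = [
    # (name, total cost in $, usable GB, amps, rack units)
    ("9 x 1TB servers",        30000.0,  9 * 1000,      18, 18),
    ("NAS head + two shelves", 35000.0,  nas_usable_gb, 14,  8),
]

for name, cost, usable_gb, amps, ru in options:
    print("%-24s $%.2f/GB  %2dA  %2dU" % (name, cost / usable_gb, amps, ru))

With these numbers the shelf setup comes out around $2.92/GB against
$3.33/GB for the nine boxes, before you even count the extra amps and
rack units.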

In order to get better backups you might consider hashing your data a
bit more on the filesystem.

What you've got now:
/data/00000-50000/file|thumb|etc

What might work better:
/data/1e/01ac/cdd98a910ca1d4e37b39a9197e/file|thumb|etc

And then you can run through each tree and only sync the subdirs you
need. I'm not certain this idea is the right way to go long term, but it
might be easy to implement now. I would not use more than three layers
of directories.
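
As a rough sketch of what I mean, assuming that long directory name is
an MD5 hex digest split 2/4/rest (any stable hash of the image id or of
the file contents would do just as well):

import hashlib
import os

def hashed_path(root, image_id):
    # Hash something stable about the image (its id here; the file
    # contents would work too) and split the 32-character hex digest
    # into 2/4/26-character directory levels, three layers deep.
    digest = hashlib.md5(str(image_id).encode()).hexdigest()
    return os.path.join(root, digest[:2], digest[2:6], digest[6:])

# hashed_path("/data", 12345) -> /data/xx/xxxx/xxxxxx.../
# with file, thumb, etc. stored under that directory as before

Each of the 256 top-level directories then holds roughly 1/256th of the
images, so you can rsync /data/1e, /data/1f, and so on separately, to
different places or on different days.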

kashani