jos houtman wrote:

> First of all, thank you for the many replies and interesting discussions.
>
> Let me tell you what we concluded: after some tests it was obvious that
> 10k files per directory was far better than the 50k we use now.
> The longest rsync on 10k files took 15 seconds; the average took about 8
> seconds. That is with the 9TB system using JFS and the 4TB using ReiserFS.
> We intend to use rsync in combination with marking the folders dirty.
>
> This method should scale well enough for us; figures indicate we might
> have 100TB by the end of the year.
>
> I believe there is no ultimate solution for a company like ours. We are
> constantly trying to find better solutions; some new website feature
> requires a different hardware setup for optimality, and bottlenecks are
> common. Therefore what would now be the ultimate solution might not be
> so in a few weeks.
> But we will keep looking for better solutions for storage, backup, and all
> other areas. We will have to do it as problems arise; resources are
> spread a little thin.
> The Just-in-Time concept has penetrated system management.

There are a few things you can try to make what you've got faster, or at
least get them into your plan for the future.
1. Smaller drives have better seek time; it's the whole more-spindles-per-
unit-of-data thing. I dealt with a very large mail system in '01, and the
change from 36GB drives to 72GB drives decreased I/O throughput enough
that we had to swap back to the smaller drives. 500GB SATA drives look
great on paper, but 300GB drives might perform better.
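The spindle argument is easy to sanity-check with back-of-the-envelope
math; the per-drive IOPS figure below is an assumption for illustration,
not a benchmark:

```shell
# Same usable capacity built from smaller vs. larger drives.
# ~100 random IOPS per spindle is an assumed ballpark figure.
capacity_gb=1800
iops_per_drive=100

drives_300=$((capacity_gb / 300))   # 6 spindles
drives_500=$((capacity_gb / 500))   # 3 spindles (integer division)

echo "300GB drives: $((drives_300 * iops_per_drive)) aggregate random IOPS"
echo "500GB drives: $((drives_500 * iops_per_drive)) aggregate random IOPS"
```

Twice the spindles means roughly twice the random I/O capacity for the
same amount of data.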
2. Cache more at your web layer and keep I/O off your storage. Run all
webservers with the most RAM you can afford; if a file is in the local
cache, it's not a storage hit. Put Squid in front of your webservers as a
reverse proxy, serve files directly from a purpose-built proxy with fast
local disk, add a dedicated cache layer doing the same thing that you can
redirect to, or build a media cluster that doesn't have the overhead that
comes with running PHP, Perl, or whatever on the main site. There are
lots of interesting options here, and they all just make the site faster.
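As a sketch of the reverse-proxy idea, a minimal Squid accelerator config
might look like this (assuming Squid 2.6+ syntax; the hostname and origin
IP are hypothetical):

```
# Squid listens on port 80 as an accelerator for the site
http_port 80 accel defaultsite=www.example.com

# Forward cache misses to the origin webserver
cache_peer 10.0.0.10 parent 80 0 no-query originserver name=origin

# Only accept requests for our own domain
acl site dstdomain www.example.com
http_access allow site
cache_peer_access origin allow site
```

With sane Expires/Cache-Control headers on your media, most hits never
reach the origin, let alone the storage behind it.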
3. If 5% of your content is 90% of your bandwidth, then a content
delivery system makes sense. However, uploading a data set in the TB
range is not cost effective.
4. Smaller disk groups on your storage. An EMC engineer explained this
one to me. Say you've got sixteen drives in your array. Rather than a
single RAID 5 set, you make three five-drive RAID 5 sets with a floating
hot spare. Each set has its own data, so when you look for fileA you hit
drives 1-5 rather than all fifteen. The smaller data set means random
requests are spread less violently across the array, each drive is more
likely to have a cache hit since it isn't serving the whole data set,
and so on.
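The capacity side of that trade-off is worth spelling out; a quick sketch
with a hypothetical 300GB drive size:

```shell
# Sixteen drives: one big RAID 5 set vs. three 5-drive RAID 5 sets
# plus a floating hot spare. Drive size is a hypothetical 300GB.
drive_gb=300

single_usable=$(( (16 - 1) * drive_gb ))    # 16-drive RAID 5, 1 parity
split_usable=$(( 3 * (5 - 1) * drive_gb ))  # 3 x (4 data + 1 parity)

echo "single set : ${single_usable}GB usable, a read can touch 15 data drives"
echo "three sets : ${split_usable}GB usable, a read touches at most 5 drives"
```

You give up some usable space (extra parity plus the spare) in exchange
for isolating random I/O to a third of the spindles.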
5. Rumor is that iSCSI is faster and has less overhead, so you might want
to test both NFS and iSCSI. Also, don't believe any of the nonsense about
needing TOE cards or dedicated HBA cards for either; just be able to
dedicate an Ethernet interface to storage.
6. Jumbo frames. Assuming part of your problem is NFS data ops, switching
to jumbo frames would increase packet sizes from 1500 bytes to 9000 bytes
and cut your data ops. I just about doubled throughput by using jumbo
packets with iSCSI behind a video streaming service. However, this only
works if you have a dedicated storage LAN and set all servers, clients,
and switch ports to use jumbo frames (MTU 9000). Using jumbo frames on
the side going out to the Internet is usually problematic, and some
switches don't support jumbo frames at all.
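The frame-count math behind that claim (the 40 bytes of IP+TCP header per
frame is a ballpark assumption):

```shell
# Frames needed to move 1GB at standard vs. jumbo MTU, assuming
# ~40 bytes of IP+TCP header per frame. To actually enable jumbos,
# set the MTU on every NIC and switch port on the storage LAN, e.g.:
#   ifconfig eth1 mtu 9000
bytes=$((1024 * 1024 * 1024))

frames_1500=$(( bytes / (1500 - 40) ))
frames_9000=$(( bytes / (9000 - 40) ))

echo "MTU 1500: $frames_1500 frames"
echo "MTU 9000: $frames_9000 frames"
```

Roughly a 6x reduction in frames, which is that many fewer interrupts and
per-packet processing steps on both ends.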

7. Graph the hell out of everything: MRTG, Cacti, Excel, whatever. I
cannot stress this one enough; it's saved my ass a number of times over
the past ten years. Having graphs of load, RAM usage, storage, local
I/O, network I/O; MySQL queries, scans, table locks, cache hits, full
table scans, etc.; NFS data ops, Apache processes, etc. makes
troubleshooting a million times easier. And it's great for getting more
money out of management when you can prove the storage is doing twice
the work it was doing three months ago.
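For feeding NFS counters into MRTG or Cacti, a tiny wrapper around
nfsstat is enough. This is only a sketch: it parses a sample of
nfsstat-style output embedded in the script, since the exact column
layout varies between versions.

```shell
# Extract the client RPC "calls" counter for graphing. The sample
# text stands in for real `nfsstat -c` output.
sample="Client rpc stats:
calls      retrans    authrefrsh
1234567    42         1234567"

# Find the header line, then print the first field of the next line.
calls=$(echo "$sample" | awk '/^calls/ {getline; print $1}')
echo "$calls"
```

Graph the delta between samples rather than the raw counter, since these
values only ever increase until a reboot.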

kashani
--
gentoo-server@g.o mailing list