On Thu, 2005-12-15 at 07:43 -0700, Duncan wrote:
> > I was wondering if there are any sane ways to optimize the performance
> > of a Gentoo system.
> This really belongs on user, or perhaps on the appropriate purposed list,
> desktop or hardened or whatever, not on devel. That said, some
> comments... (I can't resist. <g>)
-user has the risk of many "use teh -fomglol flag, it si teh fast0r" ;-)
hardened doesn't have much to do with performance (although I'd be
interested what impact - if any - the different security features have!)

> > - don't overtweak CFLAGS. "-O2 -march=$your_cpu_family" seems to be on
> > average the best; -O3 is often slower and can cause bugs.
>
> A lot of folks don't realize the effect of cache memory on optimizations.
> I'll be brief here, but particularly for things like the kernel that stay
> in memory, -Os can at times work wonders, because it means more of the
> working set stays in a cache closer to the CPU, and the additional speed
> in retrieving that code far outweighs the compromises made to
> optimizations to shrink it to size. Conversely, media streaming or
> encoding apps are constantly throwing out old data and fetching new data,
> and the optimizations are often more effective for them, so they work
> better with -O2 or even -O3.
I've not seen any substantial benefits from -Os over -O2.
Also the size difference is quite small - ~5M on a "normal" install, IIRC.
25 |
|
26 |
> There have been occasional problems with -Os, generally because it isn't |
27 |
> used as much and gets less testing, so earlier in a gcc cycle series. |
28 |
> However, I run -Os here (amd64) by default, and haven't seen any issues |
29 |
> that went away if I reverted to -O2, over the couple years I've been |
30 |
> running Gentoo. |
31 |
I've seen some reproducable breakage, e.g. KDE doesn't like it at all |
32 |
> (Actually, that has been the case, even when I've edited |
33 |
> ebuilds to remove their stripflags calls and the like. Glibc and xorg |
34 |
> both stripflags including -Os. xorg seemed to benefit here from -Os after |
35 |
> I removed the stripflags call, while glibc worked but seemed slower. Note |
36 |
> that editing ebuilds means if it breaks, you get to keep the pieces!) |
37 |
... which is exactly what I wanted to avoid. Ricing for the sake of it is boring ;-) |
38 |
|
39 |
> For gcc, -pipe doesn't improve program optimization, but will make |
40 |
> compiling faster. -fomit-frame-pointers makes smaller applications if |
41 |
> you aren't debugging. Those are both common enough to be fairly safe. |
42 |
agreed |
43 |
> -frename-registers and -fweb may also be useful. (-fweb ceases to be so on |
44 |
> gcc4, however, because it is implemented differently.) -funit-at-a-time |
45 |
> (new to gcc-3.4, so don't try it with gcc-3.3) may also be worth looking |
46 |
> into, altho it's already enabled by -Os. These latter flags are less |
47 |
> commonly used, however, thus less well tested, and may therefore cause |
48 |
> very occasional problems. (-funit-at-a-time was known to do so early in |
49 |
> the 3.4 cycle, but those issues should have been long ago dealt with by |
50 |
> now.) I consider those /reasonably/ conservative, and it's what I run. |
51 |
> If I were running a server, however, I'd probably only run -O2 and the |
52 |
> first two (-pipe and -fomit-frame-pointers). |
53 |
on a server you'd not use omit-frame-pointers to keep debuggability I think. |
54 |
> Do some research on -Os, in any case. It could be well worth your time. |
55 |
from my (limited) experience it isn't, especially on CPUs with larger caches |
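The conservative set discussed above would look roughly like this in /etc/make.conf (the -march value is only an example; substitute your own CPU family):

```shell
# /etc/make.conf -- conservative optimization settings
# -march=athlon64 is an example; pick the family matching your CPU.
CFLAGS="-O2 -march=athlon64 -pipe -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
```

The more exotic flags (-frename-registers, -fweb, -funit-at-a-time) would go at the end of CFLAGS, at your own risk.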

> This suggestion does involve hardware, but not a real heavy cost, and the
> performance boost may be worth it.
That's usually not an option :-)

> Consider running a RAID system. I
> recently switched to RAID, a four-disk setup, raid1/mirrored for /boot,
> raid6 (for redundancy) for most of the system, raid0/striped (for speed)
> for /tmp, the portage dir, etc, stuff that was either temporary anyway, or
> could easily be redownloaded. (Swap can also be striped, set equal
> partitions on each disk and set equal priority for them in fstab.) I was
> very pleasantly surprised at how much of a difference it made!
Yes, a 4-disk raid5 delivers amazing performance with minimal CPU overhead (~10% @ 1 GHz).
But 4 disks at 100 Euro each plus a controller (100 Euro) is more than the price of
a "new" system for most people.
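The striped-swap tip above doesn't even need raid0: the kernel already stripes across swap areas that share the same priority, so equal partitions with equal pri= in /etc/fstab are enough (device names below are illustrative):

```shell
# /etc/fstab -- swap striped across two disks via equal priority
# (sda2/sdb2 are example partitions of equal size)
/dev/sda2   none   swap   sw,pri=1   0 0
/dev/sdb2   none   swap   sw,pri=1   0 0
```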
> If you have
> onboard SATA and are buying new disks so can buy SATA anyway (my case),
> that should do just fine, as SATA runs a dedicated channel to each
> drive anyway. SCSI is a higher cost option, ruled out here, but SATA
> works very nicely, certainly so for me.
SCSI does deliver better performance, but at a prohibitive cost for "average" users.

> Again, a reasonable new-hardware suggestion. When purchasing a new system
> or considering an upgrade, more memory is often the most effective
> optimization you can make (with the raid suggestion above very close to
> it).
"The only thing better than a large engine is a larger engine" ;-)
Depending on workload 4G does wonders, but again - prohibitive for the
normal user.

> Slower CPU and more memory, up to a gig or so, is almost always
> better than the reverse, because hard drive access is WAYYY slower than
> even cheap/slow memory. At a gig of memory, running with swap disabled is
> actually a practical option,
But if you're investing anyway, keep 1G per disk for swap just in
case ;-)
> altho it might not be faster and there are
> certain memory zone management considerations. Usual X/KDE desktop usage
> will run perhaps a third of a gig. That means half to 2/3 gig for cache,
> which is "comfortable".
Agreed, although I wonder why we need so much memory in the first
place ...
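The split described above (application memory vs. page cache) is easy to inspect on any Linux box via /proc/meminfo; the Cached line is roughly the "comfortable" cache portion:

```shell
# Show how physical memory splits between free RAM and page cache.
grep -E '^(MemTotal|MemFree|Cached):' /proc/meminfo
```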

> Naturally, if you take the RAID suggestion above,
> this one isn't quite as critical, because drive latency will be lower so
> reliance on swap isn't as painful, and a big cache not nearly as critical
> to good performance.
Latency is the same, but concurrent accesses can happen, thus throughput
increases.
Still, memory > * ...

> A gig to two gig can still be useful, but the
> cost/performance tradeoff isn't as good, and the money will likely be
> better spent elsewhere.
No. The only thing better than memory is more memory ;-)

> I run reiserfs here on everything. However, some don't consider it
> extremely stable. I keep second-copy partitions as backups of stuff I
> want to ensure is safe, for that reason and others (fat-finger deleting,
> anyone?).
Backups are independent of drive speed ;-)
> Bottom line, reiserfs is certainly safe "enough", if you have a
> decent backup system in place, and you follow it regularly, as you should.
> I can't see how anyone can reasonably disagree with that, filesystem
> religious zealotry or not.
In my experience it is as "safe" as ext3 and XFS, meaning it can go down, but usually just works.

> As I said, I run reiserfs for everything here, but I also have backup
> images of stuff I know I want to keep.
Always back up - what if your disk(s) die?
I've seen 6 out of 10 disks in a RAID die within a few hours ...

So while not completely related to software tweaks, thanks for the
hardware upgrade info ;-)

Patrick
--
Stand still, and let the rest of the universe move