Gentoo Archives: gentoo-dev

From: Chris Gianelloni <wolf31o2@g.o>
To: gentoo-dev@l.g.o
Subject: Re: [gentoo-dev] Optimizing performance
Date: Thu, 15 Dec 2005 14:29:15
Message-Id: 1134656014.21439.28.camel@vertigo.twi-31o2.org
In Reply to: [gentoo-dev] Optimizing performance by Patrick Lauer
1 On Thu, 2005-12-15 at 13:48 +0100, Patrick Lauer wrote:
2 > - don't overtweak CFLAGS. "-O2 -march=$your_cpu_family" seems to be on
3 > average the best, -O3 is often slower and can cause bugs
4
5 -O2 -march=$your_cpu_family -pipe -fomit-frame-pointer
6
7 -pipe
8 Use pipes rather than temporary files for communication between
9 the various stages of compilation. This fails to work on some
10 systems where the assembler is unable to read from a pipe; but
11 the GNU assembler has no trouble.
12
13 -O also turns on -fomit-frame-pointer on machines where doing so does
14 not interfere with debugging.
15
16 (However, x86 is not one of these machines, so if you are not a developer
17 doing debugging, you can turn it on yourself for a slight additional
18 speed increase.)
19
20 -fomit-frame-pointer
21 Don't keep the frame pointer in a register for functions that
22 don't need one. This avoids the instructions to save, set up and
23 restore frame pointers; it also makes an extra register
24 available in many functions.
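
As a rough illustration of the flag set recommended above, here is a minimal shell sketch that assembles the string for a given -march value. The helper name build_cflags and the value "pentium4" are assumptions made for the example; substitute whatever matches your own CPU family. The CXXFLAGS line follows the usual make.conf convention of reusing CFLAGS.

```shell
#!/bin/sh
# Sketch only: build the CFLAGS string discussed above for a given -march value.
# build_cflags is a hypothetical helper, not part of Portage.
build_cflags() {
    march="$1"
    if [ -z "$march" ]; then
        echo "usage: build_cflags <march>" >&2
        return 1
    fi
    printf '%s\n' "-O2 -march=${march} -pipe -fomit-frame-pointer"
}

# Print lines suitable for pasting into /etc/make.conf.
# "pentium4" is only an illustrative value; use your own CPU family.
CFLAGS="$(build_cflags pentium4)"
echo "CFLAGS=\"${CFLAGS}\""
echo 'CXXFLAGS="${CFLAGS}"'
```

Nothing here goes beyond the four flags named in the message; anything more aggressive is exactly the over-tweaking being warned against.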
25
26 > - don't do anything with ASFLAGS, LDFLAGS. This causes weird random
27 > breakage (e.g. LDFLAGS="-O1" causes prelink to fail with "absurd"
28 > errors) and doesn't give a noticeable performance boost
29
30 Correct.
31
32 Also, running prelink can improve speed at the cost of disk space.
33
34 > - check that all IDE disks use DMA mode, otherwise they are limited to
35 > ~16M/s with a huge CPU usage penalty. Sometimes (application-specific)
36 > increasing the readahead with hdparm gives a huge throughput boost.
37
38 I typically use the same hdparm settings as listed in the Handbook:
39
40 disc0_args="-d1 -A1 -m16 -u1 -a64 -c1"
41 cdrom0_args="-d1 -c1"
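
For reference, the hdparm options above mean: -d1 turns DMA on, -A1 enables the drive's own read-lookahead, -m16 sets the multiple-sector count to 16, -u1 unmasks interrupts during disk I/O, -a64 sets filesystem readahead to 64 sectors, and -c1 enables 32-bit I/O. The sketch below shows one way to check whether DMA is actually on. It assumes that "hdparm -d <device>" prints a line of the form "using_dma = 1 (on)"; the helper name dma_enabled and the device /dev/hda are placeholders for the example.

```shell
#!/bin/sh
# Sketch: decide from `hdparm -d <device>` output whether DMA is on.
# Assumes the output contains a line like " using_dma    =  1 (on)".
dma_enabled() {
    # Reads hdparm output on stdin; succeeds only if using_dma is 1.
    grep -Eq 'using_dma[[:space:]]*=[[:space:]]*1([^0-9]|$)'
}

# Intended use on a real system (needs root; /dev/hda is a placeholder):
#   hdparm -d /dev/hda | dma_enabled || echo "DMA is off"
# Demonstration on canned text so nothing touches real hardware:
sample='/dev/hda:
 using_dma    =  0 (off)'
if printf '%s\n' "$sample" | dma_enabled; then
    echo "DMA is on"
else
    echo "DMA is off; hdparm -d1 would enable it"
fi
```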
42
43 > - kernel tweaks like preempt may increase the responsiveness of the
44 > system, but often reduce throughput and may have unexpected side effects
45 > like random audio stutter as well as random kernel crashes ;-)
46
47 This is especially true on non-x86 architectures.
48
49 > - kernel tweaks like setting swappiness or using a different I/O
50 > scheduler (CFQ, deadline) should help, but I'm not aware of any "real"
51 > benchmarks except microbenchmarks (can create 1M files 10% faster!!!!! -
52 > yes, but how does it behave with a normal workload?)
53
54 CFQ is much worse for a desktop system. I tend to like deadline for
55 playing games. These can probably make a bit more difference than a new
56 -fomg-itsofast-and-broken-math added to CFLAGS.
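
On 2.6 kernels the I/O scheduler can be inspected and changed per device at runtime through sysfs, which makes it easy to try deadline against CFQ on your own workload. The sketch below only parses the scheduler list, in which the active entry is shown in square brackets. The helper name active_scheduler and the device hda are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: extract the active I/O scheduler from the sysfs scheduler list,
# where the active entry is enclosed in square brackets.
active_scheduler() {
    # Reads the list on stdin and prints only the bracketed name.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Intended use on a real system (hda is a placeholder device):
#   active_scheduler < /sys/block/hda/queue/scheduler
#   echo deadline > /sys/block/hda/queue/scheduler   # switch, as root
#   cat /proc/sys/vm/swappiness                      # current swappiness
# Demonstration on canned text:
printf '%s\n' 'noop anticipatory deadline [cfq]' | active_scheduler
```

Changes made this way do not survive a reboot, so they are a low-risk way to compare schedulers before making anything permanent.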
57
58 > - using a "smarter" filesystem can dramatically improve performance at
59 > the potential cost of reliability. As data on FS reliability is hard to
60 > find from unbiased sources this becomes a religious issue ... migrating
61 > from ext3 to reiserfs makes "emerge sync" extremely much faster, but is
62 > reiserfs sustainable?
63
64 Well, reiserfs 3 isn't so bad on architectures where it doesn't vomit
65 all over itself immediately. Also, reiserfs loses much of its luster if
66 you're running ext3 with dir_index. There was a tip in the GWN about
67 turning on dir_index on an already formatted file system. If formatting
68 a new one, just use mkfs.ext2 -j -O dir_index /dev/$whatever to create
69 your file system.
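
For an already formatted file system, the usual e2fsprogs route is to set the feature with tune2fs and then let e2fsck rebuild the existing directories. This is a sketch of that common procedure, not necessarily the exact steps from the GWN tip. The helper dir_index_commands is hypothetical and only prints the commands, so nothing is modified. Run the real commands as root on an unmounted file system.

```shell
#!/bin/sh
# Sketch: print (not run) the commands that enable dir_index on an
# existing ext2/ext3 file system. dir_index_commands is a hypothetical helper.
dir_index_commands() {
    dev="$1"
    if [ -z "$dev" ]; then
        echo "usage: dir_index_commands <device>" >&2
        return 1
    fi
    # Set the dir_index feature flag in the superblock.
    echo "tune2fs -O dir_index ${dev}"
    # Force a check and rebuild existing directories so they get indexed.
    echo "e2fsck -fD ${dev}"
}

# /dev/hdXN is a placeholder, not a real device name.
dir_index_commands /dev/hdXN
```

Without the e2fsck -D pass, only directories created after the flag is set would use the index.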
70
71 > Are there any application-specific tweaks (e.g. "use the prefork MPM
72 > with apache2")? What is known to break things, what has usually
73 > beneficial behaviour? Are there any useful benchmarks that show the
74 > performance difference between different settings?
75
76 Well, turning on SBA and Fast Writes on Nvidia always helps. As for
77 benchmarks, I think the issue is it depends entirely on usage. Having
78 something that is 30% faster on paper isn't very useful if you never do
79 it the way the benchmark does. I wish I had more numbers/examples here,
80 but there isn't really much in the way of decent benchmarks published
81 and readily available. Hopefully some other people will know of more of
82 them than I do.
83
84 --
85 Chris Gianelloni
86 Release Engineering - Strategic Lead
87 x86 Architecture Team
88 Games - Developer
89 Gentoo Linux
