Shaochun Wang <scwang@××××××.cn> posted 20070803065913.GA23254@localhost,
excerpted below, on Fri, 03 Aug 2007 14:59:13 +0800:

> Every time I compile C++ code, e.g. app-i18n/scim-qtimm, my desktop
> system becomes almost not interactive. I have already set
> PORTAGE_NICENESS="15" in /etc/make.conf.
>
> Any suggestion?

Do you have a single-core, single-CPU system, or a multi-core or
multi-CPU one?

In any case, unless you are running folding@home or the like, something
truly idle-only that you want the emerge to get higher priority than, you
should consider PORTAGE_NICENESS=19. The reason is that a +19 nice is
treated as idle priority by the scheduler, giving the rest of the system
slightly higher responsiveness (lower latency), while giving the idle
task somewhat longer time slices. The effect can actually be to /speed/
/up/ compiles compared with a positive niceness below 19, or even with
normal scheduling, due to the longer timeslices.

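As a sketch, the change in /etc/make.conf is a single line. It's the
same file and variable as in the quoted question, only the value
differs:

```shell
# /etc/make.conf
# Run emerge, and everything it spawns, at the lowest normal priority.
PORTAGE_NICENESS="19"
```
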
What clock tick setting are you using, and what is your preemption
setting? Particularly if you are on a single-core, single-CPU system, a
higher clock tick setting (Timer frequency), certainly not 100 (that's
for servers) but probably 300 or 1000, will increase responsiveness at
the cost of tasks taking longer (shorter timeslices, so more overhead
processing timeslices each second). It's similar with preemption.
You'll want that set to Preemptible Kernel (Low-Latency Desktop) or at
least Voluntary Kernel Preemption (Desktop). Also be sure the Preempt
The Big Kernel Lock option is toggled ON.

Conversely, lower settings, No Forced Preemption (Server) or Voluntary
Kernel Preemption (Desktop), and a 100, 250 or 300 Hz tick rate, should
work better for multi-core or multi-CPU SMP systems, because they can
spread the load a bit more. FWIW, on the dual Opteron 242 (so dual
single cores) here, I'm running voluntary preemption, BKL preemption,
and a 300 Hz tick frequency. That's the highest I'd recommend for a
general purpose multi-core or multi-CPU system, tho as above, I'd
recommend a 1000 Hz tick and full preemption for single CPU/core
systems.

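For reference, here is roughly how those menu choices show up in the
kernel .config. This is a sketch from memory of 2.6-series symbol names,
so treat the names as assumptions and check them against your own kernel
(on Gentoo the file is usually /usr/src/linux/.config) rather than
pasting blindly. Pick one of the two sets:

```shell
# Single CPU/core desktop: 1000 Hz tick, full preemption.
CONFIG_HZ_1000=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y

# Multi-core or multi-CPU: 300 Hz tick, voluntary preemption.
# (Use these in place of the three lines above.)
# CONFIG_HZ_300=y
# CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT_BKL=y
```

It's generally safer to toggle these through make menuconfig than to
edit the file by hand, since the tick and preemption options are
mutually exclusive choices and the config tool keeps them consistent.
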
You may also wish to play with the MAKEOPTS setting, typically -jX, where
X is anywhere from the number of CPUs/cores plus one to 150% of the
number of CPUs/cores. Thus, a single core/CPU system's recommended
setting is -j2. However, you may find -j1 increases your
responsiveness. Or you can try -j2, but add -l1 or the like. With GNU
make (but not all others, so you may have to remove the -lX portion for
some merges), that tells it to allow up to two jobs (the -j2) but ONLY
start a second one if the load average is below 1 (the -l1). That's
generally fairly effective.

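To make the -j/-l arithmetic concrete, here's a small shell sketch.
Note that suggest_makeopts is a made-up helper for illustration, not a
portage or make tool, and using the core count as the -l value is my
assumption, extrapolated from the -j2 -l1 single-core case above:

```shell
# Hypothetical helper, not part of portage or make.
# $1 = number of CPUs/cores.
# Prints "cores plus one" jobs, with the load limit set to the core
# count (the load limit choice is an assumption, see above).
suggest_makeopts() {
    cores="$1"
    jobs=$((cores + 1))
    printf '%s\n' "-j${jobs} -l${cores}"
}

suggest_makeopts 1    # prints: -j2 -l1
suggest_makeopts 2    # prints: -j3 -l2
```

Copy the printed value into /etc/make.conf by hand, as in
MAKEOPTS="-j2 -l1", and drop the -l part for any merge whose make
doesn't understand it.
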
There are other things to consider as well. How do you usually compile,
in a terminal window (xterm, konsole, gterm, etc) or at the text
console? I've noted that at least with konsole, and with composite
rendering (real transparency) turned on, non-niced CPU usage goes thru
the roof at times trying to keep the konsole updated. Keeping the
konsole window from being displayed, whether by minimizing it, shading
it, or switching to a different desktop workspace, eliminates the
issue. If I'm /really/ planning on going to town (say a new KDE release
came out and I have 100 plus packages of mostly C++ to compile), I'll
turn off composite rendering as well. When there are rapid display
updates, such as when compile output is scrolling by, it makes a big
difference if X and the X clients don't have to do all that extra work
drawing and compositing areas normally hidden by other windows. (It
should be noted that typical composite overhead is <5% of a single CPU
here, more like 2% unless I have a huge bunch of windows open, and
that's on a dual 1600x1200 display. That's with a Radeon 9250, the
last of the Radeons for which there are decent freedomware drivers,
FWIW, tho the reverse engineering effort on the r300 and r400 series is
progressing nicely.)

Then there's the standard stuff, but it would tend to affect more than
simply C++ compiling. Make sure your SATA/PATA/SCSI chipsets are
running their correct drivers with DMA enabled, not just a generic
no-DMA compatibility mode. That's a big one, but if it affected you,
you'd probably notice it elsewhere as well.

If you have lots of memory (2 gigs or better; 4 gigs is nice, I have 8
but that's overkill), strongly consider setting up your PORTAGE_TMPDIR
(/var/tmp by default) on tmpfs. Having all those temporary files
typically used during a compile and merge written to memory only,
instead of having to wait for hard disk access that's several orders of
magnitude slower, DRAMATICALLY speeds up compiles, while at the SAME
time improving general system responsiveness during the merge, because
disk access slows down the /entire/ system, especially when whatever
else you are working on is trying to access the disk at the same time.

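A minimal sketch of one way to set that up, putting /tmp on tmpfs and
pointing portage at it. The size=4g cap is purely an assumption for
illustration, so size it to your own memory and your largest builds:

```
# /etc/fstab
# tmpfs only uses memory for what is actually stored in it;
# size= is just the upper limit (assumed value, adjust to taste).
tmpfs   /tmp   tmpfs   size=4g,mode=1777   0 0

# /etc/make.conf
PORTAGE_TMPDIR="/tmp"
```

Anything on tmpfs disappears at reboot, which is fine for build
scratch space but worth remembering if you keep other things in /tmp.
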
To give you an example of what things /can/ be like, take my now aging
dual Opteron 242 setup here (soon to be upgraded to dual-core Opteron
290s), with 8 gigs memory (as I said, overkill, 4 gig would be fine),
/tmp on tmpfs (with PORTAGE_TMPDIR=/tmp), a 4-disk RAID-6 system (two
parity stripes, so effectively 2-way striped), and $PORTDIR and ccache
on RAID-0. I routinely run MAKEOPTS="-j1000" (not that it ever gets
there, but some builds don't like the unlimited -j with no number), and
run five or more parallel emerges at the same time (using emerge -pt to
get a tree output and -a for verification, so I can set up
non-conflicting parallel emerges). That's usually for a KDE update
where I have 100 or so C++ KDE packages to merge, so it's mostly C++.
Because the config and certain other sections aren't parallelized and
thus run only a single job, even with that, my load average seldom rises
above 20 or 25. Or, on the kernel, which is C not C++ but parallelizes
VERY well, I'll get a load average of several /hundred/. It's fun to
see it go that high! =8^)

Still, even with a 300-500 load average on the kernel, or a 20ish load
average on C++ (the most I seem to hit is 30), while I do get a bit of
lag on the mouse, and the panel clock and ksysguard displays sometimes
freeze for 10 seconds at a time, it's still surprising to me that the
system is not /entirely/ unusable. As well, I can be playing an
Internet radio stream the entire time (and no, I don't have anything set
real-time, either), with few if any dropouts at all. That's REALLY
astounding to me! Up to a 500 load average, yet the scheduler continues
to work well enough to give the network, player and audio system all the
time they need to prevent both dropped network packets and dropped audio
data! (It's obvious the scheduler prioritizes both the IP stack and the
audio system without intervention, and equally obvious the KDE panel
doesn't get the same auto prioritization, as clearly, 10 second updates
as on the panel simply wouldn't cut it on the network or audio stream.
Be that as it may, it's still fascinating to watch, seeing the load
average climb to several hundred when I deliberately compile a kernel
with -j1000, without a single hitch or skip in the audio playback at
all!)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list