Thomas Rösner <Thomas.Roesner@××××××××××××××.de> posted
45ABA032.4070601@××××××××××××××.de, excerpted below, on Mon, 15 Jan 2007
16:39:30 +0100:

> Compiling the kernel with -j is a popular benchmark, because it really
> stresses the VM/disk/CPU. And before you get your hopes up too high:
> the ebuilds that really take long (mozilla, openoffice, glibc, gcc)
> won't use your MAKEOPTS anyway.
>
> My guess: going higher than -j5 won't do much for you; there will
> always be a process not waiting for I/O (if your disk can handle the
> load, that is) for each CPU. -j3 will be better for cpp compiles,
> which hog the CPU longer and won't have to be scheduled out as with
> -j5.
>
> Other factors: is this a desktop system? Do you want to actually do
> something with it while it compiles? How much RAM do you have?
>
> (These are rhetorical questions ;-))

Rhetorical or not, I've been curious just how parallelizable things
such as kernel compiles actually are. I have 8 gigs of memory now, and
a dual Opteron (242s, to be upgraded to dual-cores soon) that I'd been
running at -j5 to -j8 for kernel compiles (set in a patch routinely
applied by my kernel-maintenance scripts) for some time. Recently,
though, I've been trying either a bare -j (unlimited in some make
versions; others now warn and reduce it to -j1, which isn't any
fun...) or -j1000, just to see how high I could make my load average
climb! =8^)

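For reference, the per-build parallelism being tuned above is normally
set via MAKEOPTS in /etc/make.conf; the value below is only
illustrative (a common rule of thumb from this era is number of
CPUs/cores plus one):

```
# /etc/make.conf -- illustrative value only
# Rule of thumb: -j(cores + 1), so -j3 for a dual-CPU box.
MAKEOPTS="-j3"
```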
I've been frustrated at being unable to find an easy way to measure
load averages over anything shorter than the one-minute rolling
window, but even with it, I've been highly amused to see the load
climb to something over 250! (I want to say 450, as I think I remember
that, but I'm keeping the claim to what I know I've seen, several
times. =8^) It's still fascinating to see how well the AMD64 arch and
kernel cope with that: the Linux kernel naturally schedules
interactive processes (which generally spend a lot of time idling,
waiting for input) at a higher priority, keeping the system running
amazingly smoothly even at that sort of load average, without any
nicing on my part beyond what the kernel does normally. =8^)

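For anyone wanting to script this rather than watch ksysguard: the
kernel only exposes 1-, 5-, and 15-minute rolling averages (via
/proc/loadavg), which is exactly why nothing shorter than the
one-minute window is available. A minimal Python sketch:

```python
import os

# os.getloadavg() reads the same 1/5/15-minute figures as /proc/loadavg.
# No shorter window exists; the kernel simply doesn't export one.
one_min, five_min, fifteen_min = os.getloadavg()
print(f"load averages: {one_min:.2f} {five_min:.2f} {fifteen_min:.2f}")
```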
Memory-wise, four gigs would be plenty, even for semi-contrived usage
like this, but I expect to keep this system for a couple more years
yet, and as I said I'll be upgrading to dual dual-cores, so I decided
I might as well go for it when I did the upgrade, and went with 8
gigs. Those 250+ ready-to-run tasks do noticeably load the memory, but
only by a gig or two. Often that won't even push cache out of the 8
gigs, so as I said, even here four gigs would still be very
reasonable.

As for disk access, the average guy with a single hard drive will
certainly find that the bottleneck in an unlimited-jobs scenario such
as the above. I'm running a four-disk SATA RAID array: RAID-6 (so
two-way-striped, with two-way redundancy as well) for my main system,
but full four-way RAID-0/striped for my temp-data stuff, including
both the portage tree and the kernel sources. Again, while mouse
movement does chop up slightly, and ksysguard is late updating its
activity plots during the initial load, there's no way I'd know the
system was running a 250+ load average if I weren't actually watching
the ksysguard one-minute load-average graph.

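The "two-way-striped" description follows from RAID-6 arithmetic: two
of the four disks' worth of capacity goes to parity per stripe, so
data is striped across the remaining two. A quick sketch, with disk
sizes as made-up example numbers:

```python
def raid6_usable_gb(disks: int, disk_size_gb: float) -> float:
    """RAID-6 spends two disks' worth of space on parity per stripe,
    leaving (disks - 2) disks' worth for data."""
    if disks < 4:
        raise ValueError("RAID-6 needs at least 4 disks")
    return (disks - 2) * disk_size_gb

# Four hypothetical 300 GB disks: 600 GB usable, data striped two ways.
print(raid6_usable_gb(4, 300.0))
```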
As for portage, no matter the -j setting, and despite running
$PORTAGE_TMPDIR on tmpfs, as you (Thomas) mention, not a whole lot
even keeps the two CPUs busy all the time. In particular, the
autotools configure scripts are normally serialized, so single-thread
only. If I have a lot of updates to do, as when a KDE version refresh
comes along, I'll routinely run five merges in parallel in separate
konsole tabs, keeping an emerge --pretend --tree in another tab or a
different konsole window, and using the tree layout to keep the
dependency trees separate so none of the emerge tabs interferes with
the others.

Still, it's often just easier to run a single emerge --update --deep
--newuse world with PORTAGE_NICENESS=20, let the update run on one
CPU/core, and basically forget about it, going about my normal
business as if the update weren't running. It doesn't take /that/ much
longer: while it's not so efficient at using every bit of CPU, there's
less scheduling contention, so it's more efficient there, and with
dual CPUs, $PORTAGE_TMPDIR on tmpfs, and an effectively two-way-striped
(as a four-spindle RAID-6) main system, there's little I/O contention
with my regular tasks. So it just runs, and I do what I'd do if it
weren't running (well, I've not tried burning a CD while doing it, or
something like that, but streaming Internet radio doesn't mind), and
don't worry about it.

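For completeness, that niceness setting can go in /etc/make.conf (or
the environment); note that Linux niceness tops out at 19, so a
setting of 20 should behave the same as 19, since the kernel clamps
priorities to the valid range. Illustrative fragment:

```
# /etc/make.conf
# Run emerge and everything it spawns at low priority.
# (Linux niceness maxes out at 19; higher values get clamped.)
PORTAGE_NICENESS="19"
```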
All that said, it does seem enough stuff is beginning to be designed
with multi-core in mind that I can actually see a dual dual-core
system being of some use, and I'm looking forward to that upgrade,
both for the clock-speed bump (1.6 GHz Opteron 242s to 2.6 GHz Opteron
285s) and for the dual cores, giving me four cores total to work with.

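Back-of-the-envelope, and only for work that parallelizes well (which
make -j kernel builds approximate), the planned upgrade works out as
below; the ~3.25x figure is aggregate clock, not a benchmark result:

```python
# Dual single-core Opteron 242s -> dual dual-core Opteron 285s.
old_aggregate = 2 * 1.6   # 2 cores at 1.6 GHz
new_aggregate = 4 * 2.6   # 4 cores at 2.6 GHz
print(f"{old_aggregate:.1f} GHz -> {new_aggregate:.1f} GHz "
      f"(~{new_aggregate / old_aggregate:.2f}x aggregate)")
```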
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list