Peter Humphrey <prh@××××××××××.uk> posted
200706221910.44194.prh@××××××××××.uk, excerpted below, on Fri, 22 Jun
2007 19:10:44 +0100:

>> What I'm wondering, of course, is whether you have NUMA turned on when
>> you shouldn't, or don't have core scheduling turned on when you should,
>> thus artificially increasing the resistance to switching cores/cpus and
>> causing the stickiness.
>
> I don't think so.

Yeah, now that you've clarified that it's sockets and confirmed settings,
you seem to have it right.

On the BIOS settings, some of them will also affect whether the board can
use all four gigs of memory, by controlling how it arranges the address
space and whether there's a hole left between 3.5 and 4 gig for 32-bit
PCI hardware addressing or not.  I've a similar arrangement here on a
Tyan s2885, only with two-gig sticks so 8 gig memory.  If you are seeing
your full 4 gig memory, tho, you've got that set right, both in the
kernel and in the BIOS.

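A quick way to check how much memory the kernel actually sees is to read
MemTotal from /proc/meminfo. The following is only a rough sketch: the
function names are made up for illustration, and the 4 gig installed
figure is taken from this thread. MemTotal always comes in somewhat under
the installed amount because the kernel reserves some memory for itself,
so only a shortfall of roughly half a gig or more would point at the PCI
hole.

```python
import os


def parse_mem_total_kib(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # The line looks like: "MemTotal:        4045312 kB"
            return int(line.split()[1])
    return None


def missing_gib(mem_total_kib, installed_gib):
    """Return how many GiB of the installed memory the kernel does not report."""
    return installed_gib - mem_total_kib / (1024 * 1024)


if __name__ == "__main__":
    installed = 4  # assumption: 4 gig installed, as discussed in this thread
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as f:
            total = parse_mem_total_kib(f.read())
        if total is not None:
            print("kernel sees %d kB, %.2f GiB short of %d GiB installed"
                  % (total, missing_gib(total, installed), installed))
```
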
The other BIOS settings of interest here are the access bitness/
interleaving.  If it's like mine, you'll be able to set 32-, 64-, or 128-
bit interleaving.  You'll want 64-bit, interleaving the sticks within the
node for best bandwidth there, but not across the nodes, so they can
still be used as NUMA nodes.  In order to actually get the 64-bit
interleaved access, however, you'll need the sticks in paired slots on
the node.  (1&2 or 3&4, not 2&3 or separated.)  But it sounds like you
have that as well.

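Whether the kernel really ended up with separate NUMA nodes (that is,
the nodes were not interleaved together) can be checked from sysfs. A
minimal sketch, assuming a kernel built with NUMA support that exposes
/sys/devices/system/node; the function names are hypothetical. On a
two-socket board used as NUMA you would expect two nodes, each listing
its own CPUs.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a sysfs cpulist string such as "0-1,4" into [0, 1, 4]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_nodes(base="/sys/devices/system/node"):
    """Return {node_id: [cpu, ...]} for every NUMA node found under base."""
    nodes = {}
    for path in sorted(glob.glob(os.path.join(base, "node[0-9]*"))):
        node_id = int(os.path.basename(path)[len("node"):])
        with open(os.path.join(path, "cpulist")) as f:
            nodes[node_id] = parse_cpulist(f.read())
    return nodes


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_nodes().items()):
        print("node %d: CPUs %s" % (node_id, cpus))
```
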
Finally, there's the question of how the rest of the system connects to
the sockets.  Here, everything except memory connects to the first
socket (CPU0), so the system can run in single socket mode.  However,
that means anything doing heavy I/O or the like, including 3D video
access, runs most efficiently on CPU0.  In particular, using taskset
(mentioned in what I snipped), I've noticed that even in 2D mode but with
Composite on, X takes several percentage points more CPU when it's
scheduled on CPU1 than it does when it's allowed to run on CPU0.  CPU1
works best with CPU-bound processes, or with those bound by hard drive or
other comparatively slow I/O: the former since it doesn't matter which
CPU they run on, the latter since the I/O is slow enough that it's the
bottleneck in any case.  If your board is laid out similarly, that's
worth keeping in mind when you are playing around with taskset or the
like.

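For anyone who'd rather script this than call taskset by hand, Python
exposes the same Linux affinity calls. This is just a sketch of the idea,
not what I actually run: pin_pid is a name invented here, and the example
pins the script itself (pid 0 means "the calling process") to the first
CPU it is allowed on.

```python
import os


def pin_pid(pid, cpus):
    """Restrict pid to the given set of CPU numbers and return the new mask.

    Roughly what `taskset -p` does for an already running process.
    Linux only; pid 0 means the calling process.
    """
    os.sched_setaffinity(pid, set(cpus))
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    first_cpu = min(os.sched_getaffinity(0))
    print("now allowed on:", sorted(pin_pid(0, {first_cpu})))
```

Pinning someone else's process needs the appropriate permissions, the
same as with taskset.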
If, as you say, BOINC is running separate processes, then scheduling it
with taskset should be possible and do what you need it to do.  The only
caveat would be if the processes terminate and restart.  You may need to
hack up a script to run from cron, checking every minute or 10 or
whatever, depending on how long the BOINC tasks last, to keep them
scheduled on separate CPUs.  I have a particular game (Master of Orion
original, the only non-source based software I still run) I run in DOSBOX
emulation.  Mainly, I use taskset to set DOSBOX on CPU1, while X and
anything else I'm running that uses significant CPU gets put on CPU0.
That works VERY well, and has allowed me to increase the emulation speed
dramatically over what was possible before, when X and DOSBOX may have
been running on the same CPU.  That's the big thing I use taskset for,
but it works quite well for it.  =8^)

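The cron script idea could look something like the following. Treat it as
a sketch under assumptions: I don't know what the BOINC worker processes
are called on your system, so the process name is taken from the command
line, and the helper names are invented for illustration. It matches on
/proc/PID/comm (which the kernel truncates to 15 characters) and spreads
the matching processes round-robin over the CPUs the script itself may
use.

```python
import os
import sys


def find_pids(name, proc_root="/proc"):
    """Return sorted PIDs whose /proc/PID/comm equals name."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            pass  # the process exited while we were looking
    return sorted(pids)


def assign_round_robin(pids, cpus):
    """Map each PID to one CPU, cycling through cpus in order."""
    return {pid: cpus[i % len(cpus)] for i, pid in enumerate(pids)}


def repin(name):
    """Pin every process called name to its own CPU, as far as CPUs allow."""
    cpus = sorted(os.sched_getaffinity(0))
    for pid, cpu in assign_round_robin(find_pids(name), cpus).items():
        try:
            os.sched_setaffinity(pid, {cpu})
        except (ProcessLookupError, PermissionError):
            pass  # gone already, or not ours to touch


if __name__ == "__main__" and len(sys.argv) > 1:
    repin(sys.argv[1])  # process name supplied by the caller
```

Run from cron every minute or so with the worker's process name as the
only argument. Because the PIDs are sorted, repeated runs give the same
assignment as long as the set of processes hasn't changed.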
--
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

--
gentoo-amd64@g.o mailing list