Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: Identifying CPUs in the kernel
Date: Fri, 22 Jun 2007 16:50:12
Message-Id: pan.2007.06.22.15.02.58@cox.net
In Reply to: [gentoo-amd64] Identifying CPUs in the kernel by Peter Humphrey
Peter Humphrey <prh@××××××××××.uk> posted
200706221030.26924.prh@××××××××××.uk, excerpted below, on Fri, 22 Jun
2007 10:30:26 +0100:

> This is what happens: when BOINC starts up it starts two processes,
> which it thinks are going to occupy up to 100% of each processor's time.
> But both gkrellm and top show both processes running at 50% on CPU1,
> always that one, with CPU0 idling. Then, if I start an emerge or
> something, that divides its time more-or-less equally between the two
> processors with the BOINC processes still confined to CPU1.
>
> Even more confusingly, sometimes top even disagrees with itself about
> the processor loadings, the heading lines showing one CPU loaded and the
> task lines showing the other.
>
> Just occasionally, BOINC will start its processes properly, each using
> 100% of a CPU, but after a while it reverts spontaneously to its usual
> behaviour. I can't find anything in any log to coincide with the
> reversion.

Was it you who posted about this before, or someone else? If it wasn't
you, take a look back thru the list a couple months, as it did come up
previously. You may have someone to compare notes with. =8^)

Separate processes or separate threads? Two CPUs (um, two separate
sockets) or two cores on the same CPU/socket?
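
(FWIW, a quick way to check both: in /proc/cpuinfo, distinct "physical
id" values mean separate sockets, while distinct "core id" values within
one physical id mean cores on the same socket, and ps -eLf prints one
line per thread, thread ID in the LWP column. The grep pattern below is
just a guess at the BOINC process names, of course.)

  grep -E 'physical id|core id' /proc/cpuinfo
  ps -eLf | grep -i boinc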

Some or all of the following you likely already know, but hey, maybe
it'll help someone else and it never hurts to throw it in anyway...

The kernel task scheduler uses CPU affinity, which is supposed to provide
a variable resistance to switching CPUs, and a preference for keeping a
task on the CPU controlling its memory in a NUMA architecture, where
there's local and remote memory and a penalty to be paid for access to
remote memory.

There are, however, differences in architecture between AMD (with its
onboard memory controller and closer cooperation, both between cores and
between CPUs on separate sockets, due to the direct HyperTransport links)
and Intel (with its off-chip controller and looser inter-core, inter-
chip, and inter-socket cooperation). There are also differences in the
way you can configure both the memory (thru the BIOS) and the kernel, for
separate NUMA access or a unified memory view. If these settings don't
match your actual physical layout, efficiency will be less than peak:
either there won't be enough resistance to what is actually a costly
switch between CPUs/cores and memory, so tasks will bounce between them
frequently with little reason, incurring an expensive delay each time, or
there'll be too much resistance, and too much favor placed on what the
kernel thinks is local vs. remote memory, when it's all the same and
there's in fact very little cost to switching cores/CPUs.

Generally, if you have a single-socket true dual core (an Intel Core Duo
or any AMD dual core), you'll be running a single memory controller with
a single unified view of memory, and the cost of switching cores will be
relatively low. You'll want to disable NUMA and configure your kernel
with a single scheduling domain.

If you have multiple sockets, or one of the early Intel pseudo-dual-
cores (which were really two separate CPUs simply packaged together, with
no special cooperation between them), you'll probably want them in
separate scheduling domains. If it's AMD with its onboard memory
controllers, two sockets means two controllers, and you'll also want to
consider NUMA, tho you can disable it and interleave your memory if you
wish, for a unified memory view and higher bandwidth, at the tradeoff of
higher latency and less efficient memory access when separate tasks (each
running on its own CPU) both want to use memory at the same time.
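
If you do go the NUMA route, sys-process/numactl (which I didn't cover
above) can display the node layout and set a per-process memory policy.
Something like the below should work, tho I'm sketching from memory, so
check numactl(8); "app" is just a stand-in for whatever you're running:

  numactl --hardware                       # list nodes and their memory
  numactl --cpunodebind=0 --membind=0 app  # keep app and its memory on node 0
  numactl --interleave=all app             # interleave memory for bandwidth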

If you are lucky enough to have four cores, it gets more complex, as
current four-core chips operate as two loosely cooperating pairs, with
closer cooperation between cores of the same pair. For highest
efficiency there, you'll want two levels of scheduling domain, mirroring
the tight cooperation between local pair-partners and the rather looser
cooperation between pairs.

In particular, you'll want to pay attention to the following kernel
config settings under Processor type and features (an illustrative
.config fragment follows the list):

1) Symmetric multi-processing support (CONFIG_SMP). You probably have
this set right or you'd not be using multiple CPUs/cores.

2) Under SMP, /possibly/ SMT (CONFIG_SCHED_SMT), tho that's for Intel
only, and only on the older HyperThreading NetBurst arch models.

3) Still under SMP, Multi-core scheduler support (CONFIG_SCHED_MC), if
you have true dual cores. Again, note that the first "dual core" Intel
units were simply two separate CPUs in the same package, so you probably
do NOT want this for them.

4) Non Uniform Memory Access (NUMA) Support (CONFIG_NUMA). You probably
do NOT want this on single-socket multi-cores, or on most Intel
systems. You probably DO want this on AMD multi-socket Opteron systems,
BUT note that there may be BIOS settings for this as well. It won't work
as efficiently if the BIOS setting doesn't agree with the kernel setting.

5) If you have NUMA support enabled, you'll also want either Old style
AMD Opteron NUMA detection (CONFIG_K8_NUMA) or (preferred) ACPI NUMA
detection (CONFIG_X86_64_ACPI_NUMA).

6) Make sure you do *NOT* have NUMA emulation (CONFIG_NUMA_EMU) enabled.
As the help for that option says, it's only useful for debugging.
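
As promised above, here's roughly what the relevant .config fragment
might look like for, say, a dual-socket dual-core Opteron (illustrative
only, so match it to your own hardware and BIOS):

  CONFIG_SMP=y
  # CONFIG_SCHED_SMT is not set
  CONFIG_SCHED_MC=y
  CONFIG_NUMA=y
  # CONFIG_K8_NUMA is not set
  CONFIG_X86_64_ACPI_NUMA=y
  # CONFIG_NUMA_EMU is not set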

What I'm wondering, of course, is whether you have NUMA turned on when
you shouldn't, or don't have multi-core scheduling turned on when you
should, thus artificially increasing the resistance to switching
cores/CPUs and causing the stickiness.


Now for the process vs. thread stuff. With NUMA turned on, especially if
multi-core scheduling is turned off, threads of the same app, accessing
the same memory, will be more likely to be scheduled on the same
processor. I don't know of anything that allows specifying the processor
per thread, at least with the newer NPTL (Native POSIX Thread Library)
threading. With the older LinuxThreads model, each thread showed up as a
separate process with its own PID, and could therefore be targeted
separately by the various scheduling tools.

If, however, you were correct when you said BOINC starts two separate
/processes/ (not threads), or if BOINC happens to use the older/heavier
LinuxThreads model (which, again, makes the threads show up as separate
processes), THEN you are in luck! =8^)

There are two scheduling utility packages that include utilities to tie
processes to one or more specific processors.

sys-process/schedutils is what I have installed. It's a collection of
separate utilities, including taskset, with which I can tell the kernel
which CPUs I want specific processes to run on. This worked well for me,
since I was more interested in taskset than in the other included
utilities, and only had to learn that single simple command. It does
what I need it to do, and does it well. =8^)
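
For example (the app name and PID here are made up, naturally):

  taskset -c 0 some_boinc_app   # launch a process pinned to CPU0
  taskset -p -c 1 12345         # pin existing PID 12345 to CPU1
  taskset -p 12345              # query PID 12345's current affinity mask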

If you prefer a single do-it-all scheduler tool, perhaps easier to learn
if you plan to fiddle with more than simply which CPU a process runs on,
and want to learn it all at once, sys-process/schedtool may be more your
style.
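
The rough schedtool equivalents, which take a bitmask rather than a CPU
list if I remember the flags right (so check schedtool's help first;
again, the app name and PID are made up):

  schedtool -a 0x1 -e some_boinc_app   # start on CPU0 (mask 0x1)
  schedtool -a 0x2 12345               # move PID 12345 to CPU1 (mask 0x2)
  schedtool 12345                      # show PID 12345's policy and affinity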

Hope that's of some help, even if part or all of it is review.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

--
gentoo-amd64@g.o mailing list

Replies

Subject Author
Re: [gentoo-amd64] Re: Identifying CPUs in the kernel Peter Humphrey <prh@××××××××××.uk>
Re: [gentoo-amd64] Re: Identifying CPUs in the kernel Joshua Hoblitt <jhoblitt@××××××××××.edu>