Mark Knecht posted on Mon, 01 Feb 2010 06:42:24 -0800 as excerpted:

> On the new machine I've set it up as dual-head which is working nicely
> for all the basic stuff, and in general pretty nicely for running
> VMWare/WinXP on the second screen for most of the day. I have a few
> issues, like the mouse can become __very__ laggy in VMWare at times
> but other than that all the basics are there and working well enough
> to get some work done. I'm using XFCE4 at the moment.

I wouldn't do vmware as it's servantware, and thus don't know a /whole/
lot about it, but here's a bit of general wisdom on lagginess/latency
issues. Was it you that did a bunch of sound related stuff? If so, you
likely know (and have it set as appropriate) some of this already.

First, what's your kernel tick rate set to: 100, 250, 300, or 1000 ticks
per second? A higher tick rate helps with latency, but it costs
throughput. Also note that with SMP (multiple CPUs/cores), each CPU ticks
at that rate, so if you're running SMP you can often turn the tick rate
down a notch or two from what you'd otherwise have to run.

Second, what's your kernel preemption choice? No-preemption/server,
voluntary-preemption/desktop, or full-preemption/low-latency-desktop?
Again, there's a trade-off between latency and throughput. If you're
worried about mouse lagginess, server isn't appropriate, but you can
choose from the other two.

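Both of those settings can be read straight out of the kernel config.
A minimal sketch, assuming the config sits at /usr/src/linux/.config
(the usual Gentoo spot, but an assumption here); the function name
kernel_latency_opts is made up for illustration:

```shell
# Print the tick-rate and preemption lines from a kernel config file.
# The default path is an assumption; pass another file as $1 if needed.
kernel_latency_opts() {
    config=${1:-/usr/src/linux/.config}
    # CONFIG_HZ=<n> is the tick rate; exactly one of CONFIG_PREEMPT_NONE,
    # CONFIG_PREEMPT_VOLUNTARY or CONFIG_PREEMPT is set to "y".
    grep -E '^CONFIG_(HZ|PREEMPT(_NONE|_VOLUNTARY)?)=' "$config"
}
```

If the running kernel exposes /proc/config.gz, decompressing that to a
file and passing it in shows what the booted kernel was built with.
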
Third, there are additional low-latency kernel patches available... I'll
leave that alone as I run a vanilla kernel.

Fourth, there's I/O scheduling. Because of the way I/O works, the kernel
often stops doing much of whatever else it was doing while it's handling
I/O. What I/O scheduler are you running, and have you noticed the disk
activity LEDs blinking furiously (or conversely, no disk activity at all)
during your latency? How's your memory situation? How far into swap do
you typically run? Do you run /tmp and/or /var/tmp on tmpfs?
Particularly when you're emerging stuff in the background, having
PORTAGE_TMPDIR pointed at a tmpfs can make a pretty big difference, both
in emerge speed and in system responsiveness, because there's much less
I/O that way. That's assuming, of course, that you have at least a couple
gigs of memory and aren't already starved for memory with your typical
application load.

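To see which I/O scheduler a disk is using, the kernel lists the
available ones in /sys/block/<dev>/queue/scheduler and puts square
brackets around the active one. A small sketch that picks out the
bracketed name; the device name sda in the usage line is only an example
and active_scheduler is an illustrative name:

```shell
# Read a scheduler list on stdin and print the active (bracketed) entry.
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage (device name is an assumption):
#   active_scheduler < /sys/block/sda/queue/scheduler
```

For the memory and swap questions, "free -m" gives a quick overview.
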
Fifth, priority. Have you tried either higher priority for the vmware
stuff or lower priority for other things, portage, anything else that may
be hogging CPU? (For portage, I like to set PORTAGE_NICENESS=19, which
automatically sets scheduler batch mode for it as well. The priority is
as low as it can go, so it interferes with other things as little as
possible, while batch mode means it gets longer timeslices, too, thus
making it more efficient with the CPU time it does get.)

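For things other than portage, the same idea can be tried by hand with
nice. A minimal sketch; low_prio is a made-up helper name, and 19 is
simply the lowest priority nice offers:

```shell
# Run a command at the lowest CPU priority (niceness 19).
low_prio() {
    nice -n 19 "$@"
}

# Usage: low_prio make -j4
# An already-running process can be lowered with: renice -n 19 -p <pid>
```

Going the other way (a negative niceness for the VM) needs root.
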
The above, save for priority, is mostly kernel related, so should have an
effect regardless of whether your vmware vm is mostly a kernel or a
userland implementation. The below is mostly for userland so won't work
as well if vmware is mostly kernel. I don't know which it is.

Sixth, are you using user-group or control-group (aka cgroup) kernel
scheduling, or not, and how do you have it configured? The kernel options
are under General setup. Cgroup scheduling gets rather complicated, but
user-group scheduling is reasonably easy to configure, and it can make a
**BIG** difference on a highly loaded system. Thus, I'd suggest
user-group scheduling.

To enable user-group scheduling, enable Group CPU scheduler, and
(normally) Group scheduling for SCHED_OTHER, which is everything OTHER
than real-time threads. I leave group scheduling for SCHED_RR/FIFO off:
unless you know what you are doing and have a specific reason to mess
with real-time scheduling, it's best NOT to mess with it, because it's a
VERY easy way to seriously screw up your system!

Again, you probably do NOT want to mess with control group support, unless
you have specific needs beyond what user-group scheduling will do for you,
because that gets quite complicated. Therefore, leave that option off,
and under Basis for grouping tasks, make sure it says "(user id)".
That'll be the only option unless you have control group support enabled.

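As a cross-check after configuring, the menu entries above should map
to the config symbols grepped for below. This is a sketch under the
assumption that your kernel version still uses these symbol names (they
are what kernels of this era call them, as far as I know); the function
name and default path are illustrative:

```shell
# Show the group-scheduling symbols from a kernel config file,
# including the "# ... is not set" lines.
group_sched_opts() {
    config=${1:-/usr/src/linux/.config}
    grep -E 'CONFIG_(GROUP_SCHED|FAIR_GROUP_SCHED|RT_GROUP_SCHED|USER_SCHED|CGROUP_SCHED)[= ]' "$config"
}
```

With the choices described above one would expect GROUP_SCHED,
FAIR_GROUP_SCHED and USER_SCHED at "y", and RT_GROUP_SCHED and
CGROUP_SCHED not set.
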
Now, how do you use it? Simple. For each user currently running at least
one application, there's a /sys directory named for the numeric user id
(not the name; you need to know the number), /sys/kernel/uids/<uid>. In
this directory, there's a file, cpu_share.

The content of this file is the relative CPU share the user will get,
compared to other users, when the system is under load and thus has to
ration CPU time. The default share for all users save for root is 1024.
Root's default share is double that, 2048.

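Reading and changing the shares is just cat and echo on those files. A
rough sketch, assuming a kernel that provides /sys/kernel/uids as
described; UIDS_DIR, show_shares and set_share are names invented here,
and writing to the real files needs root:

```shell
# Where the per-user directories live; overridable for experimenting.
UIDS_DIR=${UIDS_DIR:-/sys/kernel/uids}

# Print "uid share" for every user that currently has a directory.
show_shares() {
    for f in "$UIDS_DIR"/*/cpu_share; do
        [ -f "$f" ] || continue
        uid=${f%/cpu_share}
        echo "${uid##*/} $(cat "$f")"
    done
}

# Set the share for one user, given a login name or a numeric uid.
set_share() {
    case $1 in
        *[!0-9]*) uid=$(id -u "$1") || return 1 ;;  # name -> number
        *)        uid=$1 ;;
    esac
    echo "$2" > "$UIDS_DIR/$uid/cpu_share"
}
```

Usage would be along the lines of "set_share 1000 2048", where 1000 is
only an example uid.
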
So here's how it works. With user-group scheduling enabled, instead of
priority alone determining scheduling, now priority and user determine
scheduling. Once the system is under load so it matters, no user can take
more than their share, regardless of what priority their apps are running
at. If you want a particular user to get more time, double its share. If
you want to restrict a user, halve its share. Just keep in mind that root
has a 2048 share by default, so it's wise to be a bit cautious about
raising too many users up to that or beyond unless you boost root as
well. Various system housekeeping threads, kernel threads, etc, use time
from the root share, so if other users sit above it, the housekeeping
threads, disk syncs, etc, might not get the time they need to run.
However, increasing just one single user to say 4096 shouldn't starve root
too badly even if that user gets a runaway app, as root will still be
getting half that time, as long as everything else remains at 1024 or
below. But obviously, you won't want to put say the portage user at 4096!

I routinely bump my normal user to 2048 along with root, when I'm running
emerges, etc. This is with FEATURES="userfetch userpriv usersync" among
others, so portage is spending most of its time as the portage user, thus
with its default 1024 share. Boosting my normal user to 2048 thus ensures
that it (along with root) gets twice the time that the portage user does.
But even should one of my normal user apps go into runaway, root still
gets just under 40% of the CPU if it needs it: root and my normal user
would each be getting double the portage user, so they'd get nearly 40%
each while portage would get nearly 20%, with the other non-runaway users
perhaps taking a percent or two (thus the "nearly"). That should be
plenty to login as root and kill something, if I have to, or to shut down
the system in an orderly way, or do whatever else I'd need to do.

Even if I were to run my normal user at 4096 and it had a runaway, it
would get four shares, portage would get one, and root would get two, so
even then, root would get nearly 2/7 or about 28% share, with the runaway
user getting double that or about 56% and portage getting about 14%. Even
28% share for root should be enough, so that's reasonably safe. However,
I'd be extremely cautious about going over 4096, or increasing a second
user's share to that too, unless I increased root's share as well.

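The arithmetic in these scenarios is just each share divided by the sum
of the shares of the users competing for CPU. A throwaway sketch for
trying other combinations; share_pct is an invented name, and it ignores
niceness and idle users entirely:

```shell
# Print each argument as a percentage of the sum of all arguments.
share_pct() {
    printf '%s\n' "$@" | awk '
        { share[NR] = $1; total += $1 }
        END {
            if (total == 0) exit 1
            for (i = 1; i <= NR; i++)
                printf "%s%.1f", (i > 1 ? " " : ""), 100 * share[i] / total
            print ""
        }'
}

# share_pct 2048 2048 1024   ->  40.0 40.0 20.0   (root user portage)
# share_pct 4096 2048 1024   ->  57.1 28.6 14.3   (user root portage)
```

The rounder figures in the text leave room for the percent or two that
other, non-runaway users take.
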
That's actually simplifying it some, tho, as the above assumes all the CPU
hogs are running at normal 0 priority/niceness. But as I mentioned, I
have PORTAGE_NICENESS=19, so it's running at idle priority, which would
lower its claim to the portage user share dramatically. Basically, at
idle priority, it'd get very little share if there was another runaway
(normal priority) process, as ANY user. (The scheduler /does/ normally
give /every/ process at least one timeslice per scheduling period, even at
idle priority, to prevent priority inversion situations in case of lock
contention and the like.) So the above percentage scenarios would be more
like 48/48/1/3 (root/user/portage/other) in the 2048/2048/1024s case, and
32/65/1/2 in the 2048/4096/1024s case. Basically, the portage user, even
tho it's using all the CPU it can get, would still fall into the noise
range along with other users, because it's running at idle priority, and
root would thus get close to half or close to a third of the CPU, with the
normal user at equal share or double share of root, respectively.

I've had very good results using that setup. Just for curiosity's sake, I
tried running ridiculous numbers of make jobs, to see how the system
handled it. With this setup (PORTAGE_NICENESS=19, portage user at 1024
share, root at 2048, and normal user at either 1024 or 2048), I can
increase make jobs without limit and still keep a reasonably usable
system, as long as the memory stays under control. This is MUCH more so
than simply running PORTAGE_NICENESS=19 but without per-user scheduling
enabled. In practice, therefore, the limit on make jobs is no longer CPU
scheduling, but the amount of memory each job uses. I set my number of
make jobs so that I don't go into swap much, if at all, even with
PORTAGE_TMPDIR pointed at tmpfs, because swapping is I/O, and I/O, due to
the way the hardware works, increases latency, sometimes unacceptably.

Actually, my biggest thread test has been compiling the kernel, since it's
so easily parallelizable, to a load average of several hundred if you let
it, without using gigs and gigs of memory (yes it takes some, but nowhere
near what a typical compile would take at that number of parallel jobs) to
do it. I do my kernel compiles as yet another user (what I call my
"admin" user), so like portage, it gets the default 1024 share. But I
don't set niceness so it's running at normal 0 priority and taking its
full share against other 0 priority users. Compiling the kernel, I can
easily run over a hundred parallel make jobs without seriously stressing
the system. But even there, with user scheduling enabled and at normal
priority so the kernel compile is taking all the share it can, the memory
requirements are the bottleneck, not the actual jobs or load average,
because the kernel per-user scheduling is doing its job, giving my other
non-hog users and root the share they need to continue running normally.

So assuming vmware is running in userspace and thus affected by priority
and user-groups, I'd definitely recommend setting up user-groups and
fiddling with its share, as well as that of the rest of the system, along
with the other steps above.

The one caveat with user-group scheduling is that the
/sys/kernel/uids/<uid> directories are created and destroyed dynamically,
as apps run and terminate as those users. It's thus not (easily) possible
to set a static policy, whereby a particular UID /always/ gets a specific
non-default share. There was a writeup I read back when the feature was
first introduced, that was supposed to explain how to set up an automatic
handler such that every time a particular UID appeared, it'd write a
particular value to its cpu_share file, but as best I could tell, the
writeup was already out of date, as the scripts that were supposed to be
called automatically according to the writeup, were never called. The
kernel hotplugging (and/or udev, or whatever was handling it) had changed
out from under the documentation, even as the feature was going thru the
process of getting the peer approval necessary to be added to the mainline
kernel. So I've never had an automatic policy set up to do it. It could
certainly be done using a file-watch (inotify/dnotify, etc, or polling of
some sort), without relying on the hotplugging mechanism that was supposed
to work that I could never get to work, but I've not bothered. I've
simply created scripts to echo the desired numbers into the desired files
when I invoke them, and run them manually when I need to. That has worked
well enough for my needs.

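A script of the "echo the numbers into the files" kind could look
something like the sketch below. It is not the author's script, only an
illustration: apply_share_policy is an invented name, and the uids and
shares in the usage line are placeholders. It skips any uid whose
directory doesn't currently exist, which is what makes it safe to rerun
or to call from a crude polling loop:

```shell
# Apply "uid=share" pairs to whichever per-user directories exist now.
# $1 is the base directory, the remaining arguments are the pairs.
apply_share_policy() {
    base=$1
    shift
    for pair in "$@"; do
        uid=${pair%%=*}
        share=${pair#*=}
        file="$base/$uid/cpu_share"
        # The file is only there while that user has something running.
        [ -w "$file" ] && echo "$share" > "$file"
    done
    return 0
}

# One-shot:  apply_share_policy /sys/kernel/uids 1000=2048
# Polling:   while sleep 5; do
#                apply_share_policy /sys/kernel/uids 1000=2048 250=512
#            done
```

Polling is wasteful compared with a proper event hook, but it doesn't
depend on the hotplug machinery behaving as documented.
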
(Now you see why I'm not going into cgroups? User-group scheduling is
actually quite simple. Imagine the length of the post if I were trying to
explain cgroups!)

> The one place where I've been a bit disappointed is when the VGA
> drivers need to switch resolutions to play a game like Tux Racer then
> instead of two desktops I'm seeing one desktop duplicated on both
> monitors. Is this normal or is there some general way to control this?
> I'd really like the game on one monitor and just have the other stay
> black.

The problem here is that most resolution switchers simply assume a single
monitor. Before the X RandR extension, there was really no standard way
to reliably handle multiple monitor setups (xinerama, merged-framebuffer,
and proprietary or semi-proprietary methods like that used by the nvidia
and fglrx drivers, were all in use at various times by various hardware/
drivers, and for all I know there were others as well), so assuming a
single monitor was pretty much the best they could do.

RandR has solved the standardization problem, but few games have upgraded
to it, in part because it's apparently "rocket science" to properly
program it. The xrandr CLI client works, but all too often, the X
environment tools are simply broken. KDE for example has had a tool
that's supposed to handle multiple monitors, changing resolutions, etc,
for some time, but on all three sets of hardware and drivers I've tried it
on, both the kde3 version and the kde4 version thru 4.3.4, it has screwed
things up royally if you're using more than one monitor. Only xorg's own
xrandr gets it right, and that's a CLI client, with a slew of options to
read about and try to master, before you can properly run it. I've
scripted a solution here using it, hard-coding some of the options I don't
change into the script (could be a config file) thus making my script
simple enough to run from the command line (or invoke from a menu entry)
without having to remember all the complicated syntax, but that's not
going to work for the CLI-phobic. And if the X environments can't get it
working correctly for many users even with the documentation and the
xrandr code to follow, what are the games folks supposed to do? So they
simply continue to assume only a single monitor... and screw things up for
those of us with more than one, at least if we prefer to run them in other
than clone mode.

Because that's what's happening. When the games, etc, trigger the old
single-monitor resolution change API, it causes xorg to switch to clone
mode, running all monitors at the same resolution, showing the same thing.

FWIW, the solution I've found, as I mentioned, is a script set up to invoke
my preferred resolutions, in my preferred non-clone modes, retaining my
preferred "stacked" monitor orientation, by invoking xrandr with the
appropriate parameters to do so.

Thus I use my script (which uses xrandr) to set the resolution I want, and
set the game not to change resolution -- to run in a window or whatever,
instead. I run kde, and with kwin's per-app and per-window config
options, I set it up to always put the windows for specific games at
specific locations, sometimes without window borders etc. Between that
and triggering the resolution settings I want with my xrandr script, I can
get the game running in a window, but that window set to exactly the size
and at exactly the location of the monitor I want it to run on, while the
other monitor continues at its configured size and showing the desktop or
apps it normally shows.

"Works for me!" =:^)

> More disturbing is when I exit the game I'm left with both desktops
> displaying the same things and neither is exactly my original first or
> second desktop but rather a combination of the two which is fairly
> strange. (Desktop #1 icons with Desktop #2 wallpaper)

Both desktops (monitors, I assume, that's quite different from virtual
desktops, which is how I'd normally use the term "desktop") displaying the
same thing is simply clone mode. You can try using your X environment
resolution tool (if xfce has such a thing, kde does and I think gnome
does) to switch back to what you want, but as I said, don't be surprised
if it doesn't work as expected, because they've really had problems
getting the things working right. xrandr gets it right, and you'd /think/
they could, if /nothing/ else, read its code and use similar tricks (it
/is/ open source, after all), but kde certainly hasn't gotten it right, at
least not for many drivers and hardware, and from what I've read, gnome's
version isn't a lot better.

But, if you're up for a bit of reading, you can figure out how xrandr
works well enough to get it to do what you want. Here's an example,
actually the debug output of the script I run, showing the xrandr command
as it's set up and invoked by the script (all one command line):

xrandr --verbose --fb 1920x2400 \
  --output DVI-0 --mode 1920x1200 \
    --panning 1920x1200+0+0/1920x1200+0+0/20/20/20/20 \
  --output DVI-1 --mode 1920x1200 \
    --panning 1920x1200+0+1200/1920x1200+0+1200/20/20/20/20

That results in:

1) an overall framebuffer resolution of 1920x2400

2) output DVI-0 being set to resolution 1920x1200, with its top-left
corner at position 0,0.

3) output DVI-1 being set to a similar resolution (I have two of the same
model of monitor, 1920x1200 native resolution), but with its top-left
corner at position 0,1200, thus, directly under DVI-0.

The panning mode stuff (except for the positioning bit) wouldn't be
necessary here as there's no panning to do, but those are the script
defaults. For use of panning mode, see this one:

xrandr --verbose --fb 1920x2400 \
  --output DVI-0 --mode 1280x800 \
    --panning 1920x1200+0+0/1920x1200+0+0/20/20/20/20 \
  --output DVI-1 --mode 1280x800 \
    --panning 1920x1200+0+1200/1920x1200+0+1200/20/20/20/20

This keeps the same overall framebuffer size and output orientation
(stacked), but the outputs are both run at 1280x800, with the panning
domain set for each one such that as the mouse gets to 20 px from any edge
of the 1280x800 viewport, it moves the viewport within the corresponding
1920x1200 panning domain.

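Since the two commands so far differ only in the --mode value, the
hard-coding idea is easy to sketch. This is not the author's script,
just an illustration: stacked_xrandr is an invented name, the output
names DVI-0 and DVI-1 are taken from the examples above and will differ
on other hardware, and the function only prints the command rather than
running it:

```shell
# Build an xrandr command line for two identical, vertically stacked
# outputs. Arguments: native width, native height, mode, border (px).
stacked_xrandr() {
    native_w=$1 native_h=$2 mode=$3 border=${4:-20}
    fb_h=$((native_h * 2))                        # two monitors high
    top="${native_w}x${native_h}+0+0"             # upper panning area
    bottom="${native_w}x${native_h}+0+${native_h}"  # directly below it
    b="$border/$border/$border/$border"
    echo "xrandr --fb ${native_w}x${fb_h}" \
         "--output DVI-0 --mode $mode --panning $top/$top/$b" \
         "--output DVI-1 --mode $mode --panning $bottom/$bottom/$b"
}

# stacked_xrandr 1920 1200 1280x800        # print the command
# stacked_xrandr 1920 1200 1280x800 | sh   # print and run it
```

Printing first makes it easy to eyeball the geometry before letting it
loose on a live X session.
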
Here's one with different resolutions, and with panning when the mouse
reaches the edge itself (instead of 20 px in) on the lower resolution and
position one (DVI-1), so I can run a game there, without it trying to pan
as I get near the edge. I then put the viewport over the game and let the
game grab the mouse, so I can then play the game without having to worry
about panning. If I need to, I can have it "ungrab" the mouse, and have
panning again on the lower one, or move to the "fixed" upper one and do
stuff there.

xrandr --verbose --fb 1920x2400 \
  --output DVI-0 --mode 1920x1200 \
    --panning 1920x1200+0+0/1920x1200+0+0/20/20/20/20 \
  --output DVI-1 --mode 960x600 \
    --panning 1920x1200+0+1200/1920x1200+0+1200/

When I'm finished with the game, or if I want to run normal resolution and
do something else for a bit, I just run that first command again, and it
returns me to normal mode as set by that first command.

Unfortunately, kde4 still has a few bugs with multiple monitors,
especially when switching resolutions. As mentioned, the kde4 resolution
switcher itself is entirely screwed up as all it can handle is clone mode
(there's no way to set separate non-identical top-left corners for each
monitor), but there are bugs with the plasma desktop as well. If I do
happen to select clone mode, or disable one of the monitors using xrandr,
upon return to normal mode, plasma-desktop is screwed up. I can fix it
without restarting X/kde, but it's a hassle to do so, and somewhat trial
and error, zooming in and out the various plasma "activities", until I get
it set up correctly once again. Hopefully, 4.4 has improved that as well.
I read it has. We'll see...

> I'm wondering if other environments handle this better. XFCE is
> pretty lightweight, which I like. I'd gone away from Gnome because of
> the time spent maintaining it on Gentoo but on this machine it probably
> wouldn't be all the bad. Not sure I want KDE but I'm curious as to
> whether anything solves this problem?

Well... kde 3 worked reasonably well in this regard, in that the desktop
at least stayed put. Its resolution switcher wasn't much good either,
tho: I used X's ctrl-alt-numplus/numminus zooming while it worked, then
developed the xrandr scripts I still use today when X switched to
RandR-based switching and the numplus/numminus zooming didn't work any
more. But as you can tell, I'm rather frustrated with kde4.

But definitely try xrandr. It's a pain to learn as it's all CLI options,
not point and click, but it's remarkably good at doing what it does, once
you know how to run it, and possibly hack up a script or several to take
the complexity out of it.

> Logging out of XCFE and then running startx gets everything back
> the way I want, and I don't think I'll play Linux games much, but I'm
> curious as to how well other environments handle this.

As explained, the base problem is that games assume single monitor, which
X construes as a command to go into clone mode. The solution is to use an
external app (such as the xrandr invoking scripts I use) to set the
resolutions you want, and don't invoke the games' options to change
resolution or whatever, just have them run in a window. Then match the
window size to your desired resolution (enforcing it using your window
manager, if that's more convenient or necessary), and invoke the script
(or other resolution switcher app external to the game) changing the
resolution right before you run the game.

Alternatively, since we're talking about a script already, you could set
it up so the script runs xrandr to change the resolution as desired, then
runs the game, then when the game is done, changes the resolution back.

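That wrapper could be as small as the sketch below. The function name
run_with_mode is invented, and the two mode commands are passed in as
strings so it makes no assumption about your outputs or resolutions:

```shell
# Switch to a game mode, run the game, then switch back.
# $1: command that sets the game resolution
# $2: command that restores the normal resolution
# remaining arguments: the game and its options
run_with_mode() {
    game_mode_cmd=$1
    normal_mode_cmd=$2
    shift 2
    sh -c "$game_mode_cmd" || return 1   # don't start if the switch failed
    "$@"
    status=$?
    sh -c "$normal_mode_cmd"             # restore even if the game failed
    return "$status"
}

# Usage (commands are placeholders):
#   run_with_mode 'xrandr ...game settings...' \
#                 'xrandr ...normal settings...' some-game --windowed
```

It won't restore the mode if the wrapper itself is killed, so keeping
the "normal" command handy as its own script is still worthwhile.
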
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman