Gentoo Archives: gentoo-amd64

From: Duncan <1i5t5.duncan@×××.net>
To: gentoo-amd64@l.g.o
Subject: [gentoo-amd64] Re: How well does your dual-head window manager handle games?
Date: Tue, 02 Feb 2010 12:05:15
Message-Id: pan.2010.02.02.11.36.44@cox.net
In Reply to: [gentoo-amd64] How well does your dual-head window manager handle games? by Mark Knecht
1 Mark Knecht posted on Mon, 01 Feb 2010 06:42:24 -0800 as excerpted:
2
3 > On the new machine I've set it up as dual-head which is working nicely
4 > for all the basic stuff, and in general pretty nicely for running
5 > VMWare/WinXP on the second screen for most of the day. I have a few
6 > issues, like the mouse can become __very__ laggy in VMWare at times
7 > but other than that all the basics are there and working well enough
8 > to get some work done. I'm using XFCE4 at the moment.
9
10 I wouldn't do vmware as it's servantware, and thus don't know a /whole/
11 lot about it, but here's a bit of general wisdom on lagginess/latency
12 issues. Was it you that did a bunch of sound related stuff? If so, you
13 likely know (and have it set as appropriate) some of this already.
14
15 First, what's your kernel tick time set for, 100, 250, 300, 1000 ticks per
16 second? Obviously higher ticks will help with latency, but it negatively
17 affects thruput. Also note that with SMP (multiple CPUs/cores), each one
18 ticks at that rate, so you can often turn down the ticks a notch or two from
19 what you'd normally have to run, if you're running SMP.
20
21 Second, what's your kernel preemption choice? No-preemption/server,
22 voluntary-preemption/desktop, or full-preemption/low-latency-desktop?
23 Again, there's a trade-off between latency and thruput. If you're worried
24 about mouse lagginess, server isn't appropriate, but you can choose from
25 the other two.
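
A quick way to answer the first two questions is to grep the kernel config. This is only a sketch: the default path below assumes the usual Gentoo /usr/src/linux symlink, and the function name is made up for illustration.

```shell
#!/bin/sh
# Print the tick rate and the preemption model from a kernel config file.
# The default path is an assumption (the usual Gentoo source symlink);
# pass another file as the first argument if yours lives elsewhere.
kernel_latency_settings() {
    config=${1:-/usr/src/linux/.config}
    # CONFIG_HZ=<n> is the ticks per second; whichever of the three
    # preemption models is selected shows up as =y.
    grep -E '^CONFIG_(HZ|PREEMPT_NONE|PREEMPT_VOLUNTARY|PREEMPT)=' "$config"
}
```

If the running kernel was built with /proc/config.gz support, decompressing that to a scratch file and passing it in shows what is actually booted rather than what is in the source tree.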
26
27 Third, there's additional low-latency kernel patches available... I'll
28 leave that alone as I run vanilla kernel.
29
30 Fourth, there's I/O scheduling. Due to the way I/O works, often, the
31 kernel stops doing much of whatever else it was doing when it's handling
32 I/O. What I/O scheduler are you running, and have you noted the disk
33 activity LEDs blinking furiously (or conversely, no disk activity at all)
34 during your latency? How's your memory situation? How far into swap do
35 you typically run? Do you run /tmp and/or /var/tmp on tmpfs?
36 Particularly when you're emerging stuff in the background, having
37 PORTAGE_TMPDIR pointed at a tmpfs can make a pretty big difference, both
38 in emerge speed, and in system responsiveness, because there's much less
39 I/O that way. That's assuming, of course, that you have at least a couple
40 gigs of memory and aren't already starved for memory with your typical
41 application load.
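
To see which I/O scheduler a disk is using, the kernel lists the available ones in /sys/block/<dev>/queue/scheduler with the active one in square brackets. A small sketch for pulling that out (the function names and the "sda" default are just examples, not anything from your setup):

```shell
#!/bin/sh
# Extract the active I/O scheduler from the text the kernel exposes in
# /sys/block/<dev>/queue/scheduler, e.g. "noop deadline [cfq]".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active scheduler for one disk; "sda" is only an example name.
show_scheduler() {
    dev=${1:-sda}
    active_scheduler "$(cat "/sys/block/$dev/queue/scheduler")"
}
```

Running "show_scheduler sda" during a laggy spell, alongside a look at "free -m" for swap usage, covers most of the questions above.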
42
43 Fifth, priority. Have you tried either higher priority for the vmware
44 stuff or lower priority for other things, portage, anything else that may
45 be hogging CPU? (For portage, I like to set PORTAGE_NICENESS=19, which
46 automatically sets scheduler batch mode for it as well. The priority is
47 as low as possible so it doesn't interfere with other things to the extent
48 possible, while the batch mode means it gets longer timeslices, too, thus
49 making it more efficient with what it does get.)
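
For anything that isn't portage, the same idea can be had by hand with nice. A minimal sketch (the wrapper name is made up; it only sets niceness, it does not touch the scheduler batch mode mentioned above):

```shell
#!/bin/sh
# Run a command at the lowest CPU priority (niceness 19), similar in
# spirit to what PORTAGE_NICENESS=19 asks portage to do for emerge.
run_low_priority() {
    nice -n 19 "$@"
}
```

For example, "run_low_priority make -j4" keeps a background build from competing with interactive apps at equal priority.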
50
51 The above, save for priority, is mostly kernel related, so should have an
52 effect regardless of whether your vmware vm is mostly kernel or userland
53 implementation. The below is mostly for userland so won't work as well if
54 vmware is mostly kernel. I don't know.
55
56 Sixth, are you using user-group or control-group (aka cgroup) kernel
57 scheduling, or not, and how do you have it configured? The kernel options
58 are under general setup. Cgroup scheduling gets rather complicated, but
59 user-group scheduling is reasonably easy to configure, and it can make a
60 **BIG** difference on a highly loaded system. Thus, I'd suggest user-
61 group scheduling.
62
63 To enable user-group scheduling, enable Group CPU scheduler, and
64 (normally) Group scheduling for SCHED_OTHER, which is everything OTHER
65 than real-time threads. I leave the scheduling for SCHED_RR/FIFO off, as
66 unless you know what you are doing and have specific reason to mess with
67 real-time scheduling, it's best NOT to mess with it, because it's a VERY
68 easy way to seriously screw your system!
69
70 Again, you probably do NOT want to mess with control group support, unless
71 you have specific needs beyond what user-group scheduling will do for you,
72 because that gets quite complicated. Therefore, leave that option off,
73 and under Basis for grouping tasks, make sure it says "(user id)".
74 That'll be the only option unless you have control group support enabled.
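
If you'd rather check an existing config than walk the menus, something like this works. It's a sketch under assumptions: the symbol names are my mapping of the menu entries above for kernels of this era, so verify them against your own .config, and the default path is the usual Gentoo symlink.

```shell
#!/bin/sh
# Check a kernel config file for the user-group scheduling setup described
# above. Symbol names are assumed to correspond to the menu entries
# "Group CPU scheduler", "Group scheduling for SCHED_OTHER" and the
# "(user id)" basis for grouping tasks.
check_user_sched() {
    config=${1:-/usr/src/linux/.config}
    status=0
    for sym in CONFIG_GROUP_SCHED CONFIG_FAIR_GROUP_SCHED CONFIG_USER_SCHED; do
        if grep -q "^$sym=y" "$config"; then
            echo "$sym: enabled"
        else
            echo "$sym: MISSING"
            status=1
        fi
    done
    return $status
}
```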
75
76 Now, how do you use it? Simple. For each user currently running at least
77 one application, there's a /sys dir with the user id number (not name,
78 number, you need to know the number), /sys/kernel/uids/<uid>. In this
79 directory, there's a file, cpu_share.
80
81 The contents of this file is the relative CPU share the user will get,
82 compared to other users, when the system is under load and thus has to
83 ration CPU time. The default share for all users save for root is 1024.
84 Root's default share is double that, 2048.
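
Setting a share is just an echo into that file, as root. A sketch of a helper that accepts a user name or number (the function name and the UIDS_DIR override are mine, added so it can be tried against a scratch directory first):

```shell
#!/bin/sh
# Base directory of the per-user scheduling files described above.
# Overridable so the function can be tried against a scratch directory.
UIDS_DIR=${UIDS_DIR:-/sys/kernel/uids}

# set_cpu_share USER SHARE
# USER may be a name or a numeric uid; the directory is keyed by number.
set_cpu_share() {
    case $1 in
        ''|*[!0-9]*) uid=$(id -u "$1") || return 1 ;;
        *)           uid=$1 ;;
    esac
    file=$UIDS_DIR/$uid/cpu_share
    if [ ! -e "$file" ]; then
        echo "uid $uid has no running processes (no $file)" >&2
        return 1
    fi
    echo "$2" > "$file"
}
```

So "set_cpu_share 1000 2048" gives uid 1000 the same share as root's default; "cat /sys/kernel/uids/1000/cpu_share" reads it back.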
85
86 So here's how it works. With user-group scheduling enabled, instead of
87 priority alone determining scheduling, now priority and user determine
88 scheduling. Once the system is under load so it matters, no user can take
89 more than their share, regardless of what priority their apps are running
90 at. If you want a particular user to get more time, double its share. If
91 you want to restrict a user, halve its share. Just keep in mind that root
92 has a 2048 share by default, so it's wise to be a bit cautious about
93 increasing too many users up to that or beyond unless you boost root as
94 well, just to be sure. Various system housekeeping threads, kernel
95 threads, etc, use time from the root share, so you want to be a bit
96 careful about increasing other users above it, or the housekeeping
97 threads, disk syncs, etc, might not have the time to run that they need.
98 However, increasing just one single user to say 4096 shouldn't starve root
99 too badly even if that user gets a runaway app, as root will still be
100 getting half that time, as long as everything else remains at 1024 or
101 below. But obviously, you won't want to put say the portage user at 4096!
102
103 I routinely bump my normal user to 2048 along with root, when I'm running
104 emerges, etc. This is with FEATURES="userfetch userpriv usersync" among
105 others, so portage is spending most of its time as the portage user, thus
106 with its default 1024 share. Boosting my normal user to 2048 thus ensures
107 that it (along with root) gets twice the time that the portage user does.
108 Even should one of my normal user apps go into runaway, root still gets
109 just under 40% of the CPU if it needs it. Root and my normal user would
110 each be getting double the portage user, so root and the normal user
111 would get nearly 40% each, while portage would get nearly 20%. (It's
112 "nearly" because the other users, not being in runaway, would still
113 take perhaps a percent or two.) Just under 40% for root should be
114 plenty to login as root and kill something, if I have to, or to shut
115 down the system in an orderly way, or do whatever else I'd need to
116 do.
117
118 Even if I were to run my normal user at 4096 and it would have a runaway,
119 it would get 4 shares, portage would get one, and root would get two, so
120 even then, root would get nearly 2/7 or about 28% share, with the runaway
121 user getting double that or about 57% and portage getting about 14%. Even
122 28% share for root should be enough, so that's reasonably safe. However,
123 I'd be extremely cautious about going over 4096, or increasing a second
124 user's share to that too, unless I increased root's share as well.
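
The arithmetic is simply each share divided by the sum of the shares of everyone who is busy. A small sketch to play with the numbers (the function name is made up; it ignores niceness and the percent or two taken by non-runaway users):

```shell
#!/bin/sh
# share_percent SHARE...
# Print each share as an integer percentage (rounded down) of the total.
share_percent() {
    total=0
    for s in "$@"; do total=$((total + s)); done
    out=
    for s in "$@"; do out="$out$((s * 100 / total)) "; done
    echo "${out% }"
}
```

"share_percent 2048 2048 1024" (root, normal user, portage) prints "40 40 20", the upper bound that the "nearly" figures above approach.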
125
126 That's actually simplifying it some, tho, as the above assumes all the CPU
127 hogs are running at normal 0 priority/niceness. But as I mentioned, I
128 have PORTAGE_NICENESS=19, so it's running at idle priority, which would
129 lower its claim to the portage user share dramatically. Basically, at
130 idle priority, it'd get very little share if there was another run away
131 (normal priority) process, as ANY user. (The scheduler /does/ normally
132 give /every/ process at least one timeslice per scheduling period, even at
133 idle priority, to prevent priority inversion situations in case of lock
134 contention and the like.) So the above percentage scenarios would be more
135 like 48/48/1/3 (root/user/portage/other) in the 2048/2048/1024s case, and
136 32/65/1/2 in the 2048/4096/1024s case. Basically, the portage user, even
137 tho it's using all the CPU it can get, would still fall into the noise
138 range along with other users, because it's running at idle priority, and
139 root would thus get close to half or close to a third of the CPU, with the
140 normal user at equal share or double share of root, respectively.
141
142 I've had very good results using that setup. Just for curiosity's sake, I
143 tried running ridiculous numbers of make jobs, to see how the system
144 handled it. With this setup (PORTAGE_NICENESS=19, portage user at 1024
145 share, root at 2048, and normal user at either 1024 or 2048), I can
146 increase make jobs without limit and still keep a reasonably usable
147 system, as long as the memory stays under control. This is MUCH more so
148 than simply running PORTAGE_NICENESS=19 but without per-user scheduling
149 enabled. In practice, therefore, the limit on make jobs is no longer CPU
150 scheduling, but the amount of memory each job uses. I set my number of
151 make jobs so that I don't go into swap much, if at all, even with
152 PORTAGE_TMPDIR pointed at tmpfs. Because swapping is I/O, and I/O, due to
153 the way the hardware works, increases latency, sometimes unacceptably.
154
155 Actually, my biggest thread test has been compiling the kernel, since it's
156 so easily parallelizable, to a load average of several hundred if you let
157 it, without using gigs and gigs of memory (yes it takes some, but nowhere
158 near what a typical compile would take at that number of parallel jobs) to
159 do it. I do my kernel compiles as yet another user (what I call my
160 "admin" user), so like portage, it gets the default 1024 share. But I
161 don't set niceness so it's running at normal 0 priority and taking its
162 full share against other 0 priority users. Compiling the kernel, I can
163 easily run over a hundred parallel make jobs without seriously stressing
164 the system. But even there, with user scheduling enabled and at normal
165 priority so the kernel compile is taking all the share it can, the memory
166 requirements are the bottleneck, not the actual jobs or load average,
167 because the kernel per-user scheduling is doing its job, giving my other
168 non-hog users and root the share they need to continue running normally.
169
170 So assuming vmware is running in userspace and thus affected by priority
171 and user-groups, I'd definitely recommend setting up user-groups and
172 fiddling with its share, as well as that of the rest of the system, along
173 with the other steps above.
174
175 The one caveat with user-group scheduling, is that the
176 /sys/kernel/uids/<uid> directories are created and destroyed dynamically,
177 as apps run and terminate as those users. It's thus not (easily) possible
178 to set a static policy, whereby a particular UID /always/ gets a specific
179 non-default share. There was a writeup I read back when the feature was
180 first introduced, that was supposed to explain how to set up an automatic
181 handler such that every time a particular UID appeared, it'd write a
182 particular value to its cpu_share file, but as best I could tell, the
183 writeup was already out of date, as the scripts that were supposed to be
184 called automatically according to the writeup, were never called. The
185 kernel hotplugging (and/or udev, or whatever was handling it) had changed
186 out from under the documentation, even as the feature was going thru the
187 process of getting the peer approval necessary to be added to the mainline
188 kernel. So I've never had an automatic policy setup to do it. It could
189 certainly be done using a file-watch (inotify/dnotify, etc, or polling of
190 some sort), without relying on the hotplugging mechanism that was supposed
191 to work that I could never get to work, but I've not bothered. I've
192 simply created scripts to echo the desired numbers into the desired files
193 when I invoke them, and run them manually when I need to. That has worked
194 well enough for my needs.
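
For what it's worth, the polling variant could look something like the following. This is a sketch of the idea, not the script I actually run: the function names, the "uid share" policy format and the five second default interval are all invented for illustration.

```shell
#!/bin/sh
# apply_share_policy [DIR]
# Reads "uid share" pairs on stdin and writes each share into
# DIR/<uid>/cpu_share, but only for uids whose directory currently exists.
apply_share_policy() {
    dir=${1:-/sys/kernel/uids}
    while read -r uid share; do
        case $uid in ''|'#'*) continue ;; esac   # skip blanks and comments
        file=$dir/$uid/cpu_share
        [ -w "$file" ] || continue               # uid not active right now
        [ "$(cat "$file")" = "$share" ] || echo "$share" > "$file"
    done
}

# poll_share_policy POLICY_FILE [INTERVAL_SECONDS]
# Re-applies the policy forever; meant to be run as root in the background.
poll_share_policy() {
    while :; do
        apply_share_policy < "$1"
        sleep "${2:-5}"
    done
}
```

Polling is crude, but it sidesteps the hotplug mechanism entirely, and since a share only matters once a user's apps are already running, a few seconds of delay does no harm.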
195
196 (Now you see why I'm not going into cgroups? user-group scheduling is
197 actually quite simple. Imagine the length of the post if I was trying to
198 explain cgroups!)
199
200 > The one place where I've been a bit disappointed is when the VGA
201 > drivers need to switch resolutions to play a game like Tux Racer then
202 > instead of two desktops I'm seeing one desktop duplicated on both
203 > monitors. Is this normal or is there some general way to control this?
204 > I'd really like the game on one monitor and just have the other stay
205 > black.
206
207 The problem here is that most resolution switchers simply assume a single
208 monitor. Before the X RandR extension, there was really no standard way
209 to reliably handle multiple monitor setups (xinerama, merged-framebuffer,
210 and proprietary or semi-proprietary methods like that used by the nvidia
211 and fglrx drivers, were all in use at various times by various hardware/
212 drivers, and for all I know there were others as well), so assuming a
213 single monitor was pretty much the best they could do.
214
215 RandR has solved the standardization problem, but few games have upgraded
216 to it, in part because it's apparently "rocket science" to properly
217 program it. The xrandr CLI client works, but all too often, the X
218 environment tools are simply broken. KDE for example has had a tool
219 that's supposed to handle multiple monitors, changing resolutions, etc,
220 for some time, but on all three sets of hardware and drivers I've tried it
221 on, both the kde3 version and the kde4 version thru 4.3.4, it has screwed
222 things up royally if you're using more than one monitor. Only xorg's own
223 xrandr gets it right, and that's a CLI client, with a slew of options to
224 read about and try to master, before you can properly run it. I've
225 scripted a solution here using it, hard-coding some of the options I don't
226 change into the script (could be a config file) thus making my script
227 simple enough to run from the command line (or invoke from a menu entry)
228 without having to remember all the complicated syntax, but that's not
229 going to work for the CLI-phobic. And if the X environments can't get it
230 working correctly for many users even with the documentation and the
231 xrandr code to follow, what are the games folks supposed to do? So they
232 simply continue to assume only a single monitor... and screw things up for
233 those of us with more than one, at least if we prefer to run them in other
234 than clone mode.
235
236 Because that's what's happening. When the games, etc, trigger the old
237 single-monitor resolution change API, it causes xorg to switch to clone
238 mode, running all monitors at the same resolution, showing the same thing.
239
240 FWIW, the solution I've found, as I mentioned, is a script setup to invoke
241 my preferred resolutions, in my preferred non-clone modes, retaining my
242 preferred "stacked" monitor orientation, by invoking xrandr with the
243 appropriate parameters to do so.
244
245 Thus I use my script (which uses xrandr) to set the resolution I want, and
246 set the game not to change resolution -- to run in a window or whatever,
247 instead. I run kde, and with kwin's per-app and per-window config
248 options, I set it up to always put the windows for specific games at
249 specific locations, sometimes without window borders etc. Between that
250 and triggering the resolution settings I want with my xrandr script, I can
251 get the game running in a window, but that window set to exactly the size
252 and at exactly the location of the monitor I want it to run on, while the
253 other monitor continues at its configured size and showing the desktop or
254 apps it normally shows.
255
256 "Works for me!" =:^)
257
258 > More disturbing is when I exit the game I'm left with both desktops
259 > displaying the same things and neither is exactly my original first or
260 > second desktop but rather a combination of the two which is fairly
261 > strange. (Desktop #1 icons with Desktop #2 wallpaper)
262
263 Both desktops (monitors, I assume, that's quite different from virtual
264 desktops, which is how I'd normally use the term "desktop") displaying the
265 same thing is simply clone mode. You can try using your X environment
266 resolution tool (if xfce has such a thing, kde does and I think gnome
267 does) to switch back to what you want, but as I said, don't be surprised
268 if it doesn't work as expected, because they've really had problems
269 getting the things working right. xrandr gets it right, and you'd /think/
270 they could if /nothing/ else read its code and use similar tricks, it /is/
271 open source, after all, but kde certainly hasn't gotten it right, at least
272 not for many drivers and hardware, and from what I've read, gnome's
273 version isn't a lot better.
274
275 But, if you're up for a bit of reading, you can figure out how xrandr
276 works well enough to get it to do what you want. Here's an example,
277 actually the debug output of the script I run, showing the xrandr command
278 as it's setup and invoked by the script (all one command line):
279
280 xrandr --verbose --fb 1920x2400 --output DVI-0 --mode 1920x1200
281 --panning 1920x1200+0+0/1920x1200+0+0/20/20/20/20 --output DVI-1
282 --mode 1920x1200 --panning 1920x1200+0+1200/1920x1200+0+1200/20/20/20/20
283
284 That results in:
285
286 1) an overall framebuffer resolution of 1920x2400
287
288 2) output DVI-0 being set to resolution 1920x1200, with its top-left
289 corner at position 0,0.
290
291 3) output DVI-1 being set to a similar resolution (I have two of the same
292 model of monitor, 1920x1200 native resolution), but with its top-left
293 corner at position 0,1200, thus, directly under DVI-0.
294
295 The panning mode stuff (except for the positioning bit) wouldn't be
296 necessary here as there's no panning to do, but those are the script
297 defaults. For use of panning mode, see this one:
298
299 xrandr --verbose --fb 1920x2400 --output DVI-0 --mode 1280x800
300 --panning 1920x1200+0+0/1920x1200+0+0/20/20/20/20 --output DVI-1
301 --mode 1280x800 --panning 1920x1200+0+1200/1920x1200+0+1200/20/20/20/20
302
303 This keeps the same overall framebuffer size and output orientation
304 (stacked), but the outputs are both run at 1280x800, with the panning
305 domain set for each one such that as the mouse gets to 20 px from any edge
306 of the 1280x800 viewport, it moves the viewport within the corresponding
307 1920x1200 panning domain.
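
A script that hard-codes the parts that don't change can be quite short. The following is a sketch along those lines, not my actual script: the function name is made up, and the output names and sizes are simply the ones from the commands above, so adjust them for your own hardware.

```shell
#!/bin/sh
# stacked_xrandr_cmd TOP_MODE BOTTOM_MODE [BORDER]
# Prints (does not run) an xrandr command for two stacked 1920x1200
# panning areas, with DVI-0 on top and DVI-1 directly below it.
# BORDER is the distance in px from the viewport edge at which panning
# starts (default 20, applied to all four sides).
stacked_xrandr_cmd() {
    top=$1 bottom=$2 border=${3:-20}
    b=/$border/$border/$border/$border
    echo "xrandr --verbose --fb 1920x2400" \
        "--output DVI-0 --mode $top" \
        "--panning 1920x1200+0+0/1920x1200+0+0$b" \
        "--output DVI-1 --mode $bottom" \
        "--panning 1920x1200+0+1200/1920x1200+0+1200$b"
}
```

Printing instead of running makes it easy to eyeball the command first; once it looks right, sh -c "$(stacked_xrandr_cmd 1280x800 1280x800)" runs it.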
308
309 Here's one with different resolutions, and with panning when the mouse
310 reaches the edge itself (instead of 20 px in) on the lower resolution and
311 position one (DVI-1), so I can run a game there, without it trying to pan
312 out as near the edge. I then put the viewport over the game and let the
313 game grab the mouse, so I can then play the game without having to worry
314 about panning. If I need to, I can have it "ungrab" the mouse, and have
315 panning again on the lower one, or move to the "fixed" upper one and do
316 stuff there.
317
318 xrandr --verbose --fb 1920x2400 --output DVI-0 --mode 1920x1200 --panning
319 1920x1200+0+0/1920x1200+0+0/20/20/20/20 --output DVI-1 --mode 960x600
320 --panning 1920x1200+0+1200/1920x1200+0+1200/
321
322 When I'm finished with the game, or if I want to run normal resolution and
323 do something else for a bit, I just run that first command again, and it
324 returns me to normal mode as set by that first command.
325
326 Unfortunately, kde4 still has a few bugs with multiple monitors,
327 especially when switching resolutions. As mentioned, the kde4 resolution
328 switcher itself is entirely screwed up as all it can handle is clone mode
329 (there's no way to set separate non-identical top-left corners for each
330 monitor), but there's bugs with the plasma desktop as well. If I do
331 happen to select clone mode, or disable one of the monitors using xrandr,
332 upon return to normal mode, plasma-desktop is screwed up. I can fix it
333 without restarting X/kde, but it's a hassle to do so, and somewhat trial
334 and error, zooming in and out the various plasma "activities", until I get
335 it setup correctly once again. Hopefully, 4.4 has improved that as well.
336 I read it has. We'll see...
337
338 > I'm wondering if other environments handle this better. XFCE is
339 > pretty lightweight, which I like. I'd gone away from Gnome because of
340 > the time spent maintaining it on Gentoo but on this machine it probably
341 > wouldn't be all the bad. Not sure I want KDE but I'm curious as to
342 > whether anything solves this problem?
343
344 Well... kde 3 worked reasonably well in this regard (except its resolution
345 switcher wasn't much good either, I used X's ctrl-alt-numplus/numminus
346 zooming while it worked, then developed the xrandr scripts I still use
347 today when x switched to randr based switching and the numplus/numminus
348 zooming didn't work any more, but the desktop at least stayed put), but as
349 you can tell, I'm rather frustrated with kde4.
350
351 But definitely try xrandr. It's a pain to learn as it's all CLI options
352 not point and click, but it's remarkably good at doing what it does, once
353 you know how to run it, and possibly hack up a script or several to take
354 the complexity out of it.
355
356 > Logging out of XCFE and then running startx gets everything back
357 > the way I want, and I don't think I'll play Linux games much, but I'm
358 > curious as to how well other environments handle this.
359
360 As explained, the base problem is that games assume single monitor, which
361 X construes as a command to go into clone mode. The solution is to use an
362 external app (such as the xrandr invoking scripts I use) to set the
363 resolutions you want, and don't invoke the games' options to change
364 resolution or whatever, just have them run in a window. Then match the
365 window size to your desired resolution (enforcing it using your window
366 manager, if that's more convenient or necessary), and invoke the script
367 (or other external to the game resolution switcher app) changing the
368 resolution right before you run the game.
369
370 Alternatively, since we're talking about a script already, you could set
371 it up so the script runs xrandr to change the resolution as desired, then
372 runs the game, then when the game is done, changes the resolution back.
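
A bare-bones sketch of such a wrapper follows. It is simplified compared to the full --fb/--panning commands above (it only switches the mode of one output), and the function name and the XRANDR override are mine, the latter so it can be dry-run with XRANDR=echo.

```shell
#!/bin/sh
# XRANDR is overridable (e.g. XRANDR=echo for a dry run).
XRANDR=${XRANDR:-xrandr}

# play_at GAME_MODE NORMAL_MODE OUTPUT COMMAND [ARGS...]
# Switches OUTPUT to GAME_MODE, runs the command, then restores
# NORMAL_MODE even if the command exits with an error.
play_at() {
    game_mode=$1 normal_mode=$2 output=$3
    shift 3
    $XRANDR --output "$output" --mode "$game_mode" || return 1
    status=0
    "$@" || status=$?
    $XRANDR --output "$output" --mode "$normal_mode"
    return $status
}
```

Something like "play_at 960x600 1920x1200 DVI-1 some-game --windowed" (game name and option are placeholders) then does the whole switch, play, switch back cycle in one go.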
373
374 --
375 Duncan - List replies preferred. No HTML msgs.
376 "Every nonfree program has a lord, a master --
377 and if you use the program, he is your master." Richard Stallman
