On Fri, Dec 03, 2004 at 10:02:19AM -0900, Leif Sawyer wrote:

> make -j all
>
> uptime: 09:41:41 up 10m, 2 users, load avg: 200.90, 65.79, 23.42
...
> Dec 3 09:48:05 VM: killing process apache2

> OOMkiller 'feels' more aggressive in -rc2, but still doesn't have the
> instability that gds-267-r16 has

Your workload is pathological. 100+ simultaneous executions of gcc is
sure to exhaust resources on all but the largest boxes. That the
silly "oom killer" kills the wrong process and that its behaviour
varies from kernel to kernel should not be surprising as it has been
discussed to death on lkml. Don't use it. If you need to keep system
processes running, disable the oom killer and apply resource limits to
users.
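
As a rough illustration of the "apply resource limits" advice, here is a
minimal sketch assuming Linux and Python's standard resource module.
Per-user limits are normally configured system-wide; this only shows the
per-process mechanism underneath, setrlimit(2). The 512 MiB cap and the
helper names are hypothetical, picked purely for the example.

```python
import resource
import subprocess
import sys

# Hypothetical cap for illustration: 512 MiB of address space per process.
ADDRESS_SPACE_LIMIT = 512 * 1024 * 1024


def limit_address_space():
    """Runs in the child just before exec: cap its virtual address space."""
    resource.setrlimit(resource.RLIMIT_AS,
                       (ADDRESS_SPACE_LIMIT, ADDRESS_SPACE_LIMIT))


def run_limited(argv):
    """Run argv with the address-space cap applied; return its exit status."""
    return subprocess.run(argv, preexec_fn=limit_address_space).returncode


if __name__ == "__main__":
    # A child that tries to grab 2 GiB should fail by itself under the cap.
    status = run_limited([sys.executable, "-c", "bytearray(2 * 1024**3)"])
    print("child exit status:", status)
```

With a cap like this in place, the process that over-allocates is the
one whose request fails, and nothing else has to be chosen for killing.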

I question the use of make -j as a stress test. What is the desired
behaviour of running this command, other than not to crash the OS? If
you haven't specified any resource allocation policy, the OS's only
real obligation is not to crash. So what constitutes a "successful"
test run?

Should the spawned compiler instances fail to allocate memory and
bomb? Should the OS kill them? Should it kill the pg leader (make)
instead? Or your shell? Or should some other process's allocations
fail? Should the OS kill some other process? If so, which one? If
it's going to kill something, what signal(s) should it force? If it
sends SIGTERM, what should happen to other processes that attempt to
sbrk(2) or mmap(2) while it's waiting for the SIGTERM'd process to
die? Do you put them to sleep? Do you fail their requests? Do you
kill them too (yay, deadlock!)? What if the process needs to allocate
memory when dying (more deadlock)? If it sends SIGKILL, what about
shm segments it may have allocated; wouldn't leaking those just worsen
the problem? Maybe it should just kill all of userland except init
and start over. And if it's going to do that, why not just crash?
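
To make the "fail their requests" option concrete, here is a sketch,
assuming Linux, of what a refused mmap(2) looks like from the caller's
side once an address-space limit is in force. The helper name and the
sizes are hypothetical, and the sketch assumes the process's hard limit
is at least as large as the soft limit it sets.

```python
import errno
import mmap
import resource


def try_map(nbytes, soft_limit):
    """Attempt an anonymous mapping of nbytes under a temporary RLIMIT_AS.

    Returns None if the mapping succeeded, otherwise the errno of the
    failed request. The previous soft limit is restored afterwards.
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS, (soft_limit, hard))
    try:
        region = mmap.mmap(-1, nbytes)  # -1 requests an anonymous mapping
    except OSError as exc:
        return exc.errno
    else:
        region.close()
        return None
    finally:
        resource.setrlimit(resource.RLIMIT_AS, (old_soft, hard))


if __name__ == "__main__":
    # Ask for 8 GiB while limited to 1 GiB of address space.
    result = try_map(8 * 1024**3, 1024**3)
    print("ENOMEM" if result == errno.ENOMEM else result)
```

The request is refused in the process that made it, which can then
decide for itself whether to retry, degrade, or exit.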

I don't see that Linux has answered any of these questions, and most
of them don't need to be asked. Use resource management. If it
doesn't work, fix it. The tests you are running are a waste of time.

--
Keith M Wesolowski
"Site launched. Many things not yet working." --Hector Urtubia

--
gentoo-sparc@g.o mailing list