On 18/09/2014 16:48, James wrote:
> Hello,
>
> Out Of Memory seems to invoke mysterious processes that kill
> such offending processes. OOM seems to be a common problem
> that pops up over and over again within the clustering communities.
>
>
> I would greatly appreciate (gentoo) illuminations on the OOM issues;
> both historically and for folks using/testing systemd. Not a flame_a_thon,
> just some technical information, as I need to understand these
> issues more deeply, how to find, measure and configure around OOM issues,
> in my quest for gentoo clustering.

The need for the OOM killer stems from the fact that memory can be
overcommitted. These articles may prove informative:

http://lwn.net/Articles/317814/
http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html

In my case, the most likely trigger - as rare as it is - would be a
runaway process that consumes more than its fair share of RAM.
Therefore, I make a point of adjusting the score of production-critical
applications to ensure that they are less likely to be culled.
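
As a rough illustration of such a score adjustment, here is a minimal
Python sketch. It assumes the kernel exposes /proc/<pid>/oom_score_adj
(accepted range -1000 to 1000, where -1000 exempts the process
entirely). The function names and the proc_root parameter are
hypothetical, chosen only for this example.

```python
from pathlib import Path

# Range the kernel accepts for oom_score_adj.
OOM_ADJ_MIN, OOM_ADJ_MAX = -1000, 1000


def oom_adj_path(pid, proc_root="/proc"):
    """Return the path of the per-process OOM adjustment file."""
    return Path(proc_root) / str(pid) / "oom_score_adj"


def clamp_adjustment(value):
    """Clamp a requested adjustment to the range the kernel accepts."""
    return max(OOM_ADJ_MIN, min(OOM_ADJ_MAX, int(value)))


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write the (clamped) adjustment for pid and return the value written.

    Lowering the value generally requires root (CAP_SYS_RESOURCE).
    """
    value = clamp_adjustment(value)
    oom_adj_path(pid, proc_root).write_text(f"{value}\n")
    return value


def get_oom_score_adj(pid, proc_root="/proc"):
    """Read the current adjustment for pid."""
    return int(oom_adj_path(pid, proc_root).read_text())
```

Calling set_oom_score_adj(pid, -500) as root for a critical daemon
would make it a less attractive victim without exempting it outright.
The value -500 is only an example, not a recommendation.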

If your cases are not pathological, you could increase the amount of
memory, be it by additional RAM or additional swap [1]. Alternatively,
if you are able to precisely control the way in which memory is
allocated and can guarantee that it will not be exhausted, you may elect
to disable overcommit, though I would not recommend it.
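
Before changing anything, it may help to see how the box is currently
configured. The following is a minimal sketch, assuming the usual
/proc/sys/vm/overcommit_memory and /proc/meminfo interfaces (mode 2 is
the setting that disables overcommit). The helper names and the
proc_root parameter are made up for this example.

```python
from pathlib import Path

# Meaning of the vm.overcommit_memory sysctl values.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting (overcommit disabled)",
}


def parse_meminfo(text):
    """Turn /proc/meminfo text into {field: value}; most values are in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(meminfo):
    """CommitLimit minus Committed_AS; the limit is only enforced in mode 2."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


def report(proc_root="/proc"):
    """Summarise the overcommit mode and the remaining commit headroom."""
    root = Path(proc_root)
    mode = int((root / "sys/vm/overcommit_memory").read_text())
    info = parse_meminfo((root / "meminfo").read_text())
    description = OVERCOMMIT_MODES.get(mode, "unknown")
    return (f"overcommit_memory={mode}: {description}, "
            f"commit headroom {commit_headroom_kb(info)} kB")
```

A negative headroom under mode 0 or 1 simply means that more memory has
been promised than the limit would allow under strict accounting, which
is precisely the situation in which the OOM killer may be needed.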

With NUMA, things may be more complicated because there is the potential
for a particular memory node to be exhausted, unless memory interleaving
is employed. Indeed, I make a point of using interleaving for MySQL,
having gotten the idea from the Twitter fork.
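
To spot a node that is running dry while the others still have memory
to spare, something along these lines may serve. It is only a sketch,
assuming the per-node files under /sys/devices/system/node/ with lines
of the form "Node 0 MemFree: 123456 kB". The function names and the
sys_root parameter are hypothetical.

```python
import re
from pathlib import Path

# Matches lines such as "Node 0 MemFree:        123456 kB".
NODE_LINE = re.compile(r"^Node\s+(\d+)\s+([^:\s]+):\s+(\d+)")


def parse_node_meminfo(text):
    """Parse one node's meminfo text into (node_id, {field: value})."""
    node_id, fields = None, {}
    for line in text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            node_id = int(match.group(1))
            fields[match.group(2)] = int(match.group(3))
    return node_id, fields


def free_fraction(fields):
    """Fraction of the node's memory that is currently free."""
    return fields["MemFree"] / fields["MemTotal"]


def scan_nodes(sys_root="/sys/devices/system/node"):
    """Return {node_id: free fraction} for every node found under sys_root."""
    result = {}
    for path in Path(sys_root).glob("node[0-9]*/meminfo"):
        node_id, fields = parse_node_meminfo(path.read_text())
        if node_id is not None:
            result[node_id] = free_fraction(fields)
    return result
```

A large gap between the free fractions of different nodes suggests that
allocations are landing on one node. Interleaving itself is typically
requested when the process is launched, for instance with
numactl --interleave=all.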

Finally, make sure you are using at least Linux 3.12, because some
improvements to OOM handling were made in that release [2].

--Kerin

[1] At a pinch, additional swap may be allocated as a file
[2] https://lwn.net/Articles/562211/#oom