On Wed, 5 Jan 2022 at 21:21, Sam James <sam@g.o> wrote:
>
>> On 5 Jan 2022, at 19:18, Kai Krakow <kai@××××××××.de> wrote:
>
>>> On Wed, 5 Jan 2022 at 19:22, Ulrich Mueller <ulm@g.o> wrote:
>
> [...]
>
>>> That applies to all parallel builds though, not only to ebuilds
>>> inheriting check-reqs.eclass. By tweaking MAKEOPTS, we're basically
>>> telling the user that the --jobs setting in their make.conf is wrong,
>>> in the first place.
>
>
>> Well, I'm using a safe combination of jobs and load-average, maybe the
>> documentation should be tweaked instead.
>
>
> I think "safe" is doing some heavy lifting here...
|
Well, it works "safe" enough for me at least, but you're right.
|
>> I'm using
>> [...]
>
>
>> The "--jobs" parameter is mostly a safeguard against "make" or
>> "emerge" overshooting the system resources, which would happen if
>> running unconstrained without "--load-average". The latter parameter,
>> OTOH, tunes the number of parallel build processes automatically to
>> the available resources. If the system starves of memory and thus
>> starts to swap, load will increase and make will reduce the number of
>> jobs. It works pretty well.
>
>> I've chosen the emerge loadavg limit slightly higher so a heavy ebuild
>> won't keep emerge from running configure phases of parallel ebuilds.
>
>
> ... because it's quite hard for this logic to work correctly enough
> of the time without jobserver integration (https://bugs.gentoo.org/692576).
|
Oh, there's a bug report about this... I had already wondered: wouldn't
it be better if portage had a global jobserver? OTOH, there are so many
build systems out there that parallelize builds, and many of them won't
use a make jobserver but roll their own solution. So it looks a bit
futile on that side. That's why I've chosen the loadavg-based approach.
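Concretely, the kind of settings I mean look like this (the numbers are
illustrative values for an 8-core machine, not a recommendation):

```shell
# /etc/portage/make.conf (illustrative values for an 8-core machine)
# Cap make at 8 jobs, but back off once the 1-minute load hits 8.
MAKEOPTS="--jobs=8 --load-average=8"
# Let emerge run ebuilds in parallel, with a slightly higher load
# limit so configure phases still get scheduled next to a heavy build.
EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=10"
```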
|
> But indeed, I'd say you're not the target audience for this (but I appreciate
> the input).
|
Maybe not; I'm usually building in tmpfs (except for huge source
archives with huge build artifacts), which means I usually have plenty
of RAM, at least enough that it doesn't become the limiting factor.
|
But then again, what is the target audience? This proposal looks like
it tries to predict the future, and that's probably never going to
work right. Looking at the GitHub issue linked initially in the
thread, it looks like I /might/ be the target audience for packages
like qtwebkit because I'm building in tmpfs. The loadavg limiter does
quite well here unless a second huge ebuild gets unpacked and built
in the tmpfs, at which point the system struggles to keep up and
starves from IO thrashing, only for portage to be OOM-killed a few
moments later. That's of course not due to the build jobs themselves;
it's purely a memory limitation. But for that reason I have
configuration to build such packages outside of tmpfs: while they
usually work fine when building just one such package alone, it fails
the very moment two such packages are built in parallel.
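For context, that per-package configuration uses portage's package.env
mechanism; the file name, path, and package list here are just examples:

```shell
# /etc/portage/env/no-tmpfs.conf (example file name)
# Build in a directory on disk instead of the tmpfs-backed default.
PORTAGE_TMPDIR="/var/tmp/portage-disk"

# /etc/portage/package.env (example entries)
www-client/chromium no-tmpfs.conf
dev-qt/qtwebengine  no-tmpfs.conf
```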
|
Maybe portage needs a job server that dynamically bumps the job
counter up or down based on current memory usage? Or "make" itself
could be patched to take that into account? But that's probably the
whole idea behind the loadavg limiter. So I'd propose at least
mentioning that in the documentation and examples; it seems to be
little known.
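To sketch what such a memory-based limiter might look like (purely
hypothetical code, not anything portage actually does; the 2 GiB per
job figure is the one floated in this thread):

```python
# Hypothetical sketch of a memory-aware job limit. Not portage code.

def jobs_for_memory(avail_bytes, per_job_bytes=2 * 1024**3, hard_cap=8):
    """Return how many build jobs fit into the available memory,
    assuming roughly per_job_bytes of RAM per compiler process."""
    fit = avail_bytes // per_job_bytes
    return max(1, min(hard_cap, fit))

def available_memory():
    """Read MemAvailable from /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) * 1024  # kB -> bytes
    raise RuntimeError("MemAvailable not found")
```

A scheduler built on this would re-evaluate
jobs_for_memory(available_memory()) before spawning each new job,
instead of fixing the count once up front.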
|
Then again, if we run on a memory-constrained system, it may be better
to parallelize ebuilds instead of build jobs, to make better use of
overlapping light and heavy ebuild phases in the same time period.
|
Also, I'm not sure if 2 GB per job is the full picture - no matter
whether that number is correct or not... Because usually the link
phase of packages like Chrome is the real RAM burner, even with sane
"jobs" parameters. I've seen people fail to install these packages
because they didn't turn on swap, and then during the link phase the
compiler took so much memory that it either froze the system for half
an hour or got OOM-killed. And at that stage, there's usually just
this single compiler process running (and maybe some small ones which
use almost no memory relative to it). And that doesn't get better with
modern compilers doing all sorts of global optimization stuff like
LTO.
|
So maybe something like this could work (excluding the link phase): |
|
If only one ebuild can potentially run at a time (i.e. your merge
list has just one package), the effect of MAKEOPTS is quite
predictable. But if we potentially run more, we could carefully reduce
the number of jobs in MAKEOPTS before applying additional RAM
heuristics. And those heuristics should probably take the combination
of both emerge jobs and make jobs into account, because the two
potentially multiply (unless bug 692576 is implemented).
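As a back-of-the-envelope version of that heuristic (again
hypothetical, using the thread's 2 GiB per job figure):

```python
# Hypothetical: split a RAM budget across emerge jobs, since in the
# worst case emerge_jobs * make_jobs compiler processes run at once.

def make_jobs_per_ebuild(total_mem_bytes, emerge_jobs,
                         per_job_bytes=2 * 1024**3):
    """How many make jobs each of emerge_jobs parallel ebuilds may
    use so the combined worst case stays within total_mem_bytes."""
    per_ebuild_budget = total_mem_bytes // emerge_jobs
    return max(1, per_ebuild_budget // per_job_bytes)
```

With 32 GiB of RAM and --jobs=4 on the emerge side, this would cap
MAKEOPTS at --jobs=4 per ebuild.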
|
Compiler and link flags may also need to be taken into account.
|
And maybe portage should take care of optionally serializing huge
packages and never build/unpack them at the same time. This would be
a huge win for me, as I would not have to manually configure things...
Something like PORTAGE_SERIALIZE_CONSTRAINED="1" to build at most one
package at a time among those that carry some RAM/storage warning
vars in the ebuild. But that's probably a different topic, as it
doesn't exactly target the problem discussed here - and I'm also
aware of this problem, unlike the target audience.
|
|
Regards, |
Kai |